Data migration strategies-2

Application development
Data analysis
Project management
Hi All I am embarking on a Data Migration project which involves migrating data from one I-Series database application to a different I-series database application. I guess as with all migrations the data does not excatly match in both appplications. I have identified that i need to populate Excel files with all the relevant data from our legacy system, once this has been done then the users of our application will be adding data to these excel files which is required by our new application. Once these Excel files have been fully populated i then need to upload to the new application, I guess my question is have any of you gone through this kind of process, and if so how did you manage it. One problem i can see is when i bring the data down to excel from legacy system, Excel does not keep the formats of each field, this would mean i need to specify the format for eac field again in Excel before uploading to new application. Thanks in advance for any help

Answer Wiki

Thanks. We'll let you know when a new response is added.

We have brought on live data from 9 seperate companies onto our iseries. We loaded the data into excel and then used MS access to link the data to the iseries files using ODBC drivers. This meant that the field names are cross referenced within access. The only difficulty we did have was with empty fields, so we needed to put a . (dot) or a 0 into each field, then remove it using SQL/400 when on the iseries

Discuss This Question: 5  Replies

There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when members answer or reply to this question.
  • TonyRa
    When you export or import data to and or from the iSeries, a file is created that ends with the extension .FDF. this is a definition of the Database that exists on the iSeries. Another file is also created if you request it that is either a .dtf from iSeries) or a .dtt (To iSeries) file. These files make sure that the data in the file that your are importing/exporting to/from the iSeries makes sure that the data is in the fight format for the database that you are using. I sould suggest looking into iSeries Client Access Expres as it explains this process in more detail there.
    10 pointsBadges:
  • Rexzel
    That is also my problem in putting data to excel. Most especially if the data type is date. It becomes a number if you will put to excel. What I did is converting all date data type to string or text before putting data to excel and it works. Im not sure if this strategy is also applicable to your problem.
    0 pointsBadges:
  • Cscottmitchell
    I'm curious why the database format you're moving your data to needs to be in excel. Excel is not a database, it's a spreadsheet. I personally have never experienced a excel application which , when analyzed, could not have actually used a real database as the backend. Sorry, I know this doesn't solve your problem but I've been designing and implementing applications for over 15 years and I think your application designers should reconsider. My $0.02.
    0 pointsBadges:
  • Ddune3566
    My 2 cents worth. 1. Excell has a row limit. If you exceed it users will not be able to add data to the worksheet. 2. I have done several splits/ merges with other companies as many odf you probably have. I would move the new files into a test area. I would then have the users create cross references in excell that have the knowlege of what extra data that is needed in the current systems files. I would upload the cross references. You then write a program that reads the new files and adds the new data, from cross references, into a copy of your current systems file in a test area. Test the data and the system in test. IF all goes well copy add the new file to your current system. I know i have simplified this but this is how I have done this in the past. Hope this helps
    0 pointsBadges:
  • Datameister
    I highly recommend that you make at least one change to your strategy: instead of following the route: -old datastructure - production platform (iSeries) -data entry tool and database (Excel or hopefully something better) -data entry of new and changed data -new datastructure (back to iSeries) that you take this route: - old datastructure - production platform (iSeries) -programmatic ETL of data - new datastructure - production platform (iSeries) in a temporary workspace - transfer only necessary data to separate database for data entry - data entry tool and database (Excel or hopefully something better) - transfer the new data from the data entry tool database to the respective columns in the new struction - new datastructure - production platform (iSeries) in a temporary workspace - now contains data from old datastructure and new/ transformed data through data-entry. - procedure to move the new datastructure, when verified, from temporary to production space. This looks like more steps and more work, but if you add the I.T. work and the Data Entry work, and the time you will save by preserving as much of the original formatting and data as possible, and using programs to do as much of the transformation as possible, it will, in the end, save time. It is also: -saveable and repeatable -safer -provides a guide and tools for future such migrations My experience in database migration is with Oracle, but, except for the specific tools involved, the basic steps should be similar. 1. MAPPTING Map the current data structure to the new data structure. a. Specify the correspondence between old and new tables b. Specify the columns to be transfered i. Columns that remain the same, data and format ` ii. Columns that whose data remains the same, but format differs iii. Columns whose data and format is altered iv. Columns that are new c. The first 2 (b.i. and b.ii.) can all be transfered to the new structure programmatically. Of the 2nd 2 (b.iii. and b.iv.) you need to categorize them into: i. Those which can be altered or assigned programmatically (i.e. there is enough data for a program to figure out what data should be added for each record.) ii. Those which can be partially assigned programmatically (i.e. there is enough data for a program to figure out what data should be added for either -MOST RECORDS or -MOST OF THE DATA in a record.) iii. The data which must be input by a user. 2. STRATEGY AND PREPARATION a. Strategy i. Decide how the migration is going to occur: A) How much time do you have: how much time can the production data be off-line. B) Work with your Data Administrator to find out how often and how easily you can get extracts of current data C) Save all your mapping, extracting, transormation and loading programs, scripts and settings so you can do any number of practice runs before actually loading the data into the new, live, production structure. D) CREATE NEW STRUCTURE IN TEMPORARY WORKSPACE Create the new data structure with production tools (in your case, an iSeries database) but in a temporary location. ii. Use your new structure in your temporary workspace on the steps below until all the kinks are worked out. This is essential as it gives you the luxury of "redoing" as often as needed, as well as the PROTECTION of production data against errors. 3. ETL -Extract, transform and load Use export/import tools along with programming tools (data modification or ETL tools) to take the data from current production and import it into the new structure including all the new/changed columns that can be changed/ inserted/loaded or partially filled as analyzed in MAPPING, 1.c.i and ii. a. Create/assign new data programmatically - Example of programmatic assigning (1.c.i): if there is a new field ZIP CODE, and the zip code is currently part of another field, your programming script or ETL tool can look for the zip code in the other field and insert it into the new field b. Alter/reformat new data programmatically - Example of programmatic altering (1.c.i): if there is a new format on a field, often exporting the old value (a number for example) and simply loading it into the new format (a different number datatype, for example) is sufficient. In those cases where the new format is not compatible with the old data, programmatic alteration of the data can be used. c. Partially alter/assign new data - Example of partial programmatic loading (1.c.ii): If there is NOTHING in the CURRENT data that you ETL tool or data alteration program can use to figure out the new data, IN SOME CASES, you can still insert some data programmatically when: i. Much of the data is repeated. For example, if you have a new field "city" that you can't derive from existing data, but say 90% of your addresses are in "San Antonio", you can insert "San Antonio" into all the records, and make note of the field as one that requires user intervention. ii. Part, but not all, of the data can be assigned programmatically, or the program can make a good guess. Examples of these are: A) Incomplete data is either known or exists in the current database. Say the area code and PBX is known, but not the extension. The column could be populated with (XXX) XXX-, and a note made for the data-entry phase to fill these out. B) Addresses: Programs can parse old address columns to fit new address columns but it is a complex enough issue that it is unlikely that the transformation will be 100% accurate without spending far too much time on the parsing code. Again, that column would be added to the list of those requiring data-entry intervention. 4. Data-entry Once the programmatic transfer of data from the old datastructure to the new datastructure is complete, set up the tools necessary for the data entry of new/transformed data that could not be completely done programmatically (3.c. above) a. Load the data from your temporary workspace created in step 3 into temporary tables in whatever the best tool for data-entry you have is. If it is Excel, so be it, but if you have forms or tools that use iSeries databases on their back-end natively, that would be ideal. MS Access is also better, as it has ODBC capability that can access your tables in their native format directly. b. Create forms or format the spreadsheets so that the data-entry personnel can only edit/add the data that is required of them. c. Also provide as strict an edit (typing or formatting or otherwise creating edit rules for the columns to be entered) d. Once data entry has been completed, load the data back up into the iSeries columns in the new structure in your temporary work space. 5. Production Once the data in your temporary workspace is verified, it can be moved into production. Not that the best thing about this process is that all the steps completed thus far can be repeated, reprogrammed, revised etc. as many times as necessary to get the data right, BEFORE any production data is affected. Data formatting is NEVER LOST, as the data always remains mapped to a column in your new data structure in your temporary iSeries work space, whether it is in there or whether it is awaiting up load from a data-entry source. I hope this at least gives you an idea or two that will be helpful. If there's anything I've mentioned that you have questions on, please ask and I will be glad to clarify. If you have any IBM/iSeries/DB2-specific questions, those are best directed to others, as my experience is with UNIX/Oracle. Best wishes in your project.
    0 pointsBadges:

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

To follow this tag...

There was an error processing your information. Please try again later.

Thanks! We'll email you when relevant content is added and updated.


Share this item with your network: