DATA SIZING

Data size

Data size is the amount of data to be loaded.


When designing the data migration event, the question of how big the data load is becomes significant. Data size is the biggest single determinant of the run times (see below).

Data size is normally expressed in bytes, kilobytes, megabytes, gigabytes and terabytes as we move up the scale. However, we can also measure data in terms of the number of records to be read and written. The number to be read is often far greater than the number actually written, but reading takes time too. Complex data navigation can involve reading a score of records before a single record is written to the new system.

From our Legacy Data Store definition forms we will have captured the gross numbers of records in each data store. From our Data Mappings we will know the navigation involved and the consequent number of intermediary records we will be hitting. And from our System Retirement Policy and the new system definition we will know how many records we are expecting to load. From this information we will be able to calculate the size of the data load.
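As a minimal illustration of that arithmetic, the sketch below combines the three inputs just described: gross record counts from the Legacy Data Store definitions, read fan-out from the Data Mappings, and the proportion of records we expect to carry across from the System Retirement Policy. All store names, counts, fan-out factors and percentages are invented for illustration; they are not figures from this book.

# Illustrative sketch only: back-of-the-envelope data sizing.
# All names and numbers below are assumptions, invented for the example.

legacy_stores = {
    # store name: (gross records, average record size in bytes)
    "CUSTOMER_MASTER": (250_000, 512),
    "ORDER_HISTORY": (4_000_000, 256),
}

# From the Data Mappings: roughly how many intermediary records must be
# read for every record finally written to the new system.
read_fanout = {
    "CUSTOMER_MASTER": 3,   # e.g. address, contact and status look-ups
    "ORDER_HISTORY": 5,     # e.g. line item, pricing and status records
}

# From the System Retirement Policy: the fraction of each store we expect to load.
load_fraction = {
    "CUSTOMER_MASTER": 0.9,   # drop long-dead accounts
    "ORDER_HISTORY": 0.4,     # only recent history goes across
}

total_reads = total_writes = total_bytes = 0
for store, (records, row_bytes) in legacy_stores.items():
    writes = int(records * load_fraction[store])
    reads = writes * read_fanout[store]
    total_writes += writes
    total_reads += reads
    total_bytes += writes * row_bytes

print(f"Records to write: {total_writes:,}")
print(f"Records to read (including intermediaries): {total_reads:,}")
print(f"Approximate load size: {total_bytes / 1024**3:.2f} GB")

In practice, as the hint below notes, such a calculation is rarely performed formally; the point of the sketch is simply to show which figures feed the estimate.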

Hint

I say 'we'; of course I mean the technical experts. It is rare these days that a formal access path calculation is performed. We work on good guesses to assess the time it will take, but all the factors above are fed into that guess.


It is not only the run times that need this input. Any Transient Data Stores created at go live, the new system and the extract, transform and load process will all use space on our hardware. Temporary tables will be created, temporary indexes will need space, and so on. We need to know both the size in bytes and the number of records to pass on to the programmers and database designers.
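The kind of rough space estimate the programmers and database designers need might be sketched as follows. Again, every figure and overhead factor is an assumption made up for the example, not a recommendation.

# Illustrative only: rough disk-space estimate for the migration run.
# Figures and overhead factors are assumptions, not from the text.

records_to_load = 1_850_000        # from the sizing exercise above
avg_row_bytes = 300                # average row width in the new system

base_bytes = records_to_load * avg_row_bytes
transient_store_factor = 1.0       # Transient Data Stores hold roughly one extra copy
temp_table_factor = 0.5            # working tables created during transform
index_factor = 0.3                 # temporary indexes built to speed the load

total_bytes = base_bytes * (1 + transient_store_factor
                            + temp_table_factor + index_factor)

print(f"New system load: {base_bytes / 1024**2:,.0f} MB")
print(f"Total space needed during go live: {total_bytes / 1024**2:,.0f} MB")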
