Background

Data transformation is a set of techniques used to convert data from one format or structure to another. The following are some examples of transformation activities:

  • Data deduplication involves the identification of duplicates and their removal.
  • Key restructuring involves transforming any keys with built-in meanings into generic keys.
  • Data cleansing involves deleting out-of-date, inaccurate, and incomplete information from the source data, without changing its meaning, to enhance the accuracy of the source data.
  • Data validation is the process of formulating rules or algorithms that help validate different types of data against known issues.
  • Format revisioning involves converting from one format to another.
  • Data derivation consists of creating a set of rules to generate more information from the data source.
  • Data aggregation involves searching, extracting, summarizing, and preserving important information in different types of reporting systems.
  • Data integration involves converting different data types and merging them into a common structure or schema.
  • Data filtering involves identifying information relevant to any particular user.
  • Data joining involves establishing a relationship between two or more tables.
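Several of the activities above can be illustrated with a minimal sketch in plain Python. The records, field names (`order_id`, `customer_id`, `amount`, `date`), and the filtering threshold are hypothetical and invented purely for illustration; a real project would more likely use a dedicated data library, and the exact rules would depend on the dataset.

```python
from collections import defaultdict

# Hypothetical sample data, invented for illustration only.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": "20.50", "date": "2021/01/05"},
    {"order_id": 2, "customer_id": "C2", "amount": "5.00", "date": "2021/01/06"},
    {"order_id": 2, "customer_id": "C2", "amount": "5.00", "date": "2021/01/06"},  # duplicate
    {"order_id": 3, "customer_id": "C1", "amount": "7.25", "date": "2021/02/01"},
]
customers = {"C1": "Alice", "C2": "Bob"}  # hypothetical lookup table

# Data deduplication: keep the first record seen for each order_id.
seen = set()
deduped = []
for row in orders:
    if row["order_id"] not in seen:
        seen.add(row["order_id"])
        deduped.append(row)

# Format revisioning: convert the amount from text to a number and the
# date from YYYY/MM/DD to YYYY-MM-DD.
# Data derivation: derive a new "month" field from the converted date.
converted = []
for row in deduped:
    iso_date = row["date"].replace("/", "-")
    converted.append(
        {**row, "amount": float(row["amount"]), "date": iso_date, "month": iso_date[:7]}
    )

# Data filtering: keep only the orders relevant to a given need
# (here, an assumed threshold of at least 6.00).
filtered = [row for row in converted if row["amount"] >= 6.0]

# Data joining: relate each order to its customer through customer_id.
joined = [{**row, "customer_name": customers[row["customer_id"]]} for row in converted]

# Data aggregation: summarize the total amount per customer.
totals = defaultdict(float)
for row in joined:
    totals[row["customer_name"]] += row["amount"]

print(dict(totals))
```

Each step takes the output of an earlier one as its input, which mirrors how these activities are usually chained in practice, although the order shown here is only one reasonable choice.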

The main reason for transforming data is to obtain a better representation, so that the transformed data is compatible with other data. In addition, a system can achieve interoperability by following a common data structure and format.

Having said that, let's start looking at data transformation techniques with data integration in the next section. 
