Part II: Extract, Transform, and Load

This part, Part II, discusses the process that populates a dimensional data warehouse. This process is known as ETL, short for Extract, Transform, and Load. Extract is getting the data you need for the data warehouse from the source. Transform is the process of preparing the data. And Load is the process of storing the data in the data warehouse.

The E, T, or L is not always a distinct step. For instance, if the source data is in a MySQL database, the ETL can be a single “INSERT SELECT” SQL statement. In other cases, the Transform portion can be quite involved, requiring not only adding surrogate keys and preparing the history maintenance, but also integrating multiple sources, handling data source errors, and aggregating.

This part covers the following topics.

  • Extracting source data

  • Populating the date dimension

  • Initial population

  • Regular population

  • Regular job and scheduling


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.17.157.6