Chapter 6. Transforming the Row Set

So far, you have been working with simple datasets, that is, datasets where each row represented a different entity (for example, a student) and each column represented a different attribute of that entity (for example, the student's name). Sometimes your dataset doesn’t follow such a simple format, and working with it as is may be complicated or even impossible. At other times, your data simply does not have the structure you like or the structure you need.

Whatever your situation, you have to transform the dataset into an appropriate format, and the solution is not always about changing or adding fields, or about filtering or adding rows. Sometimes it has to do with twisting the whole dataset. In this chapter you will learn how to:

  • Convert rows to columns
  • Convert columns to rows
  • Operate on sets of rows

You will also be introduced to a core subject in data warehousing: time dimensions.

Converting rows to columns

In most datasets, each row belongs to a different element, such as a different match or a different student. However, there are datasets where a single row doesn't completely describe one element. Take, for example, the real-estate file from Chapter 5. Every house was described through several rows, so a single row gave incomplete information about the house. The ideal situation would be one in which all the attributes for the house were in a single row. With PDI you can convert the data into this alternative format. You will learn how to do it in this section.
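
To make the idea concrete before turning to PDI, here is a minimal sketch of the same reshaping in plain Python. It is not how PDI does it. The sample rows, the attribute names, and the rows_to_columns helper are all hypothetical and are not taken from the real-estate file in Chapter 5. The sketch only shows the principle: several rows that share a key are collapsed into a single record holding all the attributes.

```python
from collections import defaultdict

# Hypothetical sample data: one row per (house, attribute, value).
# These names and values are invented for illustration only.
rows = [
    ("house_1", "bedrooms", "3"),
    ("house_1", "bathrooms", "2"),
    ("house_1", "price", "250000"),
    ("house_2", "bedrooms", "4"),
    ("house_2", "price", "310000"),
]


def rows_to_columns(rows):
    """Collapse (key, attribute, value) rows into one dict per key."""
    grouped = defaultdict(dict)
    for key, attribute, value in rows:
        # Each attribute name becomes a "column" of the record for that key.
        grouped[key][attribute] = value
    return dict(grouped)


houses = rows_to_columns(rows)
for house_id, attributes in houses.items():
    print(house_id, attributes)
```

Note that house_2 has no bathrooms row in this made-up data, so its record simply lacks that attribute. Any real conversion has to decide how to handle missing values of this kind.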
