Chapter 4. Restructuring Data

We already covered the most basic methods for restructuring data in Chapter 3, Filtering and Summarizing Data, but there are several other, more complex tasks that we will master in the following pages.

As a quick illustration of how many different tools are needed to get data into a form that can be used for real data analysis: Hadley Wickham, one of the best-known R developers and users, spent one third of his PhD thesis on reshaping data. As he says, "it is unavoidable before doing any exploratory data analysis or visualization."

So now, besides the previous examples of restructuring data, such as counting the elements in each group, we will focus on some more advanced features:

  • Transposing matrices
  • Splitting, applying, and joining data
  • Computing margins of tables
  • Merging data frames
  • Casting and melting data

Transposing matrices

One of the most used, but often not mentioned, methods for restructuring data is transposing matrices. This simply means swapping the rows and columns, via the t function:

> (m <- matrix(1:9, 3))
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> t(m)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

Of course, this S3 generic also works with data.frame objects and, in fact, with any tabular object. For more advanced features, such as transposing a multi-dimensional table, take a look at the aperm function from the base package.
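
As a minimal sketch of the two points above, the following lines transpose a small data.frame and then permute the dimensions of a three-dimensional array with aperm. The object names (df, tdf, a, b, same) and the toy values are only illustrative assumptions and do not come from the original text. Note that t returns a matrix when it is given a data.frame, so the result is no longer a data.frame:

```r
# Transposing a data.frame: t() coerces it to a matrix first
df <- data.frame(x = 1:3, y = 4:6)
tdf <- t(df)
is.matrix(tdf)   # the result is a matrix, not a data.frame
dim(tdf)         # 2 rows (x and y) and 3 columns

# Permuting the dimensions of a 3-dimensional array with aperm()
a <- array(1:24, dim = c(2, 3, 4))
b <- aperm(a, c(3, 1, 2))   # new order: old 3rd, 1st, 2nd dimension
dim(b)                      # dimensions are now 4, 2, 3

# With the default perm argument, aperm() reverses the dimensions,
# so on a two-dimensional matrix it gives the same result as t()
m <- matrix(1:9, 3)
same <- identical(aperm(m), t(m))
```

The perm argument lists the original dimensions in their new order, so an element found at a[j, k, i] ends up at b[i, j, k] in this example.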
