Adding a column

We can do so using the := operator. R's parser recognizes :=, but base R gives it no meaning; data.table defines it to update a table in place (by reference). In other words, it avoids making a copy of the dataset in order to add a new column, as shown:

dstate[,Region:=state.region] 
dstate[1:3] # We can see that the Region column has been added 
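
The effect of updating by reference can be checked with a short sketch. It assumes dstate was built from R's built-in state.x77 matrix (the construction is not shown in this section), and the names alias and snapshot are hypothetical, introduced only for illustration:

```r
library(data.table)

# Assumption: dstate comes from state.x77; the original construction
# is not shown in this section.
dstate <- as.data.table(state.x77, keep.rownames = "State")

alias <- dstate                  # a second name for the same table, not a copy
alias[, Region := state.region]  # updates the shared table by reference
"Region" %in% names(dstate)      # TRUE: dstate sees the new column too

snapshot <- copy(dstate)         # copy() makes an independent table
snapshot[, Region := NULL]       # removes the column from the copy only
"Region" %in% names(dstate)      # still TRUE
```

When an untouched original is needed, taking a copy() before using := is the usual safeguard.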

We can use := to add multiple columns.

For instance, to add the division and abbreviation of each state, we can use the following:

dstate[,c("Division","Abb"):=.(state.division, state.abb)] 
dstate[1:3] # We can see that the new columns, Division and Abb have been added 
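
The same update can also be written in the functional form of :=, which pairs each new column name with its value. The sketch below again assumes dstate is built from state.x77, and it also shows that assigning NULL removes a column by reference:

```r
library(data.table)

# Assumption: dstate comes from state.x77 (construction not shown in the text).
dstate <- as.data.table(state.x77, keep.rownames = "State")

# Functional form of :=, one name = value pair per new column
dstate[, `:=`(Division = state.division, Abb = state.abb)]

# Assigning NULL drops a column, again without copying the table
dstate[, Abb := NULL]
names(dstate)
```

Which form to use is largely a matter of taste; the functional form keeps each name next to its value when many columns are added at once.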

To find the sum of Population grouped by Region, we can use the following:

dstate[,.(Sum_Pop=sum(Population)),by=Region]  

To find the sum of Population grouped by Region and Division, we can use the following:

dstate[,.(Sum_Pop=sum(Population)),by=.(Region,Division)]  

Notice that we use the .() notation whenever we name several columns or build the result columns in such operations. A data.table is, in essence, a list of columns, where each column is a vector of the same length. Hence, to refer to a set of columns, we use .() or its equivalent list() rather than c(), which would concatenate the columns into a single vector.
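
The difference between .() and c() can be seen in a small sketch. It assumes dstate is built from state.x77 with the Region and Division columns added as above; the result names (as_list, as_vec, by_dot, and so on) are hypothetical:

```r
library(data.table)

# Assumption: dstate comes from state.x77 (construction not shown in the text).
dstate <- as.data.table(state.x77, keep.rownames = "State")
dstate[, c("Region", "Division") := .(state.region, state.division)]

as_list <- dstate[, .(Population, Income)]  # a data.table with two columns
as_vec  <- dstate[, c(Population, Income)]  # one numeric vector of length 100

# .() is shorthand for list(); by also accepts a character vector of names
by_dot  <- dstate[, .(Sum_Pop = sum(Population)), by = .(Region, Division)]
by_list <- dstate[, .(Sum_Pop = sum(Population)), by = list(Region, Division)]
by_chr  <- dstate[, .(Sum_Pop = sum(Population)), by = c("Region", "Division")]
```

All three grouped calls should produce the same table; only the way the grouping columns are spelled differs.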

To perform an operation across multiple columns, we can use the built-in .SD symbol (which stands for Subset of Data). Within each group, .SD holds that group's rows with the grouping columns left out. For instance, to find the first row corresponding to each region, we can use .SD with head() as follows:

dstate[,head(.SD,1), by=.(Region)]
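
Because .SD is itself a data.table, it can be subset like one. As a sketch under the same assumption (dstate built from state.x77, with the state names kept in a column called State, a name chosen here for illustration), the following picks the most populous state in each region instead of the first row:

```r
library(data.table)

# Assumption: dstate comes from state.x77; "State" is a column name chosen
# here to hold the row names.
dstate <- as.data.table(state.x77, keep.rownames = "State")
dstate[, Region := state.region]

# Within each Region, keep the row of .SD with the largest Population
top <- dstate[, .SD[which.max(Population)], by = Region]
top[, .(Region, State, Population)]
```

The result has one row per region, just like the head(.SD, 1) version, but the row is chosen by value rather than by position.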

We can also restrict .SD to a fixed set of columns with the .SDcols argument. For example, to find the maximum of Population and Income grouped by Region, we can combine .SD with lapply() to run the max() function on each column named in .SDcols, as shown:

 dstate[,lapply(.SD,max), by=.(Region), .SDcols=c("Population","Income")] 

To perform the same operation and also add the minimum per Region, we can concatenate the two lists of results with c(); the maxima come first, followed by the minima:

dstate[,c(lapply(.SD, max), lapply(.SD, min)), by=.(Region), .SDcols=c("Population","Income")]  
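
Since both lapply() calls return lists named after the same columns, the maxima and minima are hard to tell apart in the output. One possible way to label them, sketched here with hypothetical max_ and min_ prefixes and the same assumed construction of dstate, is to rename each list with setNames() before combining:

```r
library(data.table)

# Assumption: dstate comes from state.x77 (construction not shown in the text).
dstate <- as.data.table(state.x77, keep.rownames = "State")
dstate[, Region := state.region]

cols <- c("Population", "Income")

# Prefix each result list so the maxima and minima get distinct names
res <- dstate[, c(setNames(lapply(.SD, max), paste0("max_", cols)),
                  setNames(lapply(.SD, min), paste0("min_", cols))),
              by = Region, .SDcols = cols]
res
```

The prefixes are only a naming choice for this sketch; any scheme that keeps the column names unique works.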