Indexing or subsetting dataframes

When working with a client dataset that has a large number of observations, you often need to subset the data based on some selection criteria, or to draw a sample with or without replacement. Indexing is the process of extracting a subset of a data frame based on logical conditions. The which function can be used inside the index to extract the matching rows:

> newdata <- audit[ which(audit$Gender=="Female" & audit$Age > 65), ]
> rownames(newdata)
 [1] "49"   "537"  "552"  "561"  "586"  "590"  "899"  "1200" "1598" "1719"

The preceding code selects those observations from the audit dataset where the gender is female and the age is more than 65 years. The which function returns the positions of the rows that satisfy both criteria, and those positions are then used to index audit. Ten observations satisfy the condition; their row names are printed above. The same rows can be obtained with the subset function. It is more convenient than which for passing multiple conditions, because columns can be referenced by name without the audit$ prefix. Let's take a look at the way the subset function is used:

> newdata <- subset(audit, Gender=="Female" & Age > 65, select=Employment:Income)
> rownames(newdata)
 [1] "49"   "537"  "552"  "561"  "586"  "590"  "899"  "1200" "1598" "1719"
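
One practical difference between these forms is how they treat missing values in the condition. The following is a minimal sketch, assuming a small made-up data frame named toy (hypothetical, standing in for audit so the example runs without the original data). It compares plain logical indexing, indexing with which, and subset when one Age value is NA:

```r
# Hypothetical toy data standing in for the audit dataset (values are made up)
toy <- data.frame(
  Gender = c("Female", "Male", "Female", "Female", "Male"),
  Age    = c(70, 68, NA, 66, 40),
  Income = c(1000, 2000, 3000, 4000, 5000),
  stringsAsFactors = FALSE
)

# The condition is NA for the row whose Age is missing
cond <- toy$Gender == "Female" & toy$Age > 65

by_logical <- toy[cond, ]          # an NA in the index yields an all-NA row
by_which   <- toy[which(cond), ]   # which() keeps only the TRUE positions
by_subset  <- subset(toy, Gender == "Female" & Age > 65)  # NA treated as FALSE

nrow(by_logical)     # 3: two matches plus one all-NA row
rownames(by_which)   # "1" "4"
rownames(by_subset)  # "1" "4"
```

Both which and subset drop the row with the missing Age, which is usually what is wanted when filtering.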

The additional select argument makes the subset function more convenient, because it returns only the specified columns (here, Employment through Income) for the rows where the logical condition is satisfied.
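
To illustrate what select does, here is a small sketch on a made-up data frame. The name toy and its column layout are assumptions for illustration and are not the real audit columns. It shows that a column range in select corresponds to indexing by the positions of those columns:

```r
# Hypothetical toy data; the column order is chosen only for illustration
toy <- data.frame(
  Age        = c(70, 30, 68),
  Gender     = c("Female", "Female", "Male"),
  Employment = c("Private", "Consultant", "Private"),
  Education  = c("College", "Master", "Bachelor"),
  Income     = c(1000, 2000, 3000),
  Deductions = c(0, 10, 20),
  stringsAsFactors = FALSE
)

# Rows by condition, columns by a name range
a <- subset(toy, Gender == "Female" & Age > 65, select = Employment:Income)

# The same selection written with explicit row and column indexing
cols <- which(names(toy) == "Employment"):which(names(toy) == "Income")
b <- toy[which(toy$Gender == "Female" & toy$Age > 65), cols]

names(a)     # "Employment" "Education" "Income"
rownames(a)  # "1"
```

Because Employment:Income is a positional range, the columns returned depend on the column order of the data frame.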
