Another function we can use is filter. It takes a data frame and a filtering condition as arguments, passes over each row of the data frame, and returns the rows that satisfy the condition:
# keep only players with more than 200 hits in a season
over200 <- filter(players, h > 200)
head(over200)
nrow(over200)
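To see the row-by-row behavior on something small enough to check by eye, here is a minimal sketch. It assumes filter comes from the dplyr package, and the toy data frame (with made-up names and values) is purely hypothetical; it only borrows the h and hr column names from the players data.

```r
library(dplyr)

# Hypothetical toy data, NOT the real players data set
toy <- data.frame(
  name = c("a", "b", "c", "d"),
  h    = c(150, 210, 230, 199),
  hr   = c(10, 45, 12, 50)
)

# Keep only the rows where h is greater than 200
over200_toy <- filter(toy, h > 200)

print(over200_toy)        # rows for "b" and "c"
print(nrow(over200_toy))  # number of rows that passed the condition
```

The row with 199 hits is dropped because the comparison is strictly greater than 200.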
It looks like many players were capable of 200 hits in a season. What about the players who also hit more than 40 home runs in that same season?
over200and40hr <- filter(players, h > 200 & hr > 40)
head(over200and40hr)
nrow(over200and40hr)
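Conditions can be combined in more than one way. The sketch below again assumes dplyr's filter and uses a hypothetical toy data frame rather than the real players data. It shows that joining conditions with & and passing them as separate arguments select the same rows, while | keeps rows that meet either condition.

```r
library(dplyr)

# Hypothetical toy data, NOT the real players data set
toy <- data.frame(
  name = c("a", "b", "c", "d"),
  h    = c(150, 210, 230, 199),
  hr   = c(10, 45, 12, 50)
)

# Both conditions must hold: explicit & ...
both_amp <- filter(toy, h > 200 & hr > 40)

# ... or separate arguments, which dplyr combines with "and"
both_comma <- filter(toy, h > 200, hr > 40)

# Either condition may hold
either <- filter(toy, h > 200 | hr > 40)

print(both_amp)
print(identical(as.character(both_amp$name), as.character(both_comma$name)))
print(nrow(either))
```

Because both conditions are tested against the same row, a player only qualifies when the hits and the home runs come from the same season record.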
It's a very small list. I know that player names are somewhat mangled, but you can recognize a few, such as Babe Ruth.
I wonder whether any player had more than 300 hits in a season.
filter(players, h > 300)
No records met our filter. Interestingly, the results handler requires a number of columns to display and throws an error because, in this case, there are none. Errors in R usually point to a programming mistake, so it is unusual to see one for what I would consider a normal empty result.
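One way to sidestep the problem is to store the result and check its row count before displaying it. This is only a sketch, assuming dplyr's filter and assuming the error comes from displaying the empty result rather than from the filtering step; the toy data frame is hypothetical.

```r
library(dplyr)

# Hypothetical toy data, NOT the real players data set
toy <- data.frame(
  name = c("a", "b", "c", "d"),
  h    = c(150, 210, 230, 199),
  hr   = c(10, 45, 12, 50)
)

# No row in the toy data has more than 300 hits
none <- filter(toy, h > 300)

# Check the row count before handing the result to the display
if (nrow(none) == 0) {
  message("No rows matched the filter")
} else {
  print(head(none))
}
```

Assigning the result to a variable keeps the filtering step separate from the display step, so an empty result can be handled explicitly.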