How to do it...

  1. Read in the movie dataset, set the index to the movie_title, and create the first set of criteria:
>>> import pandas as pd
>>> movie = pd.read_csv('data/movie.csv', index_col='movie_title')
>>> crit_a1 = movie.imdb_score > 8
>>> crit_a2 = movie.content_rating == 'PG-13'
>>> crit_a3 = (movie.title_year < 2000) | (movie.title_year > 2009)
>>> final_crit_a = crit_a1 & crit_a2 & crit_a3
  2. Create criteria for the second set of movies:
>>> crit_b1 = movie.imdb_score < 5
>>> crit_b2 = movie.content_rating == 'R'
>>> crit_b3 = ((movie.title_year >= 2000) &
...            (movie.title_year <= 2010))
>>> final_crit_b = crit_b1 & crit_b2 & crit_b3
  3. Combine the two sets of criteria using the bitwise or operator, |. This yields a boolean Series that is True for every movie belonging to either set:
>>> final_crit_all = final_crit_a | final_crit_b
>>> final_crit_all.head()
movie_title
Avatar                                        False
Pirates of the Caribbean: At World's End      False
Spectre                                       False
The Dark Knight Rises                          True
Star Wars: Episode VII - The Force Awakens    False
dtype: bool
  4. Once you have your boolean Series, you simply pass it to the indexing operator to filter the data:
>>> movie[final_crit_all].head()
  5. The rows are now filtered, but the result still contains every column of the DataFrame, which makes it hard to check by eye whether the filter worked. Let's filter both rows and columns with the .loc indexer:
>>> cols = ['imdb_score', 'content_rating', 'title_year']
>>> movie_filtered = movie.loc[final_crit_all, cols]
>>> movie_filtered.head(10)
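
The steps above can be tried without the movie dataset. The following is a minimal sketch, not the recipe's own code: the toy DataFrame, its single-letter titles, and its values are all made up for illustration, and Series.between is swapped in for the two chained comparisons in the second set (it includes both endpoints by default).

```python
import pandas as pd

# Hypothetical toy data standing in for data/movie.csv; titles and values are invented.
toy = pd.DataFrame(
    {
        "imdb_score": [8.5, 4.2, 7.0, 8.9, 3.1],
        "content_rating": ["PG-13", "R", "PG-13", "PG-13", "R"],
        "title_year": [2012, 2005, 1995, 2004, 2015],
    },
    index=pd.Index(["A", "B", "C", "D", "E"], name="movie_title"),
)

# First set: score above 8, rated PG-13, released before 2000 or after 2009.
crit_a = (
    (toy.imdb_score > 8)
    & (toy.content_rating == "PG-13")
    & ((toy.title_year < 2000) | (toy.title_year > 2009))
)

# Second set: score below 5, rated R, released from 2000 through 2010.
# between() is inclusive on both ends by default, matching >= 2000 and <= 2010.
crit_b = (
    (toy.imdb_score < 5)
    & (toy.content_rating == "R")
    & toy.title_year.between(2000, 2010)
)

# Keep rows in either set, and only the columns the criteria depend on.
cols = ["imdb_score", "content_rating", "title_year"]
toy_filtered = toy.loc[crit_a | crit_b, cols]
print(toy_filtered)
```

Only movies A and B survive the filter: D has a high score but falls in the excluded years for the first set, and E has a low score but falls outside the year range of the second set. Note that each comparison is wrapped in parentheses, because & and | bind more tightly than comparison operators such as > and ==.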