How it works...

The sort_values method can nearly replicate nlargest by chaining the head method after the operation, as seen in step 2. Step 3 replicates nsmallest by chaining another sort_values and completes the query by taking just the first five rows with the head method.
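As a rough sketch of that equivalence, the two chains can be compared on a small made-up DataFrame. The `toy` data and its values are hypothetical and chosen to contain no ties; the `budget` column is also an assumption for illustration, while `imdb_score` follows the column used in this recipe:

```python
import pandas as pd

# Hypothetical toy data with no tied scores or budgets
toy = pd.DataFrame({
    'movie_title': ['A', 'B', 'C', 'D', 'E'],
    'imdb_score': [9.1, 8.7, 8.4, 7.9, 6.5],
    'budget': [50, 10, 30, 5, 20],
})

# Top 3 by score, then the 2 cheapest of those
via_select = toy.nlargest(3, 'imdb_score').nsmallest(2, 'budget')

# The same query expressed with sort_values and head
via_sort = (toy.sort_values('imdb_score', ascending=False)
               .head(3)
               .sort_values('budget')
               .head(2))

# With no ties in the data, both approaches select the same rows
print(via_select.equals(via_sort))
```

When every value is distinct, as here, the two chains agree; they can only diverge when values tie.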

Compare the DataFrame output in step 1 with the output from step 3. Are they the same? No! What happened? To understand why the two results are not equivalent, let's look at the tail of the intermediate step of each approach:

>>> movie2.nlargest(100, 'imdb_score').tail()
>>> movie2.sort_values('imdb_score', ascending=False).head(100).tail()

The issue arises because more than 100 movies exist with a rating of at least 8.4, so several movies tie at the cutoff score. The two methods, nlargest and sort_values, break these ties differently, which results in slightly different 100-row DataFrames.
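The tie-breaking behavior can be illustrated on a small made-up DataFrame. This is only a sketch: the `tied` data is hypothetical, and using `keep='all'` or `kind='mergesort'` is one possible way to make the selection predictable, not something the recipe itself does:

```python
import pandas as pd

# Hypothetical data: four titles tie at the cutoff score of 8.4
tied = pd.DataFrame({
    'movie_title': ['A', 'B', 'C', 'D', 'E', 'F'],
    'imdb_score': [9.0, 8.4, 8.4, 8.4, 8.4, 7.0],
})

# nlargest defaults to keep='first': among ties, earlier rows win
top3_first = tied.nlargest(3, 'imdb_score')

# The default sort algorithm is not guaranteed to be stable, so which
# tied rows survive head(3) may differ from nlargest
top3_sorted = tied.sort_values('imdb_score', ascending=False).head(3)

# keep='all' returns every row tied with the cutoff, even beyond n rows
top3_all = tied.nlargest(3, 'imdb_score', keep='all')

# A stable sort keeps tied rows in their original order
top3_stable = (tied.sort_values('imdb_score', ascending=False,
                                kind='mergesort')
                   .head(3))

print(top3_first['movie_title'].tolist())
print(top3_sorted['movie_title'].tolist())
print(len(top3_all))
print(top3_stable['movie_title'].tolist())
```

Both selections always contain the same scores; only the choice among the tied rows can vary.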
