How it works...

In step 1, we form groups by the origin and destination airport columns and then apply the size method to the groupby object, which returns the total number of rows in each group. We could have passed the string 'size' to the agg method to achieve the same result. In step 2, we select the total number of flights for each direction between Atlanta and Houston. The Series flights_count has a MultiIndex with two levels. One way to select rows from a MultiIndex is to pass the loc indexing operator a tuple of exact level values. Here, we select two different rows, ('ATL', 'HOU') and ('HOU', 'ATL'), so we pass loc a list of two tuples rather than a single tuple.
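
The first two steps can be sketched on a small stand-in for the flights data. The five rows below are made up for illustration and are not the recipe's dataset; only the column names ORG_AIR and DEST_AIR and the name flights_count come from the recipe, and both_directions is a name chosen here.

```python
import pandas as pd

# Made-up rows standing in for the real flights data (assumption).
flights = pd.DataFrame({
    'ORG_AIR':  ['ATL', 'HOU', 'ATL', 'LAX', 'SLC'],
    'DEST_AIR': ['HOU', 'ATL', 'HOU', 'SLC', 'LAX'],
})

grouped = flights.groupby(['ORG_AIR', 'DEST_AIR'])

# size() counts the rows in each (origin, destination) group.
flights_count = grouped.size()

# Passing the string 'size' to agg is an alternative spelling.
flights_count_agg = grouped.agg('size')

# A list of tuples selects several rows of the MultiIndex at once.
both_directions = flights_count.loc[[('ATL', 'HOU'), ('HOU', 'ATL')]]
print(both_directions)
```

Each tuple names one exact (origin, destination) pair, so the result still treats the two directions as separate rows.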

Step 3 is the most important step in the recipe. We would like to have just one label for all flights between Atlanta and Houston, and so far we have two. If we alphabetically sort each combination of origin and destination airports, each pair of airports ends up with a single label regardless of the direction of travel. To do this, we use the DataFrame apply method. This is different from the groupby apply method. No groups are formed in step 3.
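
The idea behind the sorting can be seen without pandas at all. In this small sketch, route_a, route_b, label_a, and label_b are names chosen only for illustration:

```python
# Two directed routes between the same pair of airports.
route_a = ('ATL', 'HOU')
route_b = ('HOU', 'ATL')

# Sorting the codes alphabetically removes the direction,
# so both routes collapse to the same label.
label_a = tuple(sorted(route_a))
label_b = tuple(sorted(route_b))

print(label_a == label_b)
```

Because both labels are identical, a later grouping on the sorted codes counts both directions together.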

The DataFrame apply method must be passed a function. In this case, it's the built-in sorted function. By default, this function gets applied to each column as a Series. We can change the direction of computation by using axis=1 (or axis='columns'). The sorted function then has each row of data passed to it implicitly as a Series, and it returns a list of sorted airport codes. Here is an example of passing the first row as a Series to the sorted function:

>>> sorted(flights.loc[0, ['ORG_AIR', 'DEST_AIR']])
['LAX', 'SLC']
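
The effect of the axis argument is easy to check with a function whose result depends on what it receives. The following sketch uses the built-in len on a made-up three-row frame (the data and the names pairs, per_column, per_row, and per_row_named are assumptions for illustration):

```python
import pandas as pd

# Made-up airport pairs: three rows, two columns (assumption).
pairs = pd.DataFrame({
    'ORG_AIR':  ['ATL', 'HOU', 'LAX'],
    'DEST_AIR': ['HOU', 'ATL', 'SLC'],
})

# Default direction: len receives each column, which has three values.
per_column = pairs.apply(len)

# axis=1: len receives each row, which has two values.
per_row = pairs.apply(len, axis=1)

# axis='columns' is the string spelling of axis=1.
per_row_named = pairs.apply(len, axis='columns')

print(per_column)
print(per_row)
```

The first result is indexed by the column names and the second by the row labels, which shows which direction the function was applied in.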

The apply method iterates over all rows, calling sorted on each one in exactly this manner. After this operation completes, each row is independently sorted, so the original column names are no longer meaningful. We rename the columns in the next step and then perform the same grouping and aggregating as was done in step 1. This time, all flights between Atlanta and Houston fall under the same label.
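
Putting the pieces together, a minimal sketch of the sort, rename, and regroup sequence might look like the following. The data is made up, the column names AIR1 and AIR2 are placeholders chosen here, and result_type='expand' is an assumption about the installed pandas: in newer versions, apply with axis=1 otherwise returns a Series of lists instead of a DataFrame.

```python
import pandas as pd

# Made-up rows standing in for the real flights data (assumption).
flights = pd.DataFrame({
    'ORG_AIR':  ['ATL', 'HOU', 'ATL', 'LAX', 'SLC'],
    'DEST_AIR': ['HOU', 'ATL', 'HOU', 'SLC', 'LAX'],
})

# Sort the two airport codes within each row; 'expand' spreads each
# sorted list back across two columns (assumes a pandas version that
# supports the result_type argument).
sorted_pairs = flights[['ORG_AIR', 'DEST_AIR']].apply(
    sorted, axis=1, result_type='expand'
)

# Origin and destination no longer apply, so use neutral placeholder names.
sorted_pairs.columns = ['AIR1', 'AIR2']

# Same grouping as before, now independent of direction.
undirected_count = sorted_pairs.groupby(['AIR1', 'AIR2']).size()

# A single tuple is enough, because only one label remains per pair.
atl_hou = undirected_count.loc[('ATL', 'HOU')]
print(atl_hou)
```

In this toy data the two directed counts for Atlanta and Houston merge into one total under the label ('ATL', 'HOU').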
