Merging on index

Sometimes the keys for merging dataframes are located in a dataframe's index. In such a situation, we can pass left_index=True or right_index=True (or both) to indicate that the index should be used as the merge key.
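As a minimal sketch of the case where both sides keep their keys in the index, the following example passes both flags together. The dataframes prices and stock and their values are hypothetical and are used only for illustration:

```python
import pandas as pd

# Hypothetical dataframes whose merge keys live in the index, not in a column.
prices = pd.DataFrame({'price': [1.5, 0.8]}, index=['apple', 'ball'])
stock = pd.DataFrame({'stock': [10, 25]}, index=['ball', 'apple'])

# Both sides use their index as the merge key, so rows are matched by label.
merged = pd.merge(prices, stock, left_index=True, right_index=True)
print(merged)
```

Rows are aligned by index label rather than by position, so the differing row order of the two inputs does not matter.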

Merging on index is done in the following steps:

  1. Consider the following two dataframes:
import pandas as pd
left1 = pd.DataFrame({'key': ['apple', 'ball', 'apple', 'apple', 'ball', 'cat'], 'value': range(6)})
right1 = pd.DataFrame({'group_val': [33.4, 5]}, index=['apple', 'ball'])

If you print these two dataframes, the output looks like the following screenshot:

Note that the keys in the first dataframe are apple, ball, and cat. In the second dataframe, we have group values for the keys apple and ball.

  2. Now, let's consider two different cases. First, let's try merging using an inner join, which is the default type of merge. An inner join keeps only the keys that are present in both dataframes. Check the following example code:
df = pd.merge(left1, right1, left_on='key', right_index=True)
df

The output of the preceding code is as follows:

The output is the intersection of the keys from these dataframes. Since there is no cat key in the index of the second dataframe, the cat row is not included in the final table.
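The effect of the inner join can be checked directly, as in the following self-contained sketch. It rebuilds the two dataframes from the first step; the variable name inner is an assumption made for this illustration:

```python
import pandas as pd

left1 = pd.DataFrame({'key': ['apple', 'ball', 'apple', 'apple', 'ball', 'cat'],
                      'value': range(6)})
right1 = pd.DataFrame({'group_val': [33.4, 5]}, index=['apple', 'ball'])

# Inner join (the default): the 'key' column on the left is matched
# against the index on the right.
inner = pd.merge(left1, right1, left_on='key', right_index=True)

print(sorted(set(inner['key'])))  # ['apple', 'ball']
print(len(inner))                 # 5: the single cat row was dropped
```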

  3. Second, let's try merging using an outer join, as follows:
df = pd.merge(left1, right1, left_on='key', right_index=True, how='outer')
df

The output of the preceding code is as follows:

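Note that the last row includes the cat key. Because this is an outer join, keys that appear in only one of the dataframes are kept, and the missing group_val for cat is filled with NaN.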
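The following self-contained sketch shows one way to confirm this behavior. It rebuilds the same two dataframes; the names outer and cat_rows are assumptions made for this illustration:

```python
import pandas as pd

left1 = pd.DataFrame({'key': ['apple', 'ball', 'apple', 'apple', 'ball', 'cat'],
                      'value': range(6)})
right1 = pd.DataFrame({'group_val': [33.4, 5]}, index=['apple', 'ball'])

# Outer join: keep every key from either side.
outer = pd.merge(left1, right1, left_on='key', right_index=True, how='outer')

# The cat row survives, but it has no matching group value.
cat_rows = outer[outer['key'] == 'cat']
print(cat_rows)
print(outer['group_val'].isna().sum())  # number of rows without a match
```

Counting the NaN values in group_val is a quick way to see how many rows on the left had no partner on the right.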
