Applying a function to a pandas series or DataFrame

In this section, we will learn how to apply Python's built-in and user-defined functions to pandas data objects, first to a single Series and then to several columns of a DataFrame.

We will begin by importing the pandas and NumPy modules into our Jupyter Notebook:

import pandas as pd 
import numpy as np

We will then read in our CSV dataset:

data = pd.read_csv('data-titanic.csv')
data.head()

Let's proceed to applying functions using pandas' apply() method. In this example, we will create a function using lambda, as follows:

func_lower = lambda x: x.lower()

Here, the function takes a value x and converts it to lowercase. We then apply this function to the Name field in the dataset using the apply() method, as shown here:

data.Name.apply(func_lower)
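The same call can be tried without the CSV file. The following is a minimal sketch, assuming a small hand-made Series (the two names are illustrative stand-ins for the Name column). It also shows the built-in .str.lower() accessor, which gives the same values for string data:

```python
import pandas as pd

# Hypothetical stand-in for the Name column of the Titanic dataset
names = pd.Series(
    ["Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina"], name="Name"
)

func_lower = lambda x: x.lower()

# apply() calls func_lower once per value and returns a new Series
lowered_apply = names.apply(func_lower)

# The .str accessor performs the same conversion without a custom function
lowered_str = names.str.lower()

print(lowered_apply)
```

Note that apply() returns a new Series and leaves the original untouched. To keep the result, assign it back, for example data['Name'] = data.Name.apply(func_lower).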

In the output of the apply() call, the values in the Name field have been converted to lowercase.

Next, we will see how to apply functions to the values in multiple columns, or in a whole DataFrame. We can use the applymap() method for this. It works much like the apply() method on a Series, but it applies the function to every value in the selected columns or in the whole DataFrame. The following code applies NumPy's square function to the Age and Pclass columns using applymap():

data[['Age', 'Pclass']].applymap(np.square)

Here, NumPy's square function is applied to every value in these two columns.
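The element-wise behavior can be checked on a small hand-made DataFrame. This is a sketch under assumptions: the three rows are illustrative values, not taken from the dataset, and the element-wise step is written with apply() plus Series.map() so that it does not depend on which pandas version is installed. Since np.square is a NumPy universal function, it can also be called on the DataFrame directly:

```python
import numpy as np
import pandas as pd

# Hypothetical sample with the same column names as the Titanic dataset
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0], "Pclass": [3, 1, 3]})

# Element-wise: apply() visits each column, Series.map() visits each value
squared_elementwise = df[["Age", "Pclass"]].apply(lambda col: col.map(np.square))

# Vectorized: a NumPy ufunc accepts the DataFrame as a whole
squared_vectorized = np.square(df[["Age", "Pclass"]])

print(squared_elementwise)
```

Both results hold the same values. For functions that NumPy already provides, the vectorized call is usually the faster choice on large data.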

The preceding steps are for a predefined function. Let's now proceed to create our own function and then apply it to the values, as follows:

def my_func(i):
    return i + 20

The function that was created is a simple function that takes a value, adds 20 to it, and then returns the result. We use applymap() to apply this function to every value in the Age and Pclass columns, as shown here:

data[['Age', 'Pclass']].applymap(my_func)
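One caveat: from pandas 2.1 onward, applymap() is deprecated in favor of DataFrame.map(), which does the same job. The following sketch, assuming a small illustrative DataFrame in place of the CSV data, picks whichever method is available and compares the result with the plain arithmetic expression:

```python
import pandas as pd

def my_func(i):
    return i + 20

# Hypothetical sample with the same column names as the Titanic dataset
df = pd.DataFrame({"Age": [22.0, 38.0], "Pclass": [3, 1]})
subset = df[["Age", "Pclass"]]

# Use DataFrame.map() where it exists (pandas 2.1+), else fall back to applymap()
elementwise = subset.map if hasattr(subset, "map") else subset.applymap
shifted = elementwise(my_func)

# For simple arithmetic, the operator applied to the DataFrame gives the same values
same_result = subset + 20

print(shifted)
```

An element-wise method is most useful when the function contains logic that cannot be written as a single arithmetic or NumPy expression.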

Let's move on to learning about merging and concatenating multiple DataFrames into one.
