Data wrangling

Figure 12.2 shows that each column has the same number of items, which indicates that there are no missing values.

We can verify this with the pandas DataFrame.info() method, as shown here:

df_red.info()

The output of the preceding code is as follows:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1599 entries, 0 to 1598
Data columns (total 12 columns):
fixed acidity           1599 non-null float64
volatile acidity        1599 non-null float64
citric acid             1599 non-null float64
residual sugar          1599 non-null float64
chlorides               1599 non-null float64
free sulfur dioxide     1599 non-null float64
total sulfur dioxide    1599 non-null float64
density                 1599 non-null float64
pH                      1599 non-null float64
sulphates               1599 non-null float64
alcohol                 1599 non-null float64
quality                 1599 non-null int64
dtypes: float64(11), int64(1)
memory usage: 150.0 KB

As shown in the preceding output, none of the columns contain null values, so we don't need to handle missing data. Had there been any, we would have dealt with them using the techniques outlined in Chapter 4, Data Transformation.
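For illustration only, here is a minimal sketch of what handling missing values could look like. It uses a small hypothetical DataFrame, df_demo, rather than df_red, and shows two common pandas options (dropping rows and filling with the column median); these are assumptions for the example and not necessarily the exact techniques used in Chapter 4:

```python
import pandas as pd

# Hypothetical stand-in data (not the real wine dataset), with one
# missing pH value so that there is something to handle.
df_demo = pd.DataFrame({
    "pH": [3.51, None, 3.26],
    "quality": [5, 5, 6],
})

# Option 1: drop every row that contains at least one missing value.
df_dropped = df_demo.dropna()

# Option 2: replace the missing pH value with the median of the column.
df_filled = df_demo.fillna({"pH": df_demo["pH"].median()})

print(df_dropped)
print(df_filled)
```

Dropping rows is the simpler choice but discards data, while filling keeps every row at the cost of introducing an estimated value.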

We can also assess data quality and check for missing values using the approaches shown in Chapter 4, Data Transformation. For example, the pandas expression df_red.isnull().sum() returns the number of null entries in each column.
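The following sketch shows how the isnull().sum() check behaves. Because df_red is loaded earlier in the chapter, the example uses a small hypothetical DataFrame, df_demo, with one deliberately missing value; the variable names are assumptions made for this illustration:

```python
import pandas as pd

# Hypothetical stand-in for df_red: two of the wine columns, three rows,
# with one alcohol value deliberately left missing.
df_demo = pd.DataFrame({
    "pH": [3.51, 3.20, 3.26],
    "alcohol": [9.4, None, 9.8],
})

# Count the missing entries in each column.
missing_per_column = df_demo.isnull().sum()
print(missing_per_column)

# Single yes/no answer: is any cell in the DataFrame missing?
has_missing = bool(df_demo.isnull().values.any())
print(has_missing)
```

Running the same expression on df_red should report zero for every column, which would agree with the info() output shown earlier.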

Since no further data transformation steps are needed, let's move on to the analysis of the red wine data in the next section.
