Data visualization with matplotlib (and its pandas interface)

We have already worked with matplotlib: first in Chapter 3, Functions, and later in Chapter 8, Simulation with Classes and Inheritance. Luckily, pandas has a built-in interface to matplotlib, which makes basic visualization quick and intuitive. First, though, we need to prepare the Notebook to display charts inline:

%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

Now, let's plot a histogram of total casualties. We could do that with matplotlib directly, but pandas offers a simpler built-in interface. For better or worse, we still have to set labels and titles through the standard matplotlib interface, as follows:

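# Histogram of 'killed total' for the rows where mask is False
data.loc[~mask, 'killed total'].hist(bins=20, figsize=(10,10))

# Title and axis labels are set through matplotlib's pyplot interface
plt.suptitle('Histogram, overall casualties per battle')
plt.xlabel("killed")
plt.ylabel("frequency")
plt.tight_layout();  # the trailing semicolon suppresses the cell's text output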

We will get the following result:
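
For comparison, the same kind of histogram can be drawn with matplotlib alone. The following is only a minimal sketch, not the chapter's code. Since `data` and `mask` are defined earlier in the chapter, it uses a randomly generated Series (`casualties`, a hypothetical stand-in with made-up values) in place of `data.loc[~mask, 'killed total']`:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for data.loc[~mask, 'killed total']; values are random
rng = np.random.default_rng(0)
casualties = pd.Series(rng.integers(0, 50_000, size=200), name="killed total")

# Create the figure and axes explicitly instead of letting pandas do it
fig, ax = plt.subplots(figsize=(10, 10))

# Drop missing values before plotting, then bin into 20 equal-width bins
counts, bin_edges, _ = ax.hist(casualties.dropna(), bins=20)

ax.set_title("Histogram, overall casualties per battle")
ax.set_xlabel("killed")
ax.set_ylabel("frequency")
fig.tight_layout()
```

The pandas call above is shorter, but working with the `ax` object directly can be handy when several charts have to share one figure.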

We were able to gather a general understanding of the data, but some pieces are still missing. To drill down and understand the properties of significant subsets (for example, the number of events in a given period, or the average casualties for specific fronts and operations), we need to aggregate our data.
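
As a rough illustration of what such an aggregation might look like, here is a small sketch that groups a toy table and computes the number of events and the average casualties per group. The `battles` frame, the `front` column name, and all values are made up for illustration. Only the `killed total` column name comes from the code above.

```python
import pandas as pd

# Toy table; the 'front' column and all numbers are hypothetical
battles = pd.DataFrame({
    "front": ["A", "A", "B", "B", "B"],
    "killed total": [100, 300, 50, 150, 400],
})

# Number of events and mean casualties for each front
summary = battles.groupby("front")["killed total"].agg(["count", "mean"])
print(summary)
```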
