Data visualization with matplotlib (and its pandas interface)

We have already worked with matplotlib: first in Chapter 3, Functions, and later in Chapter 8, Simulation with Classes and Inheritance. Luckily, pandas has a built-in interface to matplotlib that makes visualization easy and intuitive. But first, we need to prepare the Notebook to display charts:

%matplotlib inline
# pyplot is the recommended plotting interface; pylab is discouraged
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

Now, let's plot a histogram of total casualties. It's possible to do that with matplotlib itself, but pandas has a simple built-in interface. For better or worse, we still have to set labels and titles via the standard matplotlib interface, in this case as follows:

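# `data` and `mask` are defined earlier; ~mask selects the rows where mask is False
data.loc[~mask, 'killed total'].hist(bins=20, figsize=(10,10))

# labels and titles are set through the regular matplotlib interface
plt.suptitle('Histogram, overall casualties per battle')
plt.xlabel("killed")
plt.ylabel("frequency")
plt.tight_layout();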

As a result, we get a histogram of overall casualties per battle.
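For comparison, the same kind of histogram can be drawn with matplotlib alone. The sketch below is only an illustration under stated assumptions: `demo` is a small synthetic DataFrame with made-up values standing in for the real dataset, and `demo_mask` is a hypothetical mask that flags rows with a missing casualty count (the mask used in the text may be defined differently).

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset; the values are made up
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {"killed total": rng.exponential(scale=5000, size=200).round()}
)
demo.iloc[::25, 0] = np.nan  # blank out every 25th row to mimic missing values

# Hypothetical mask: True where the casualty count is missing
demo_mask = demo["killed total"].isnull()

# Pure matplotlib version: create the figure and axes explicitly,
# then draw the histogram and set the labels on the axes object
fig, ax = plt.subplots(figsize=(10, 10))
counts, edges, _ = ax.hist(demo.loc[~demo_mask, "killed total"], bins=20)
ax.set_title("Histogram, overall casualties per battle")
ax.set_xlabel("killed")
ax.set_ylabel("frequency")
fig.tight_layout()
```

The pandas `.hist()` call does the figure and axes setup for us, which is why it fits on one line; the explicit version is handy when more control over the axes is needed.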

We were able to gather a general understanding of the data, but some pieces are still missing. To drill down and understand the properties of significant subsets (for example, several events in a period of time, or average casualties for specific fronts and operations), we need to aggregate our data.
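To give a rough idea of what such aggregation can look like, here is a minimal sketch using pandas `groupby`. Everything in it is an assumption made for illustration: the rows are made up, and the `front` and `start` column names are hypothetical and may not match the real dataset.

```python
import pandas as pd

# Made-up rows for illustration; 'front' and 'start' are hypothetical columns
demo = pd.DataFrame({
    "front": ["A", "A", "B", "C", "B"],
    "start": pd.to_datetime(
        ["1941-03-01", "1942-05-10", "1944-02-15", "1942-09-30", "1944-11-20"]
    ),
    "killed total": [1000, 3000, 500, 200, 700],
})

# Average casualties for each front
mean_by_front = demo.groupby("front")["killed total"].mean()

# Number of events in each year
events_per_year = demo.groupby(demo["start"].dt.year).size()

print(mean_by_front)
print(events_per_year)
```

Both results are ordinary pandas Series, so they can be plotted through the same pandas interface to matplotlib used above.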
