For the following recipes, you will need Python installed on your computer and you will need the world's top incomes dataset. This recipe will help ensure you have set up everything you need to complete this analysis project.
To step through this recipe, you will need a computer with access to the Internet.
Make sure you have downloaded and installed Python and the necessary Python libraries to complete this project.
Refer to Chapter 1, Preparing Your Data Science Environment, to set up a Python development environment using virtualenv and install the required libraries for matplotlib and NumPy.
The following steps will guide you through downloading the world's top incomes dataset and installing the necessary Python libraries to complete this project:
The original dataset for the world's top incomes can be downloaded from http://topincomes.g-mond.parisschoolofeconomics.eu/. However, the site has been updated several times, which has changed the output format of the data (from .csv to .xlsx). This recipe assumes a .csv file format.
This chapter's repository contains the properly formatted version of the input data file.
>>> import numpy as np
>>> import jinja2
>>> import matplotlib.pyplot as plt
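The same check can also be run from a script rather than an interactive session. The following is a minimal sketch, not part of the original recipe: the helper name check_libraries is hypothetical, and it relies only on the standard library's importlib to report which of the modules imported above are available.

```python
import importlib


def check_libraries(names):
    """Return a dict mapping each module name to True if it imports, else False."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Module names taken from the imports used in this recipe.
    for name, ok in check_libraries(["numpy", "jinja2", "matplotlib"]).items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Any module reported as MISSING would need to be installed in the virtualenv before continuing.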
NumPy is the fundamental scientific computing library for Python; it is therefore essential to any data science toolkit, and we will leverage it in many places throughout the Python chapters. However, since NumPy is an external library that must be compiled for your system, we will discuss alternative native-Python approaches alongside the NumPy approach.
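To illustrate what a native-Python approach next to a NumPy approach can look like, here is a small sketch that computes an average both ways. The function names and sample values are made up for illustration and are not drawn from the dataset; NumPy is used only if it imports successfully.

```python
try:
    import numpy as np
except ImportError:  # NumPy is optional for this sketch
    np = None


def native_mean(values):
    """Average using only built-in Python."""
    values = list(values)
    return sum(values) / len(values)


def numpy_mean(values):
    """Average using NumPy; requires NumPy to be installed."""
    if np is None:
        raise RuntimeError("NumPy is not installed")
    return float(np.mean(values))


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0]  # illustrative values, not real data
    print(native_mean(sample))
    if np is not None:
        print(numpy_mean(sample))
```

Both functions should agree on the same input; the native version simply avoids the external dependency.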