The pandas library is a third-party package for Python rather than part of the language itself, so it has to be installed separately. At the time of writing this book, the latest stable version of pandas available is version 0.12. The various dependencies along with the associated download locations are as follows:
| Package | Required | Description | Download location |
|---|---|---|---|
| NumPy | Required | NumPy library for numerical operations | |
| python-dateutil | Required | Date manipulation and utility library | |
| pytz | Required | Time zone support | |
| | Optional, recommended | Speeding up of numerical operations | |
| | Optional, recommended | Performance-related | |
| Cython | Optional, recommended | C-extensions for Python used for optimization | |
| | Optional, recommended | Scientific toolset for Python | |
| | Optional | Library for HDF5-based storage | |
| | Optional, recommended | Matlab-like Python plotting library | |
| | Optional | Statistics module for Python | |
| | Optional | Library to read/write Excel files | |
| | Optional | Libraries to read/write Excel files | |
| | Optional | Library to access Amazon S3 | |
| | Optional | Libraries needed for the read_html() function to work | |
| | Optional | Library for parsing HTML | |
| | Optional | | |
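Before installing pandas, it can be handy to check which of the three required packages are already present. The following is a minimal sketch, assuming Python 3.4 or later; the helper name missing_packages is hypothetical and not part of pandas. Note that python-dateutil is imported under the name dateutil.

```python
import importlib.util

# Display name -> import name for the three required packages.
# python-dateutil is imported as "dateutil".
REQUIRED = {"NumPy": "numpy", "python-dateutil": "dateutil", "pytz": "pytz"}


def missing_packages(packages):
    """Return the display names whose import name cannot be found.

    Hypothetical helper for illustration only.
    """
    return sorted(
        name
        for name, import_name in packages.items()
        if importlib.util.find_spec(import_name) is None
    )


missing = missing_packages(REQUIRED)
print("Missing required packages:", missing or "none")
```

The check only looks the modules up; it does not import them, so it is safe to run on a partially configured system.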
Installing pandas is fairly straightforward for popular flavors of Linux. First, make sure that the Python development files are installed. If not, then install them as shown here.
For the Red Hat environment, where the package is named python-devel, run the following command:
yum install python-devel
Now, I will show you how to install pandas.
To install pandas in the Ubuntu/Debian environment, run the following command:
sudo apt-get install python-pandas
On openSUSE, install python-pandas via YaST Software Management or use the following command:
sudo zypper install python-pandas
Sometimes, additional dependencies may be needed for the preceding installation, particularly in the case of Fedora. In this case, you can try installing the additional dependencies:
sudo yum install gcc-gfortran gcc44-gfortran libgfortran lapack blas python-devel
sudo python-pip install numpy
There are a variety of ways to install pandas on Mac OS X. They are explained in the following sections.
pandas has a few dependencies: some are required, and the others are optional, although needed for certain desirable features to work properly. The following installs the required tools and then builds pandas from source. First, install the easy_install program:
wget http://python-distribute.org/distribute_setup.py
sudo python distribute_setup.py
Next, install Cython:
sudo easy_install -U Cython
Finally, clone the pandas source and install it:
git clone git://github.com/pydata/pandas.git
cd pandas
sudo python setup.py install
The following methods describe the installation in the Windows environment.
Make sure that numpy, python-dateutil, and pytz are installed first. The following commands need to be run for each of these modules:
C:\Python27\Scripts\pip install python-dateutil
C:\Python27\Scripts\pip install pytz
To install from the binary download, get the binary for your version of Windows from https://pypi.python.org/pypi/pandas. For example, if your processor is an AMD64, you can download the following wheel file and install it by using pip:
pandas-0.16.1-cp26-none-win_amd64.whl
pip install pandas-0.16.1-cp26-none-win_amd64.whl
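The name of a wheel file encodes the Python version and platform it was built for, which helps when picking the right download. The following is a small sketch that splits a wheel filename into its components, assuming the naming layout defined by PEP 427 (name, version, optional build tag, Python tag, ABI tag, platform tag). The function name parse_wheel_filename is hypothetical.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components.

    Hypothetical helper for illustration only.
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        # No build tag present.
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # Optional build tag sits between version and Python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("pandas-0.16.1-cp26-none-win_amd64.whl")
print(info["python"])    # cp26, that is, CPython 2.6
print(info["platform"])  # win_amd64, that is, 64-bit Windows
```

A cp26 wheel targets CPython 2.6, so an installation under C:\Python27 would need the file carrying the cp27 tag instead.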
To test the install, run Python and type the following on the command prompt:
import pandas
If the import returns with no errors, then the installation was successful.
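The same check can be written as a short script that also reports which version was picked up. This is a minimal sketch; the helper name installed_version is hypothetical, and it relies only on the __version__ attribute that pandas exposes.

```python
import importlib


def installed_version(module_name="pandas"):
    """Return the module's __version__, or None if it cannot be imported.

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


version = installed_version()
if version is None:
    print("pandas is not importable; check the installation")
else:
    print("pandas version:", version)
```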
The steps here explain the installation completely:
1. Install the MinGW compiler by following the instructions in the documentation titled Appendix: Installing MinGW on Windows at http://docs.cython.org/src/tutorial/appendix.html.
2. Make sure that the MinGW binary location is added to the PATH variable, that is, that PATH has C:\MingW\bin appended to it.
3. Install Cython and Numpy. Numpy can be downloaded and installed from http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy. Cython can be downloaded and installed from http://www.lfd.uci.edu/~gohlke/pythonlibs/#cython. The steps to install Cython via pip are as follows:
C:\Python27\Scripts\pip install Cython
4. Change to the directory containing the pandas source, and use C:\python27\python to run setup.py install.
Sometimes, you may obtain the following error when running setup.py:
distutils.errors.DistutilsError: Setup script exited with error: Unable to find vcvarsall.bat
This may have to do with not properly specifying mingw as the compiler. Check that you have followed all the steps again.
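One quick thing to rule out is a PATH problem. The following sketch, assuming Python 3.3 or later, looks for the gcc executable that MinGW provides; the helper name find_compiler is hypothetical.

```python
import shutil


def find_compiler(executable="gcc"):
    """Return the full path of the executable if it is on PATH, else None.

    Hypothetical helper for illustration only.
    """
    return shutil.which(executable)


path = find_compiler()
if path is None:
    print("gcc not found on PATH; check that the MinGW bin directory was added")
else:
    print("gcc found at:", path)
```

If gcc is found but the error persists, the compiler selection itself is the more likely cause.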
Interactive Python (IPython) is a tool that is very useful for using Python for data analysis, and a brief description of the installation steps is provided here. IPython provides an interactive environment that is much more useful than the standard Python prompt. Its features include the following:
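- Help functionality that uses object_name? to print details about objects.
- The ability to run a Python script by using the %run magic command.
- History functionality via the _, __, and ___ variables, the %history and other magic functions, and the up and down arrow keys.
For more information, see the documentation at http://bit.ly/1Is4zIW.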
IPython Notebook is the web-enabled version of IPython. It enables the user to combine code, numerical computation, graphics, and rich media in a single document, the notebook. Notebooks can be shared with colleagues and converted to the HTML/PDF formats. For more information, refer to the documentation titled The IPython Notebook at http://ipython.org/notebook.html. Here is an illustration:
The preceding image of PYMC Pandas Example is taken from http://healthyalgorithms.files.wordpress.com/2012/01/pymc-pandas-example.png.