Python is one of the most important programming languages used in data science. In this chapter, you’ll learn how to install Python and review some of the integrated development environments (IDEs) used for data analysis. You’ll also learn how to set up a working directory on your computer.
Installing Python
On the python.org website ( http://python.org/ ), click Downloads, then select the version appropriate for your operating system and follow the on-screen instructions to install Python.
Editors and IDEs
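Once the installation finishes, it can be useful to confirm which version of Python is actually running. The following is a minimal sketch using only the standard library; the variable names are illustrative and not part of any installer.

```python
import platform
import sys

# sys.version_info holds the version of the interpreter running this code.
version = sys.version_info

# platform.python_version() returns the same information as a string,
# in the form "major.minor.patch".
version_text = platform.python_version()
print("Running Python", version_text)

# A simple flag that later code could use to branch on the major version.
is_python3 = version.major == 3
print("Python3 interpreter:", is_python3)
```

If the number printed is not the version you just installed, another Python installation is probably being picked up first on your system.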
Writing code this way may prove to be somewhat cumbersome, so we use text editors or IDEs to facilitate the process.
There are many editors, both free and paid, that differ in their completeness, scalability, and ease of use. Some are simple and some are more advanced. The most used editors include Sublime Text, TextWrangler ( http://www.barebones.com/ ), Notepad++ ( http://notepad-plus-plus.org/download/v7.3.1.html ) (for Windows), and TextMate ( http://macromates.com/ ) (for Mac).
As for Python-specific IDEs, Wingware ( http://wingware.com/ ), Komodo ( http://www.activestate.com/komodo-ide ), PyCharm, and Emacs ( http://www.gnu.org/software/emacs/ ) are popular, but there are plenty of others. They provide tools that simplify work, such as auto-completion, auto-editing and auto-indentation, integrated documentation, syntax highlighting, and code folding (the ability to hide some pieces of code while you work on others), and they support debugging.
Spyder (which is included in Anaconda ( http://www.continuum.io/downloads )) and Jupyter ( http://jupyter.readthedocs.io/en/latest/ ), both of which you can download from www.anaconda.com, are the IDEs used most in data science, along with Canopy. A useful tool for Jupyter is nbviewer, which allows the exchange of Jupyter’s .ipynb files and is available at http://nbviewer.jupyter.org . nbviewer can also be linked to GitHub.
Differences between Python2 and Python3
Python exists in two major versions: Python2 and Python3. Python2 was first released in 2000 (the latest release is 2.7), and its support is expected to continue until 2020. It is the historical and most complete version.
Python3 was released in 2008 (the current version is 3.6). Many libraries are available for Python3, but not all Python2 libraries have been ported to it.
Mathematical Operations in Python 2.7
Mathematical Operations in Python 3.5.2
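One well-known difference in mathematical operations between the two versions is division. The sketch below is meant to be run under Python3 and is only an illustration: the numbers are arbitrary, and the Python 2.7 behavior is described in the comments rather than executed.

```python
# Run this under Python3.

# In Python3, / always performs true division, even on two integers.
# In Python 2.7, the same expression 7 / 2 would give 3, because
# dividing two integers there discards the fractional part.
true_division = 7 / 2

# The // operator performs floor division in both versions.
floor_division = 7 // 2

# print is a function in Python3; in Python 2.7 it is a statement,
# usually written without parentheses.
print(true_division)
print(floor_division)
```

To get true division in Python 2.7, at least one operand must be a float, for example 7 / 2.0.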
For a closer look at the differences between the two versions of Python, access this online resource ( http://sebastianraschka.com/Articles/2014_python_2_3_key_diff.html ).
Why choose one version of Python over the other? Python2 is the best-defined and most stable version, whereas Python3 represents the future of the language, although the two versions may not always coincide. In the first part of this book, I highlight the differences between the two versions. However, beginning with Chapter 7 and moving to the end of the book, we will use Python3.
Let’s start by setting up a work directory. This directory will house our files.
Work Directory
When we import a file by name, Python checks whether there is a file with that name inside the work directory and imports it. The same thing happens when we save a file from Python: it is automatically put in that folder. And when we run a Python script, as we will see, we have to move to the folder where the script is located (the work directory or another one) directly from the terminal.
Now let’s make sure that you understand the difference between using the terminal and starting a session in our favorite programming language.
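As a minimal sketch of how the work directory can be inspected and changed from inside Python, the standard os module offers os.getcwd() and os.chdir(). The temporary folder and the file name notes.txt below are assumptions made for illustration; on your machine you would use the path of your own folder.

```python
import os
import tempfile

# Remember where the session started so we can go back afterwards.
original_dir = os.getcwd()

# A temporary folder stands in for a real work directory (illustrative only).
with tempfile.TemporaryDirectory() as work_dir:
    os.chdir(work_dir)             # set the work directory
    current_dir = os.getcwd()      # confirm where we are now

    # A file saved with a bare name lands in the work directory.
    with open("notes.txt", "w") as handle:
        handle.write("saved in the work directory\n")
    saved_files = os.listdir(current_dir)

    # Leave the folder before it is removed at the end of the with block.
    os.chdir(original_dir)

print(saved_files)
```

Because the file was opened with only its name and no path, Python placed it in whatever folder was the work directory at that moment.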
Using a Terminal
test.py is the name of the script I am going to run.
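From the terminal, a script is typically run by moving to its folder and typing the interpreter name followed by the file name, for example python test.py. The sketch below reproduces that step from inside Python so that it is self-contained: it writes a throwaway test.py (its contents are hypothetical) into a temporary folder and runs it as a separate process. The helper name run_script is an assumption made for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(script_path):
    """Run a Python script as a separate process and return what it printed."""
    result = subprocess.run(
        [sys.executable, str(script_path)],  # like typing: python test.py
        capture_output=True,
        text=True,
        check=True,  # raise an error if the script fails
    )
    return result.stdout


# Create a throwaway test.py in a temporary folder and run it.
with tempfile.TemporaryDirectory() as folder:
    script = Path(folder) / "test.py"
    script.write_text('print("Hello from test.py")\n')
    output = run_script(script)

print(output)
```

Using sys.executable runs the script with the same interpreter as the current session, which avoids surprises when several Python versions are installed.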
Summary
In this chapter, we learned how to install Python and reviewed some of the IDEs we can use for data analysis. We also examined the differences between Python2 and Python3, and learned how to set up a work directory and run a script from the terminal.