Installing Python

Like R, Python has gained popularity thanks to its versatile and diverse range of packages. Python ships with most modern Linux-based operating systems. For our exercises, we will use Anaconda from Continuum Analytics®, which enhances the base open source Python offering with many data-mining- and machine-learning-related packages that are installed natively as part of the platform. This saves the practitioner from having to download and install those packages manually. In that sense, it is similar in spirit to Microsoft R Open: just as Microsoft R Open extends base open source R with additional functionality, Anaconda extends base open source Python with new capabilities.

The steps for installing Anaconda Python are as follows:

  1. Go to the Python Anaconda homepage:
Python Anaconda Homepage
  2. Download the distribution that is appropriate for your system. Note that we'll be downloading Python v2.7 (and not the 3.x version):
Selecting the Python Anaconda Installer
  3. Once the installation is complete, open a Terminal window (or the Command Prompt on Windows) and type python, which starts the Anaconda Python interpreter:
Launching Python Anaconda in the console
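
To confirm from inside the interpreter that the Anaconda build is the one being launched, a short check along the following lines can help. This is only a minimal sketch: the helper names `describe_interpreter` and `missing_packages` are hypothetical, the assumption that the startup banner mentions "Anaconda" or "Continuum" holds for many builds but not necessarily all, and the package names passed in at the bottom are illustrative guesses rather than a list taken from this chapter.

```python
from __future__ import print_function  # same script runs on Python 2.7 and 3.x

import os
import sys


def describe_interpreter(version=sys.version,
                         version_info=sys.version_info,
                         prefix=sys.prefix):
    """Summarize which Python is running (hypothetical helper)."""
    return {
        "major_minor": "{0}.{1}".format(version_info[0], version_info[1]),
        # Assumption: the build banner mentions Anaconda or Continuum.
        "banner_mentions_anaconda": ("Anaconda" in version
                                     or "Continuum" in version),
        # conda-managed installs keep package metadata in <prefix>/conda-meta.
        "has_conda_meta": os.path.isdir(os.path.join(prefix, "conda-meta")),
    }


def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    info = describe_interpreter()
    print("Python version:", info["major_minor"])
    print("Banner mentions Anaconda:", info["banner_mentions_anaconda"])
    print("conda-meta directory found:", info["has_conda_meta"])
    # Illustrative package names only; adjust to the packages you expect.
    print("Not importable:", missing_packages(["numpy", "pandas", "sklearn"]))
```

If the reported version is not 2.7, or neither Anaconda indicator is true, the python command on your PATH is probably resolving to a different installation, such as the operating system's own Python.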

This concludes the process of installing Hadoop (CDH), Spark, R, and Python. In later chapters, we will investigate these platforms in further detail.
