Chapter 1. Setting GNU R for Predictive Analytics

R is a relatively recent, multi-purpose statistical language that originates from the older language S. R ships with a core set of packages that includes some of the most common statistical tests and some data mining algorithms. One of its most important strengths is the degree to which its functionality can be extended by installing packages written by members of the user community. These packages can be installed directly from within R, which makes the process very convenient. The Comprehensive R Archive Network (CRAN), available at http://cran.r-project.org, is a repository of packages, R sources, and R binaries (installers). It also hosts the manuals for the packages. More than 4,500 packages are currently available for R, and new ones are added regularly. Best of all, everything is free.
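
As a first taste of what installing a package from within R looks like, here is a minimal sketch. The helper name ensure_package and the default repository URL are our own assumptions for illustration; only requireNamespace(), install.packages(), library(), and search() are standard R functions. The helper downloads a package from CRAN only if it is not already available, then loads it.

```r
# Hypothetical helper (an illustration, not part of R itself): install a
# package from CRAN only if it is missing, then attach it to the session.
ensure_package <- function(pkg, repos = "https://cran.r-project.org") {
  # requireNamespace() returns FALSE when the package cannot be found
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # character.only = TRUE lets us pass the package name as a string
  library(pkg, character.only = TRUE)
  invisible(pkg)
}

# "stats" ships with every R installation, so this call needs no download.
ensure_package("stats")

# Check that the package is now attached to the search path.
"package:stats" %in% search()
```

For a package that is not yet installed, the same call would trigger a download from the chosen repository, so it needs an Internet connection.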

The topics covered in this chapter are:

  • Installation of R
  • The R graphical user interface, including a description of the different menus
  • Definition of packages and how to install and load them
  • Along the way we will also discover parts of the syntax of R

Among almost 50 competitors, R is, together with RapidMiner, the most widely used tool for predictive modeling, according to the yearly software polls from KDnuggets (most recently available at http://www.kdnuggets.com/2015/05/poll-r-rapidminer-python-big-data-spark.html). Its broad use and the extent to which it can be extended make it an essential software package for data scientists. Competitors notably include Python, Weka, and KNIME.

This book is intended for readers who are familiar with R. That does not mean that readers without such a background cannot learn predictive analytics from this book; it just means that they will need more time to use it effectively, and might need to consult the basic R documentation along the way. With this extended readership in mind, we cover just a few of the basics in this chapter while we set up R for predictive analytics, and we keep the writing style as accessible as possible. If you have trouble following this first chapter, we suggest that you read a book on R basics before continuing, because the effort needed to understand and practice the content of this book will keep increasing from Chapter 2, Visualizing and Manipulating Data Using R. Unlike the other chapters, this chapter explains basic information. Readers who are more familiar with R are invited to skip to Chapter 2, Visualizing and Manipulating Data Using R, or Chapter 3, Data Visualization with Lattice.

Installing GNU R

If you have not already done so, download the installer for your operating system from CRAN. Launch the installer and follow the on-screen instructions for your operating system; they are straightforward, so we will not examine them here. The following pages offer a quick reminder of, or a basic introduction to, the R interface. Here are the addresses where you can find the installers for each OS:

These links also serve as pointers to R under Mac OS X and Linux, which are not fully described here.
