Working with Big Data in R
• For rhdfs package: rhdfs_1.0.8.tar.gz
• For rhbase package: rhbase_1.2.1.tar.gz
• For rmr2 package: rmr2_3.3.1.tar.gz
• For plyrmr package: plyrmr_0.6.0.tar.gz
• For ravro package: ravro_1.0.4.tar.gz
The files are stored in the Downloads folder (/home/<username>/Downloads). Before installing
each of the above packages, all the packages on which it depends need to be installed.
The following is a quick step-by-step guide on what to install and how.
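Before starting, it can help to confirm that all five archives are actually present in the Downloads folder. The following is a minimal shell sketch, not part of the original procedure; the function name missing_tarballs is hypothetical and the file names are the ones listed above.

```shell
# Print the name of every archive from the list above that is NOT
# present in the given directory. Prints nothing if all five are there.
# Usage: missing_tarballs DIR
missing_tarballs() {
    dir=$1
    for f in rhdfs_1.0.8.tar.gz rhbase_1.2.1.tar.gz rmr2_3.3.1.tar.gz \
             plyrmr_0.6.0.tar.gz ravro_1.0.4.tar.gz; do
        # -f is true only for an existing regular file
        [ -f "$dir/$f" ] || printf '%s\n' "$f"
    done
}
```

Running missing_tarballs "$HOME/Downloads" from the Ubuntu prompt lists any archive that still has to be downloaded.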
A. Let’s start with the rmr2 package. It depends on the caTools package, so the installation
sequence is as follows.
1. Install the caTools package from within the R console (or RStudio) using the following
command.
> install.packages("caTools")
If this produces an error, try the extended version of the command.
> install.packages("caTools", repos = "https://cran.rstudio.com",
dependencies = TRUE)
2. Then exit the R console, return to the Ubuntu prompt and run the installation for rmr2.
amit@amit-Lenovo-Z51-70:~$ sudo HADOOP_CMD=/usr/bin/hadoop R CMD INSTALL /home/amit/Downloads/rmr2_3.3.1.tar.gz
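Since the same form of command is used for each downloaded archive, it could be wrapped in a small shell function. This is only a sketch under stated assumptions: the function name install_tarball and the DRY_RUN switch are hypothetical, and /usr/bin/hadoop is simply the path used in the command above.

```shell
# Install one downloaded package archive with HADOOP_CMD set.
# HADOOP_CMD defaults to /usr/bin/hadoop (the path used in the text).
# DRY_RUN=1 is a hypothetical switch of this sketch: it prints the
# command instead of running it.
install_tarball() {
    tarball=$1
    hadoop=${HADOOP_CMD:-/usr/bin/hadoop}
    # Refuse to continue if the archive is not where we expect it
    if [ ! -f "$tarball" ]; then
        echo "archive not found: $tarball" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sudo HADOOP_CMD=$hadoop R CMD INSTALL $tarball"
    else
        sudo HADOOP_CMD="$hadoop" R CMD INSTALL "$tarball"
    fi
}
```

With DRY_RUN=1 set, calling install_tarball on an archive path shows the exact command that would be run, which is a cheap way to catch a mistyped path before invoking sudo.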
B. Next, let’s install the plyrmr package. Its dependencies are rmr2 (which is already
installed), R.methodsS3, Hmisc and rjson. The package Hmisc in turn depends on acepack,
which can be installed only if gfortran is present. Hence, we start by installing gfortran
using the following set of commands from the Ubuntu prompt.
$ sudo -i
$ apt-get update
$ apt-get install gfortran
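To confirm that the compiler is now available before moving on, a quick check from the Ubuntu prompt may be useful. This is a small sketch; the helper name have_tool is hypothetical.

```shell
# Succeed (exit status 0) if the named program can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}
```

For example, have_tool gfortran && gfortran --version prints the compiler version only when gfortran is installed.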
Next, install acepack using the following command from the R console (or RStudio).
> install.packages("acepack", repos = "https://cran.rstudio.com",
dependencies = TRUE)
Similarly, install the packages Hmisc and R.methodsS3.
> install.packages("Hmisc", repos = "https://cran.rstudio.com",
dependencies = TRUE)
> install.packages("R.methodsS3", repos = "https://cran.rstudio.com",
dependencies = TRUE)
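The R console installs above could also be driven from the Ubuntu prompt in one go. The sketch below only builds the R expression; the function name r_install_expr is hypothetical, and the repository URL is the one used in the commands above.

```shell
# Print a single R expression that installs the named CRAN packages
# from the repository used in the text.
r_install_expr() {
    list=""
    for p in "$@"; do
        # Build a comma-separated list of quoted names: "a", "b", ...
        list="${list:+$list, }\"$p\""
    done
    printf 'install.packages(c(%s), repos = "https://cran.rstudio.com", dependencies = TRUE)\n' "$list"
}
```

The expression can then be handed to R non-interactively, for example with Rscript -e "$(r_install_expr acepack Hmisc R.methodsS3)". Because acepack is named first and dependencies = TRUE is set, the order matches the steps above; rjson, which is listed among the plyrmr dependencies, could be passed as a further argument in the same way.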
Finally, we install plyrmr from the Ubuntu prompt.
$ sudo HADOOP_CMD=/usr/bin/hadoop R CMD INSTALL /home/amit/Downloads/plyrmr_0.6.0.tar.gz