Installing Shark

At the time of writing, the latest version of Shark is v0.7.0, which requires Spark 0.7.2 and a recent JVM (OpenJDK 7 or Oracle HotSpot JDK 7). Shark is available pre-built for both Hadoop 1 and Hadoop 2; the respective files are http://spark-project.org/download/shark-0.7.0-hadoop1-bin.tgz and http://spark-project.org/download/shark-0.7.0-hadoop2-bin.tgz.

Once you have downloaded and extracted Shark, it's time to configure it. In this example, we will assume that you extracted it in /home/spark/. Shark has its own configuration, separate from Spark's, which lives at shark-0.7.0/conf/shark-env.sh. For local mode, you need to set at least HIVE_HOME and SPARK_HOME, like so:

export HIVE_HOME=/home/spark/hive-0.9.0-bin
export SPARK_HOME=/home/spark/spark-0.7.2
source $SPARK_HOME/conf/spark-env.sh
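
Before moving on, it can help to confirm that both variables point at real directories. The following is a minimal sketch, not part of Shark itself; check_dir_var is a hypothetical helper name introduced here for illustration.

```shell
# check_dir_var NAME: succeed only if the variable called NAME is set
# and its value is an existing directory.
# (Hypothetical helper for illustration; not part of Shark or Spark.)
check_dir_var() {
    local name="$1"
    local value
    # Look up the variable whose name is stored in $name (portable indirection).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set" >&2
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name=$value is not a directory" >&2
        return 1
    fi
    echo "$name=$value looks OK"
}

# Report on both variables; a failure prints a message but does not abort.
for var in HIVE_HOME SPARK_HOME; do
    check_dir_var "$var" || true
done
```

Running this after sourcing shark-env.sh should surface a mistyped path before Shark itself fails on it.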

In local mode, you also need to create a place for Hive to store its files, which by default is /user/hive/warehouse. Creating a directory under / typically requires root privileges, so use chown afterwards to make the directory accessible to your user, like so:

mkdir -p /user/hive/warehouse && chown [your-spark-user] /user/hive/warehouse
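
If you would rather script this step, the sketch below wraps the same two commands in a function and then checks that the directory is writable. The prepare_warehouse name and its arguments are assumptions made for this example; pass whichever directory and user fit your setup.

```shell
# prepare_warehouse DIR OWNER: create DIR (and any parents), hand it to OWNER,
# then verify that the user running this function can write to it.
# (Hypothetical helper; DIR would be /user/hive/warehouse in the default layout.)
prepare_warehouse() {
    local dir="$1"
    local owner="$2"
    mkdir -p "$dir" || return 1
    chown "$owner" "$dir" || return 1
    if [ ! -w "$dir" ]; then
        echo "$dir is not writable by the current user" >&2
        return 1
    fi
    echo "warehouse ready at $dir"
}
```

Note that the final check applies to whoever runs the function, so when run as root it confirms little more than that the directory exists.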

If you are using Shark with a Spark cluster, you also need to set the MASTER and HADOOP_HOME variables. If you are using Shark with an existing Hive installation, you must set HIVE_CONF_DIR to the directory containing the Hive XML configuration files. If you add these after the source... line shown earlier, you can reference variables defined in the Spark configuration, such as SPARK_MASTER_IP:

export HADOOP_HOME=/path/to/hadoop
export MASTER=spark://$SPARK_MASTER_IP:7077
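
One pitfall with the MASTER line is that an unset SPARK_MASTER_IP silently produces spark://:7077. As a hedged sketch, a small helper can refuse to build the URL in that case; spark_master_url is a hypothetical name, and 7077 is simply the port used in the example above.

```shell
# spark_master_url HOST [PORT]: print a spark:// URL, or fail if HOST is empty.
# (Hypothetical helper for illustration; the port defaults to 7077 as in the text.)
spark_master_url() {
    local host="${1:-}"
    local port="${2:-7077}"
    if [ -z "$host" ]; then
        echo "master host is empty; is SPARK_MASTER_IP set in spark-env.sh?" >&2
        return 1
    fi
    printf 'spark://%s:%s\n' "$host" "$port"
}

# Possible use inside shark-env.sh (left commented out in this sketch):
# MASTER=$(spark_master_url "$SPARK_MASTER_IP") && export MASTER
```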

Once you have Shark installed and set up, you also need to copy Shark and its custom Hive to all the worker nodes; do this with:

pscp -v -r -h ./spark-0.7.2/conf/slaves -l sparkuser ./shark-0.7.0 ~/
pscp -v -r -h ./spark-0.7.2/conf/slaves -l sparkuser ./hive-0.9.0-bin ~/
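
If pscp (from the pssh package) is not available, a plain loop over the slaves file is one alternative. The sketch below only prints the scp commands it would run, so you can review them first; print_copy_commands is a hypothetical helper, and it assumes the slaves file lists one hostname per line with # marking comments.

```shell
# print_copy_commands SLAVES_FILE USER DIR...: for every host listed in
# SLAVES_FILE, print one scp command per DIR. Nothing is actually copied.
# (Hypothetical helper; blank lines and lines starting with # are skipped.)
print_copy_commands() {
    local slaves_file="$1"
    local user="$2"
    shift 2
    local host dir
    # The extra [ -n "$host" ] test keeps a final line that lacks a newline.
    while IFS= read -r host || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;
        esac
        for dir in "$@"; do
            echo "scp -r $dir $user@$host:~/"
        done
    done < "$slaves_file"
}
```

For example, print_copy_commands ./spark-0.7.2/conf/slaves sparkuser ./shark-0.7.0 ./hive-0.9.0-bin lists two commands per worker. Unlike pscp, this approach copies to one host at a time, and it does not handle paths containing spaces.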

If you are doing an EC2-based setup, just use the latest AMI; it should already be set up for Shark.
