Let's now set up Hive so that we can see it in action.
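Download the latest stable version of Hive, move it to the location where you want it installed, and uncompress it there:
$ mv hive-0.8.1.tar.gz /usr/local
$ cd /usr/local
$ tar -xzf hive-0.8.1.tar.gz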
Set the HIVE_HOME variable to the installation directory:
$ export HIVE_HOME=/usr/local/hive
Add the Hive bin directory to the path variable:
$ export PATH=${HIVE_HOME}/bin:${PATH}
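Create the directories that Hive requires on HDFS:
$ hadoop fs -mkdir /tmp
$ hadoop fs -mkdir /user/hive/warehouse
Make both of these directories group writeable:
$ hadoop fs -chmod g+w /tmp
$ hadoop fs -chmod g+w /user/hive/warehouse
Start Hive:
$ hive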
You will receive the following response:
Logging initialized using configuration in jar:file:/opt/hive-0.8.1/lib/hive-common-0.8.1.jar!/hive-log4j.properties
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201203031500_480385673.txt
hive>
Exit the Hive interactive shell:
hive> quit;
After downloading the latest stable Hive release, we copied it to the desired location and uncompressed the archive file. This created a directory named hive-<version>.
Just as we previously defined HADOOP_HOME and added the bin directory within that installation to the path variable, we then did the same with HIVE_HOME and its bin directory.
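The two export commands can be combined into a small snippet that is safe to run more than once, for example from a shell startup file. This is only a sketch: the default of /usr/local/hive is taken from the steps above, and prepend_path is a hypothetical helper, not something Hive provides.

```shell
#!/bin/sh
# Assumed install location (from the steps above); an existing HIVE_HOME wins.
HIVE_HOME="${HIVE_HOME:-/usr/local/hive}"
export HIVE_HOME

# prepend_path (hypothetical helper): add a directory to the front of PATH
# only if it is not already listed, so repeated runs do not duplicate it.
prepend_path() {
    case ":${PATH}:" in
        *":$1:"*) ;;                  # already present, nothing to do
        *) PATH="$1:${PATH}" ;;
    esac
}

prepend_path "${HIVE_HOME}/bin"
export PATH

echo "HIVE_HOME=${HIVE_HOME}"
```

Checking before prepending keeps the path variable from growing each time the snippet is sourced.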
We then created two directories on HDFS that Hive requires and changed their attributes to make them group writeable. The /tmp directory is where Hive will, by default, write transient data created during query execution; it will also place output data in this location. The /user/hive/warehouse directory is where Hive will store the data that is written into its tables.
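The HDFS preparation described above can be sketched as a single script. The function name setup_hive_dirs is hypothetical, and the script assumes the hadoop command is on the path. It uses only the -mkdir, -chmod, and -ls subcommands of hadoop fs. On newer Hadoop releases, -mkdir may need the -p flag to create missing parent directories.

```shell
#!/bin/sh
# setup_hive_dirs (hypothetical helper): create the two HDFS directories
# Hive needs and make each of them group writeable.
setup_hive_dirs() {
    for dir in /tmp /user/hive/warehouse; do
        # -mkdir reports an error if the directory already exists; carry on.
        hadoop fs -mkdir "$dir" || echo "mkdir failed for $dir (it may already exist)" >&2
        hadoop fs -chmod g+w "$dir" || return 1
    done
}

if command -v hadoop >/dev/null 2>&1; then
    setup_hive_dirs
    # List the result so the permissions can be checked by eye.
    hadoop fs -ls /user/hive
else
    echo "hadoop not found on PATH; skipping HDFS setup" >&2
fi
```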
After all this setup, we ran the hive command; a successful installation gives output similar to that shown above. Running the hive command with no arguments enters an interactive shell; the hive> prompt is analogous to the sql> or mysql> prompts familiar from relational database interactive tools.
We then exited the interactive shell by typing quit; (note the trailing semicolon). HiveQL is, as mentioned, very similar to SQL and follows the convention that all commands must be terminated by a semicolon. Pressing Enter without a semicolon allows a command to be continued on subsequent lines.
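To illustrate the semicolon rule outside the interactive shell, the sketch below passes a statement that spans two lines to Hive through its -e option, which executes a query string and exits. The run_hive_query function and the QUERY variable are hypothetical names chosen for this example, and SHOW TABLES is used only because it needs no existing data.

```shell
#!/bin/sh
# A single statement split across two lines; the semicolon marks its end.
QUERY='SHOW
TABLES;'

# run_hive_query (hypothetical helper): run one query string non-interactively.
run_hive_query() {
    hive -e "$1"
}

if command -v hive >/dev/null 2>&1; then
    run_hive_query "$QUERY"
else
    echo "hive not found on PATH" >&2
fi
```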