Let's download, install, and configure Sqoop.
$ mv sqoop-1.4.1-incubating__hadoop-1.0.0.tar.gz /usr/local
$ cd /usr/local
$ tar -xzf sqoop-1.4.1-incubating__hadoop-1.0.0.tar.gz
$ ln -s sqoop-1.4.1-incubating__hadoop-1.0.0 sqoop
$ export SQOOP_HOME=/usr/local/sqoop
$ export PATH=${SQOOP_HOME}/bin:${PATH}
Copy the MySQL Connector JAR into the Sqoop lib directory:
$ cp mysql-connector-java-5.0.8-bin.jar /usr/local/sqoop/lib
$ sqoop help
You will see the following output:
usage: sqoop COMMAND [ARGS]
Available commands:
  codegen    Generate code to interact with database records
  …
  version    Display version information
See 'sqoop help COMMAND' for information on a specific command.
Sqoop is a pretty straightforward tool to install. After downloading the required version from the Sqoop homepage—being careful to pick the one that matches our Hadoop version—we copied and unpacked the file.
Once again, we needed to set an environment variable and add the Sqoop bin directory to our path. We can either set these directly in our shell or, as before, add these steps to a configuration file we can source prior to a development session.
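As a minimal sketch of the configuration-file approach, the two settings could live in a small script that is sourced before each session. The file name sqoop-env.sh is hypothetical, and the guard that avoids adding the same PATH entry twice is an addition for convenience, not part of the original steps.

```shell
# sqoop-env.sh (hypothetical file name); source it with: . ./sqoop-env.sh
# Assumes Sqoop was unpacked and symlinked at /usr/local/sqoop as in the steps above.
export SQOOP_HOME=/usr/local/sqoop

# Prepend the Sqoop bin directory only if it is not already on the PATH,
# so sourcing the file repeatedly does not keep growing the variable.
case ":${PATH}:" in
  *":${SQOOP_HOME}/bin:"*) ;;
  *) export PATH="${SQOOP_HOME}/bin:${PATH}" ;;
esac
```

Sourcing the file rather than executing it matters here, because exported variables set in a child process do not propagate back to the calling shell.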
Sqoop needs access to the JDBC driver for your database; in our case, we downloaded the MySQL Connector and copied it into the Sqoop lib directory. For the most popular databases, this is as much configuration as Sqoop requires; if you want to use something exotic, consult the Sqoop documentation.
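A quick way to confirm the driver is in place is to look for the connector JAR in the lib directory. The helper below is a hypothetical sketch, not a Sqoop feature; it assumes the JAR follows the mysql-connector-java-*.jar naming used by the file we copied.

```shell
# Hypothetical helper: succeed if a MySQL Connector JAR exists in the given directory.
has_mysql_driver() {
  lib_dir="$1"
  for jar in "$lib_dir"/mysql-connector-java-*.jar; do
    # If the glob matched nothing, $jar is the literal pattern and -f fails.
    [ -f "$jar" ] && return 0
  done
  return 1
}

# Example use: warn if the driver is missing from the Sqoop lib directory.
has_mysql_driver "${SQOOP_HOME:-/usr/local/sqoop}/lib" \
  || echo "MySQL Connector JAR not found in Sqoop lib directory" >&2
```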
After this minimal install, we executed the sqoop command-line utility to validate that it is working properly.
We were very specific earlier about the version of Sqoop to retrieve, much more so than for previous software downloads. Sqoop versions prior to 1.4.1 depend on an additional method in one of the core Hadoop classes that was only available in the Cloudera Hadoop distribution or in Hadoop 0.21 and later.
Unfortunately, the fact that Hadoop 1.0 is effectively a continuation of the 0.20 branch meant that Sqoop 1.3, for example, would work with Hadoop 0.21 but not 0.20 or 1.0. To avoid this version confusion, we recommend using version 1.4.1 or later, which removes the dependency.
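The version recommendation can be expressed as a simple check. The function below is a hypothetical sketch that compares dotted version strings numerically; it does not query Sqoop itself, so the installed version (for example, as reported by the version command listed in the help output) has to be passed in by hand.

```shell
# Hypothetical helper: succeed if dotted version $1 is at least dotted version $2.
# Components are compared numerically; a suffix such as "-incubating" is ignored
# because awk uses only the leading digits when converting a string to a number.
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    n = split(have, h, "[.]")
    m = split(want, w, "[.]")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      a = h[i] + 0
      b = w[i] + 0
      if (a > b) exit 0
      if (a < b) exit 1
    }
    exit 0
  }'
}

# Example use with the version installed in this section.
if version_at_least "1.4.1-incubating" "1.4.1"; then
  echo "Sqoop version is 1.4.1 or later"
fi
```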
No additional MySQL configuration is required. If the server had not been configured to allow remote clients, as described earlier, we would discover this as soon as we used Sqoop.