Let's use a type mapping to improve our data import.
$ hadoop fs -rmr employees
$ sqoop import --connect jdbc:mysql://10.0.0.100/hadooptest --username hadoopuser -P --table employees --hive-import --hive-table employees --map-column-hive start_date=timestamp
You will receive the following response:
12/05/23 14:53:38 INFO hive.HiveImport: Hive import complete.
$ hive -e "describe employees"
You will receive the following response:
OK
first_name	string
dept	string
salary	int
start_date	timestamp
Time taken: 2.547 seconds
$ hive -e "select * from employees"
You will receive the following response:
OK
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
Time taken: 2.73 seconds
Our Sqoop command line here is similar to our original Hive import, except for the addition of the column mapping specification. We specified that the start_date column should be of type TIMESTAMP, and we could have added other mappings as well; the option takes a comma-separated list of them, as sketched below.
After confirming that Sqoop executed successfully, we examined the created Hive table and verified that the mapping was indeed applied and that the start_date column has type TIMESTAMP.
We then tried to retrieve the data from the table and could not do so, receiving an error about a type format mismatch.
On reflection, this should not be a surprise. Though we specified that the desired column type was TIMESTAMP, the actual data being imported from MySQL was of type DATE, which does not contain the time component required by a timestamp. This is an important lesson: ensuring that the type mappings are correct is only one part of the puzzle; we must also ensure the data is valid for the specified column type.