Time for action – importing Hive data into MySQL

Regardless of these limitations, let's demonstrate that, in the right situations, we can use Sqoop to directly export data stored in Hive.

  1. Remove any existing data in the employee table:
    $ echo "truncate employees" | mysql -u hadoopuser -p hadooptest
    

    You will receive the following response:

    Query OK, 0 rows affected (0.01 sec)
    
  2. Check the contents of the Hive warehouse for the employee table:
    $ hadoop fs -ls /user/hive/warehouse/employees
    

    You will receive the following response:

    Found 1 items
    … /user/hive/warehouse/employees/part-m-00000
    
  3. Perform the Sqoop export:
    $ sqoop export --connect jdbc:mysql://10.0.0.100/hadooptest
    --username hadoopuser -P --table employees
    --export-dir /user/hive/warehouse/employees
    --input-fields-terminated-by '\001'
    --input-lines-terminated-by '\n'
    

What just happened?

Firstly, we truncated the employees table in MySQL to remove any existing data and then confirmed the employee table data was where we expected it to be.

Note

Note that Sqoop may also create an empty file in this directory with the suffix _SUCCESS; if this is present it should be deleted before running the Sqoop export.
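If the marker file is present, it can be removed with a single filesystem command; the path below assumes the warehouse directory listed in step 2:

    $ hadoop fs -rm /user/hive/warehouse/employees/_SUCCESS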

The Sqoop export command is much as before; the only changes are the different source location for the data and the addition of explicit field and line terminators. Recall that Hive, by default, uses ASCII code 001 (Ctrl-A) and the newline character as its field and line terminators, respectively (also recall, though, that we have previously imported files into Hive with other separators, so this is something that always needs to be checked).
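One quick way to verify which separators are actually in use is to dump the first line of the warehouse file and inspect the raw bytes; this sketch assumes the single part file listed in step 2, and od -c will show the Ctrl-A separators as octal 001:

    $ hadoop fs -cat /user/hive/warehouse/employees/part-m-00000 | head -n 1 | od -c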

We execute the Sqoop command and watch it fail due to Java IllegalArgumentExceptions when trying to create instances of java.sql.Date.

We are now hitting the reverse of the problem we encountered earlier; the original column in the source MySQL table had a datatype (DATE) not supported by Hive, and we converted the data to match the available TIMESTAMP type. When exporting the data back again, however, we are trying to create a DATE from a TIMESTAMP value, which is not possible without some conversion.
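One possible workaround, not pursued here, is to export into a staging table whose date column is a plain string, then convert while copying the rows into the real table. The sketch below is illustrative only: the staging table name and the column names (first_name, dept, salary, start_date) are hypothetical and would need to match the actual schema:

    # Hypothetical staging table; column names are assumptions
    $ echo "CREATE TABLE employees_staging LIKE employees;
    ALTER TABLE employees_staging MODIFY start_date VARCHAR(30);" | mysql -u hadoopuser -p hadooptest

    # Run the Sqoop export as before but with --table employees_staging, then:
    $ echo "INSERT INTO employees
    SELECT first_name, dept, salary, DATE(start_date) FROM employees_staging;" | mysql -u hadoopuser -p hadooptest

Using a string column in the staging table sidesteps any type parsing during the export itself and confines the conversion to a single SQL statement.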

The lesson here is that our earlier approach of doing a one-way conversion only worked for as long as we only had data flowing in one direction. As soon as we need bi-directional data transfer, mismatched types between Hive and the relational store add complexity and require the insertion of conversion routines.
