Development and deployment

This section walks through the complete development and deployment process with an example: a simple function called toUpper, which converts a string to uppercase. The steps are as follows:

  1. Download and install a Java IDE, such as Eclipse or IntelliJ IDEA.
  2. Start the IDE and create a Java project.
  3. In Eclipse, right-click the project and choose Build Path | Configure Build Path | Add External Jars. In the window that opens, navigate to the directory containing the Hive and Hadoop libraries, then select and add the JAR files we need to import. Alternatively, we can resolve the library dependencies automatically with Maven (http://maven.apache.org/); the sample code for this book includes a pom.xml file so that it can be imported as a Maven project.
  4. In the IDE, create the following ToUpper.java file according to the UDF template mentioned previously:
      package hive.essentials.hiveudf;
      
      import org.apache.hadoop.hive.ql.exec.UDF;
      import org.apache.hadoop.io.Text;
      
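      // Declared public so that Hive can load and invoke the class.
      public class ToUpper extends UDF {
        // Hive calls evaluate() once per row; a NULL input yields NULL.
        public Text evaluate(Text input) {
          if (input == null) return null;
          return new Text(input.toString().toUpperCase());
        }
      }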
  5. Compile and build the project JAR file as hiveudf-1.0.jar.
  6. Upload the JAR file to HDFS with the hdfs dfs -put hiveudf-1.0.jar /app/hive/function/ command.
  7. Create the function. A temporary function is valid only in the current session. As of Hive v0.13.0, we can also create a permanent function, which is registered in the metastore and can be referenced in all queries and sessions:
      > CREATE TEMPORARY FUNCTION tmptoUpper
      > as 'hive.essentials.hiveudf.ToUpper'
      > USING JAR 'hdfs:///app/hive/function/hiveudf-1.0.jar';

      > CREATE FUNCTION toUpper -- Create permanent function
      > as 'hive.essentials.hiveudf.ToUpper'
      > USING JAR 'hdfs:///app/hive/function/hiveudf-1.0.jar';
  8. Verify and check the function:
      > SHOW FUNCTIONS ToUpper;
      > DESCRIBE FUNCTION ToUpper;
      > DESCRIBE FUNCTION EXTENDED ToUpper;
      +----------------------------------------------------+
      | tab_name                                           |
      +----------------------------------------------------+
      | toUpper(value) - Returns upper case of value.      |
      | Synonyms: default.toupper                          |
      | Example:                                           |
      | > SELECT toUpper('will');                          |
      | WILL                                               |
      | Function class:hive.essentials.hiveudf.ToUpper     |
      | Function type:PERSISTENT                           |
      | Resource:hdfs:///app/hive/function/hiveudf-1.0.jar |
      +----------------------------------------------------+
  9. Reload and use the function in HQL:
      > RELOAD FUNCTION; -- Reload all invisible functions if needed

      > SELECT
      > name, toUpper(name) as cap_name, tmptoUpper(name) as c_name
      > FROM employee;
      +---------+----------+----------+
      | name    | cap_name | c_name   |
      +---------+----------+----------+
      | Michael | MICHAEL  | MICHAEL  |
      | Will    | WILL     | WILL     |
      | Shelley | SHELLEY  | SHELLEY  |
      | Lucy    | LUCY     | LUCY     |
      +---------+----------+----------+
      4 rows selected (0.363 seconds)
  10. Drop the function when needed:
      > DROP TEMPORARY FUNCTION IF EXISTS tmptoUpper;
      > DROP FUNCTION IF EXISTS toUpper;
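
Before packaging and uploading the JAR, it can help to check the null handling and case conversion of evaluate locally. The following is only a minimal sketch under stated assumptions: ToUpperCheck is a hypothetical class that is not part of the book's sample code, and it uses plain String in place of org.apache.hadoop.io.Text so that it runs without the Hive or Hadoop libraries on the classpath.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the ToUpper UDF: same logic, but on plain String
// instead of org.apache.hadoop.io.Text, so no Hive or Hadoop JARs are needed.
class ToUpperCheck {

  // Mirrors ToUpper.evaluate(): a null input yields null, otherwise uppercase.
  static String toUpper(String input) {
    if (input == null) return null;
    return input.toUpperCase();
  }

  public static void main(String[] args) {
    // One ordinary value and one null, the two cases evaluate() distinguishes.
    List<String> names = Arrays.asList("Lucy", null);
    for (String name : names) {
      System.out.println(name + " -> " + toUpper(name));
    }
  }
}
```

With the Hive libraries available, the same two cases could be exercised against the real class by passing a Text value and a null to evaluate.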