Scripting in HBase

We have already seen the basics of scripting in HBase. In this chapter, we will look at some more scripting tips and tricks that enable an administrator to automate various tasks in HBase. Scripts can be written in Ruby, as shell scripts, or as a file containing a sequence of HBase shell commands.

Now, let's consider a case where we need to create a table with one column family and then insert some data into several columns of that family. The script for this is as follows:

Tip

Here, we use the vi editor. You can use any editor you prefer.

vi hbasescript.script

create 'table', 'data'
for i in '0'..'2' do
  for j in '0'..'2' do
    for k in '0'..'2' do
      put 'table', "row-#{i}#{j}#{k}", "data:column#{j}#{k}", "name#{j}#{k}"
    end
  end
end

After saving this script, we can run it as follows:

hbase shell hbasescript.script
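
To check what the nested loops generate before running them against a cluster, the same iteration can be sketched in plain Ruby. The row_keys helper below is hypothetical and only mirrors the loop structure of the script; it does not call HBase.

```ruby
# Hypothetical helper (not part of HBase): collects the row keys that the
# three nested loops of hbasescript.script would pass to put.
def row_keys
  keys = []
  for i in '0'..'2' do
    for j in '0'..'2' do
      for k in '0'..'2' do
        keys << "row-#{i}#{j}#{k}"
      end
    end
  end
  keys
end

keys = row_keys
puts keys.size   # number of put calls the script would issue
puts keys.first  # first generated row key
puts keys.last   # last generated row key
```

Each of the three ranges yields three values, so the loops issue 3 x 3 x 3 = 27 put calls, one per distinct row key.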

We can also run a similar loop interactively from HBase shell:

hbase > for i in '0'..'5' do
hbase >* put "utable", "rowKey_#{i}", "address:address", "address#{i}"
hbase >* end

The preceding commands will insert six rows (rowKey_0 through rowKey_5) into the utable table, which must already exist with an address column family.

The hbasescript.script file shown earlier creates a table and puts 27 rows of data into it (three values each for i, j, and k). Likewise, we can write scripts to load data into a table and perform various other operations, such as inserting data from a text or CSV file.
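
As one possible way to load rows from a text file, the sketch below turns simple comma-separated lines into put commands that could then be saved to a script file and run with hbase shell. The lines_to_puts helper, the two-field rowkey,value layout, and the table and column names are all assumptions made for illustration; fields are assumed to contain no commas or quotes.

```ruby
# Hypothetical helper: converts "rowkey,value" lines into HBase shell put
# commands. It only builds strings; it does not talk to HBase.
def lines_to_puts(text, table, column)
  text.each_line.map(&:strip).reject(&:empty?).map do |line|
    # Split on the first comma only: the row key comes first, the value second.
    row_key, value = line.split(',', 2)
    "put '#{table}', '#{row_key}', '#{column}', '#{value}'"
  end
end

# Illustrative input; in practice the text would come from File.read.
sample = "row1,alice\nrow2,bob\n"
puts lines_to_puts(sample, 'utable', 'address:address')
```

Generating the commands as text keeps the sketch independent of any cluster; real input with embedded quotes or commas would need proper escaping and a real CSV parser.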

We can run an HBase command to create an HBase table without going to the HBase shell, as follows:

echo "create 'tableToCreate', 'colFamily'" | hbase shell

Now, we will see a script to scan the table between two rows:

vi scanTable.sh

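#!/bin/bash
# Usage: ./scanTable.sh <table> <start row> <stop row>
TableToScan=$1
RowStart=$2
RowEnd=$3
# The here-document feeds the scan command to HBase shell
exec hbase shell <<EOF
scan "${TableToScan}", {STARTROW => "${RowStart}", STOPROW => "${RowEnd}"}
EOF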

This script is invoked as ./scanTable.sh emptable row100 row1000. It displays the rows of the emptable table from row100 (inclusive) up to row1000 (exclusive); both row keys are passed as parameters to the script. Note that HBase compares row keys lexicographically, not numerically.
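
Because HBase compares row keys as byte strings, a scan range is lexicographic rather than numeric, and the stop row is excluded. The plain-Ruby sketch below mimics that selection on a list of made-up keys; in_scan_range? is a hypothetical helper and the keys are illustrative only.

```ruby
# Hypothetical helper: true when key falls in [start_row, stop_row),
# using the same string ordering that HBase applies to row keys.
def in_scan_range?(key, start_row, stop_row)
  key >= start_row && key < stop_row
end

# Unpadded keys: row101, row500, and row999 sort after row1000.
keys = %w[row1 row100 row100-a row1000 row101 row500 row999]
selected = keys.select { |k| in_scan_range?(k, 'row100', 'row1000') }
puts selected.inspect

# Zero-padded keys sort in numeric order, so the range behaves as expected.
padded = %w[row0100 row0500 row0999 row1000]
puts padded.select { |k| in_scan_range?(k, 'row0100', 'row1000') }.inspect
```

With unpadded keys, only keys that begin with row100 and sort before row1000 are selected, so fixed-width, zero-padded row keys are a common choice when numeric ranges matter.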

The .irbrc file

As we know, HBase shell is built on the Ruby shell (IRB), which can be customized using the .irbrc file, for example, to add a command that clears the screen or to keep a command history. If this file does not already exist in the user's home directory, we can create it with the following content, which lets us use the clear command in HBase shell to clear the screen and maintains a command history for HBase shell:

  1. From the home directory, open the .irbrc file in an editor and add the following lines to it:
    vi .irbrc
    
    #Clear HBase shell command
    def clear
      system('clear')
    end
    
    # Full path of the Hadoop home directory. A constant is used because a
    # local variable would not be visible inside the ls method below.
    HADOOP_HOME = "<your hadoop home path here>"
    
    # Enable history (previously executed commands are preserved) in HBase shell
    require "irb/ext/save-history"
    # Number of commands to save (50 here)
    IRB.conf[:SAVE_HISTORY] = 50
    # The location of the history file
    IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-save-history"
    
    # List the given HDFS path from HBase shell.
    # Assumes the hadoop executable is in the bin directory under HADOOP_HOME.
    def ls(path)
      directory = "/" + path
      system("#{HADOOP_HOME}/bin/hadoop fs -ls #{directory}")
    end
    
    Kernel.at_exit do
      IRB.conf[:AT_EXIT].each do |i|
        i.call
      end
    end
    
  2. Save this file. We can now execute the clear and ls commands from HBase shell, as follows:
    hbase > clear
    hbase > ls '<HDFS path without the leading slash>'
    
  3. We can also assign the result of a command to a variable in HBase shell and use it later, as follows:
    hbase > var = create 'table','colFam'
    
  4. Now, we can use var to perform operations on the table, as follows:
    hbase > var.scan
    
    This scans the table; likewise, we can use the put, get, and other HBase commands with this variable.

  5. If a table has already been created, we can obtain a reference to it and assign it to a variable, as follows:
    hbase > var = get_table 'table'
    
  6. Now, we can use the var variable on HBase shell to perform various operations on the given table, as follows:
    hbase > var.scan
    hbase > var.put 'row','colfam:name','shashwat'
    hbase > var.disable
    

Likewise, we can use all the commands related to a table.

Getting the HBase timestamp from HBase shell

We can use HBase shell to convert a date and time to an HBase timestamp (milliseconds since the Unix epoch), which is useful when specifying the timestamp in some HBase commands, as follows:

hbase > import java.text.SimpleDateFormat
hbase > import java.text.ParsePosition
hbase > SimpleDateFormat.new("<date format>").parse("<date-time string>", ParsePosition.new(0)).getTime()

The following is an example:

hbase > SimpleDateFormat.new("yy/MM/dd HH:mm:ss").parse("14/07/01 09:00:00", ParsePosition.new(0)).getTime()

These three commands return the specified date and time as an HBase timestamp, which we can use in scan or other commands. The date-time string is interpreted in the JVM's default time zone.

For example, here we need a timestamp in the get command, as follows:

get 'tableToGetDataFrom', 'row1', {COLUMN => 'colFam:Name', TIMESTAMP => 1317945301466}

We can get the date-time data from an HBase timestamp, as follows:

hbase > import java.util.Date
hbase > Date.new(1317945301466).toString() 

This will show the equivalent date-time format of the specified timestamp.
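
The same two conversions can be sketched with Ruby's own Time class, which may be handy in scripts that run outside the shell. The helper names below are hypothetical, and the example assumes the date and time are given in UTC, whereas SimpleDateFormat uses the JVM's default time zone unless told otherwise.

```ruby
# Hypothetical helper: date and time (assumed UTC) to milliseconds since the epoch.
def to_hbase_timestamp(year, month, day, hour, min, sec)
  Time.utc(year, month, day, hour, min, sec).to_i * 1000
end

# Hypothetical helper: milliseconds since the epoch back to a UTC Time.
def from_hbase_timestamp(millis)
  # Time.at takes whole seconds plus a microsecond remainder.
  Time.at(millis / 1000, (millis % 1000) * 1000).utc
end

ts = to_hbase_timestamp(2014, 7, 1, 9, 0, 0)
puts ts
puts from_hbase_timestamp(ts)
puts from_hbase_timestamp(1317945301466)
```

Integer arithmetic is used for the conversion so that the millisecond part is not lost to floating-point rounding.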

Enabling debugging shell

We can execute the following command to enable more output on HBase shell about the commands we are executing:

hbase > debug

With debug mode on, HBase shell prints more detail, including fuller stack traces, for the commands we execute.

Enabling the debug level in HBase shell

We can start HBase shell with the debug level enabled using the following command:

hbase shell -d

Enabling SQL in HBase

Apache Phoenix is a separate project that lets us fetch data from HBase using the SQL commands we already know. The following description is taken from http://phoenix.apache.org:

"Apache Phoenix is a SQL skin over HBase delivered as a client-embedded JDBC driver targeting low latency queries over HBase data. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets."

Note

We can enable SQL on HBase by following the instructions at the following link, and then experiment with SQL queries on HBase:

http://phoenix.apache.org

A good place to get a list of scripts is https://github.com/search?q=hbase+script&ref=cmdform.
