Using the driver class provided by Hadoop, we can run the programs packaged in the HBase JAR file, taking advantage of Hadoop's features, with the following command:
hadoop jar <HBase Jar file path>/hbase-*.jar <program name>
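As a minimal sketch of the preceding command, the following shell function assembles a hadoop jar invocation for one of the driver programs. The JAR path and the table name mytable are hypothetical placeholders, and rowcounter is one of the program names covered in this section. The commented line shows one common way to put the HBase classes on Hadoop's classpath; whether it is needed depends on your installation.

```shell
#!/usr/bin/env bash
# Build a "hadoop jar" command line for one of the HBase driver programs.
# Arguments: <jar path> <program name> <program arguments...>
hbase_driver_cmd() {
  local jar="$1" program="$2"
  shift 2
  echo "hadoop jar ${jar} ${program} $*"
}

# Hypothetical values: adjust the JAR path and table name to your setup.
hbase_driver_cmd "/usr/lib/hbase/hbase-server.jar" rowcounter mytable

# To actually run it, HBase classes usually must be on Hadoop's classpath:
#   HADOOP_CLASSPATH="$(hbase classpath)" hadoop jar <jar path> rowcounter mytable
```

The function only prints the command, so it can be checked before it is run against a cluster.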
The program names we can use here are:
completebulkload: This is for a bulk data load
copytable: This is to copy table data from the local cluster to a peer cluster
export: This is to export data from an HBase table to HDFS as a sequence file
import: This is to import data written by export
importtsv: This is to import data in TSV format to HBase
rowcounter: This is to count rows in an HBase table using MapReduce
verifyrep: This is to compare the data from tables of different clusters
We will discuss the preceding methods in the next chapter, where we will also discuss the backup/restore process. Likewise, we can run HBase utility classes with the hbase command. The following are some of these tools:
hbase org.apache.hadoop.hbase.io.hfile.HFile
This is a very useful tool, as an HFile is not in a human-readable format; if we need to see its content, this tool fits well.
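The HFile tool takes options that select what to print. The following sketch builds one such command; the HFile path is a hypothetical placeholder, and the flags shown (-v, -p, -m, -f) are the commonly documented ones, so check the tool's usage output for your HBase version before relying on them.

```shell
#!/usr/bin/env bash
# Build an HFile inspection command.
# -v: verbose output, -p: print key/value pairs,
# -m: print file metadata, -f: path of the HFile to read.
hfile_print_cmd() {
  local hfile_path="$1"
  echo "hbase org.apache.hadoop.hbase.io.hfile.HFile -v -p -m -f ${hfile_path}"
}

# Hypothetical path; list the table directory in HDFS to find real HFiles.
hfile_print_cmd "/hbase/data/default/mytable/region/cf/hfile"
```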
hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump <hbaselocationlogfile>
The preceding command dumps the contents of an HBase log file. We can also use it to split log files, as follows:
hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --split <hbaselocationlogfile>
We also have HLogPrettyPrinter, which prints the contents of an HBase log file, and WALPlayer, which replays WAL files.
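Both tools are started through the hbase command like the others in this section. The sketch below assembles the two invocations; the directory, file, and table names are hypothetical, and the argument order (WAL input directory followed by a comma-separated table list for WALPlayer, a log file for HLogPrettyPrinter) is an assumption based on the usual usage messages, so verify it against your HBase version.

```shell
#!/usr/bin/env bash
# Build a WALPlayer command that replays WAL files for the given tables.
# Arguments: <WAL input directory> <comma-separated table names>
wal_replay_cmd() {
  local wal_dir="$1" tables="$2"
  echo "hbase org.apache.hadoop.hbase.mapreduce.WALPlayer ${wal_dir} ${tables}"
}

# Build an HLogPrettyPrinter command that prints one log file.
wal_print_cmd() {
  local wal_file="$1"
  echo "hbase org.apache.hadoop.hbase.regionserver.wal.HLogPrettyPrinter ${wal_file}"
}

# Hypothetical paths and table names, for illustration only.
wal_replay_cmd "/hbase/oldWALs" "mytable"
wal_print_cmd "/hbase/oldWALs/example-wal-file"
```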
Use the following command to count rows as a MapReduce task:
hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>
The preceding command shows the number of rows in the specified HBase table. For more detailed statistics of records, we can use CellCounter, which we will see next.
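CellCounter produces more detailed counts. Once completed, its report includes the total number of rows, the totals of column families and qualifiers across all rows, the occurrences of each column family and qualifier, and the number of versions of each qualifier.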
We can use CellCounter as follows:
hbase org.apache.hadoop.hbase.mapreduce.CellCounter <tablename> <outputDir> [regex or prefix]
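To make the argument order concrete, here is a small sketch that builds a CellCounter command with and without the optional row filter. The table name mytable, the output directory /tmp/cellcounter-out, and the prefix row1 are hypothetical; the exact form of the optional argument can differ between HBase versions, so it is passed through unchanged.

```shell
#!/usr/bin/env bash
# Build a CellCounter command.
# Arguments: <table name> <output directory> [row filter]
cell_counter_cmd() {
  local table="$1" output_dir="$2" filter="${3:-}"
  local cmd="hbase org.apache.hadoop.hbase.mapreduce.CellCounter ${table} ${output_dir}"
  # Append the optional row filter only when one was supplied.
  if [ -n "$filter" ]; then
    cmd="${cmd} ${filter}"
  fi
  echo "$cmd"
}

# Hypothetical table, output directory, and row prefix.
cell_counter_cmd mytable /tmp/cellcounter-out
cell_counter_cmd mytable /tmp/cellcounter-out row1

# After the job finishes, the report files can be located in HDFS with:
#   hdfs dfs -ls /tmp/cellcounter-out
```

The output directory must not already exist when the job starts, as is usual for MapReduce jobs.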
hbase org.apache.hadoop.hbase.regionserver.CompactionTool