Using the Hadoop tool or JARs for HBase

Hadoop provides a driver class through which we can run the programs bundled in the HBase JAR file, making use of Hadoop's own features. The command is as follows:

hadoop jar <HBase Jar file path>/hbase-*.jar <program name>

The program names we can use here are:

completebulkload: This is for a bulk data load
copytable: This is to copy a table's data from the local cluster to a peer cluster
export: This is to export data from an HBase table to HDFS as a sequence file
import: This is to import data written by export
importtsv: This is to import data in TSV format to HBase
rowcounter: This is to count rows in an HBase table using MapReduce
verifyrep: This is to compare the data from tables of different clusters
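
As a rough illustration of the command above, the following shell sketch assembles the `hadoop jar` invocation for one of the listed program names and prints it instead of running it. The function name `hbase_tool_cmd`, the `HBASE_JAR` variable, the JAR path, and the table name `mytable` are all hypothetical placeholders; replace them with the values for your own cluster.

```shell
#!/bin/sh
# Hypothetical placeholder: point this at the actual HBase JAR on your cluster.
HBASE_JAR="${HBASE_JAR:-/path/to/hbase.jar}"

# Print (do not run) the hadoop jar command for a given program name.
# Returns non-zero if the name is not one of the programs listed above.
hbase_tool_cmd() {
  program="$1"
  shift
  case "$program" in
    completebulkload|copytable|export|import|importtsv|rowcounter|verifyrep)
      cmd="hadoop jar $HBASE_JAR $program"
      # Append any remaining arguments (for example, a table name).
      [ "$#" -gt 0 ] && cmd="$cmd $*"
      echo "$cmd"
      ;;
    *)
      echo "unknown program: $program" >&2
      return 1
      ;;
  esac
}

# Dry run: show the command that would count the rows of a table.
hbase_tool_cmd rowcounter mytable
```

Printing the command first makes it easy to check the JAR path and program name before submitting a job to the cluster.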

We will discuss the preceding programs in the next chapter, where we will also discuss the backup/restore process. Besides calling the HBase JAR file through Hadoop, HBase ships the following tools, which are run with the hbase command:

HFile tool: This tool helps us to read the content of an HFile in text format. We can use it as:
```
hbase org.apache.hadoop.hbase.io.hfile.HFile
```
This is a very useful tool, as an HFile is not in a human-readable format; when we need to see its content, this tool fits well.
FSHLog tool: This tool can be used to read WAL files in a human-readable format. We can use it as:
```
hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump <hbaselocationlogfile>
```
We can also use it to split log files, as follows:
```
hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --split <hbaselocationlogfile>
```
We also have HLogPrettyPrinter, which prints the contents of an HBase log file, and WALPlayer, which replays WAL files.
Counting rows or cells efficiently: The built-in HBase counter is slow because it scans through the whole table, so huge tables take a long time. If we need to count the number of rows or cells in a table, we can do it in less time by running the count as a MapReduce job.
Use the following command to count rows as a MapReduce task:
```
hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>
```
The preceding command will show the number of rows in the specified HBase table. For more detailed statistics, we can use CellCounter, which we will see next.
CellCounter produces more detailed counts; once it completes, it reports the following:
- The number of rows in the table
- The number of column families across all rows
- The number of qualifiers across all rows
- The number of occurrences of each column family
- The number of occurrences of each qualifier
- The number of versions of each qualifier
We can use CellCounter as follows:
```
hbase org.apache.hadoop.hbase.mapreduce.CellCounter <tablename> <outputDir> [regex or prefix]
```
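The two counting commands above can be wrapped in a small helper that picks the right class and checks the argument count. This is only a sketch: the function name `count_cmd`, the table name `mytable`, and the output directory `/tmp/cellcounter-out` are hypothetical, and the helper prints the command rather than launching the MapReduce job.

```shell
#!/bin/sh
# Print (do not run) the counting command for a table.
# Usage: count_cmd rows  <tablename>
#        count_cmd cells <tablename> <outputDir> [regex or prefix]
count_cmd() {
  mode="$1"
  shift
  case "$mode" in
    rows)
      # RowCounter needs exactly the table name here.
      [ "$#" -eq 1 ] || { echo "usage: count_cmd rows <tablename>" >&2; return 1; }
      echo "hbase org.apache.hadoop.hbase.mapreduce.RowCounter $1"
      ;;
    cells)
      # CellCounter needs a table name and an output directory;
      # the regex or prefix is optional.
      [ "$#" -ge 2 ] || { echo "usage: count_cmd cells <tablename> <outputDir> [regex or prefix]" >&2; return 1; }
      echo "hbase org.apache.hadoop.hbase.mapreduce.CellCounter $*"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
}

# Dry runs with hypothetical names.
count_cmd rows mytable
count_cmd cells mytable /tmp/cellcounter-out
```

When actually running a printed CellCounter command with a regex, quote the pattern so that the shell does not expand it.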
Offline compaction tool: This can be used to run compactions in offline mode. It can be run as follows:
```
hbase org.apache.hadoop.hbase.regionserver.CompactionTool
```
