3. Client API: The Basics

Chapter 3. Client API: The Basics

This chapter will discuss the client APIs provided by HBase. As noted earlier, HBase is written in Java and so is its native API. This does not mean, though, that you must use Java to access HBase. In fact, Chapter 6 will show how you can use other programming languages.

General Notes

The primary client interface to HBase is the HTable class in the org.apache.hadoop.hbase.client package. It provides the user with all the functionality needed to store and retrieve data from HBase as well as delete obsolete values and so on. Before looking at the various methods this class provides, let us address some general aspects of its usage.

All operations that mutate data are guaranteed to be atomic on a per-row basis. This affects all other concurrent readers and writers of that same row. In other words, it does not matter if another client or thread is reading from or writing to the same row: they either read a consistent last mutation, or may have to wait before being able to apply their change.^[49] More on this in Chapter 8.

Suffice it to say for now that during normal operations and load, a reading client will not be affected by another updating a particular row since their contention is nearly negligible. There is, however, an issue with many clients trying to update the same row at the same time. Try to batch updates together to reduce the number of separate operations on the same row as much as possible.

It also does not matter how many columns are written for the particular row; all of them are covered by this guarantee of atomicity.
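To make the guarantee more tangible, here is a minimal plain-Java model of per-row atomicity. It is not HBase code: the class name RowAtomicitySketch and the idea of replacing an immutable row snapshot in one step are assumptions made purely for illustration. A reader sees either all columns of a put or none of them, no matter how many columns were written.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Plain-Java model of per-row atomic updates; not HBase code. */
final class RowAtomicitySketch {
  // Each row key maps to an immutable snapshot of that row's columns.
  private final ConcurrentHashMap<String, Map<String, String>> rows =
      new ConcurrentHashMap<>();

  /** Applies all column changes for one row in a single atomic step. */
  void put(String row, Map<String, String> columns) {
    rows.compute(row, (key, current) -> {
      Map<String, String> next =
          current == null ? new HashMap<>() : new HashMap<>(current);
      next.putAll(columns);
      return Collections.unmodifiableMap(next);
    });
  }

  /** Returns a consistent snapshot of the row, or an empty map if it is absent. */
  Map<String, String> get(String row) {
    return rows.getOrDefault(row, Collections.emptyMap());
  }

  public static void main(String[] args) {
    RowAtomicitySketch table = new RowAtomicitySketch();
    Map<String, String> columns = new HashMap<>();
    columns.put("colfam1:qual1", "val1");
    columns.put("colfam1:qual2", "val2");
    table.put("row1", columns); // both columns become visible together
    System.out.println(table.get("row1").size());
  }
}
```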

Finally, creating HTable instances is not without cost. Each instantiation involves scanning the .META. table to check if the table actually exists and if it is enabled, as well as a few other operations that make this call quite costly. Therefore, it is recommended that you create HTable instances only once—and one per thread—and reuse that instance for the rest of the lifetime of your client application.

As soon as you need multiple instances of HTable, consider using the HTablePool class (see HTablePool), which provides you with a convenient way to reuse multiple instances.

Note

Here is a summary of the points we just discussed:

Create HTable instances only once, usually when your application starts.
Create a separate HTable instance for every thread you execute (or use HTablePool).
Updates are atomic on a per-row basis.
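The "create once, one per thread, then reuse" advice can be sketched with a ThreadLocal. The ExpensiveHandle class below is a hypothetical stand-in for a client object that is costly to create; it is not part of the HBase API, and the sketch only demonstrates the reuse pattern.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Plain-Java sketch of "create once per thread, then reuse"; not HBase code. */
final class PerThreadHandleSketch {
  /** Hypothetical stand-in for a client object that is expensive to create. */
  static final class ExpensiveHandle {
    static final AtomicInteger CREATED = new AtomicInteger();

    ExpensiveHandle() {
      CREATED.incrementAndGet(); // imagine costly catalog lookups happening here
    }
  }

  // Each thread lazily creates its own handle on first use and keeps it.
  private static final ThreadLocal<ExpensiveHandle> HANDLE =
      ThreadLocal.withInitial(ExpensiveHandle::new);

  static ExpensiveHandle handle() {
    return HANDLE.get();
  }

  public static void main(String[] args) throws InterruptedException {
    Runnable work = () -> {
      for (int i = 0; i < 1000; i++) {
        handle(); // reused, not re-created, within one thread
      }
    };
    Thread t1 = new Thread(work);
    Thread t2 = new Thread(work);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    // Two worker threads, so two handles despite 2,000 calls.
    System.out.println("Handles created: " + ExpensiveHandle.CREATED.get());
  }
}
```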

CRUD Operations

The initial set of basic operations is often referred to as CRUD, which stands for create, read, update, and delete. HBase has a set of those and we will look into each of them subsequently. They are provided by the HTable class, and the remainder of this chapter will refer directly to the methods without specifically mentioning the containing class again.

Most of the following operations seem self-explanatory, but the subtle details warrant a close look. Along the way you will start to see a pattern of repeating functionality, which we explain once and then do not repeat.

Note

The examples you will see in partial source code can be found in full detail in the publicly available GitHub repository at https://github.com/larsgeorge/hbase-book. For details on how to compile them, see Building the Examples.

Initially you will see the import statements, but they will be subsequently omitted for the sake of brevity. Also, specific parts of the code are not listed if they do not immediately help with the topic explained. Refer to the full source if in doubt.

Put Method

This group of operations can be split into separate types: those that work on single rows and those that work on lists of rows. Since the latter involves some more complexity, we will look at each group separately. Along the way, you will also be introduced to accompanying client API features.

Single Puts

The very first method you may want to know about is one that lets you store data in HBase. Here is the call that lets you do that:

void put(Put put) throws IOException

It expects one or a list of Put objects that, in turn, are created with one of these constructors:

Put(byte[] row)
Put(byte[] row, RowLock rowLock)
Put(byte[] row, long ts)
Put(byte[] row, long ts, RowLock rowLock)

You need to supply a row to create a Put instance. A row in HBase is identified by a unique row key and—as is the case with most values in HBase—this is a Java byte[] array. You are free to choose any row key you like, but please also note that Chapter 9 provides a whole section on row key design (see Key Design). For now, we assume this can be anything, and often it represents a fact from the physical world—for example, a username or an order ID. These can be simple numbers but also UUIDs^[50] and so on.

HBase is kind enough to provide us with a helper class that has many static methods to convert Java types into byte[] arrays. Example 3-1 provides a short list of what it offers.

Example 3-1. Methods provided by the Bytes class

static byte[] toBytes(ByteBuffer bb)
static byte[] toBytes(String s)
static byte[] toBytes(boolean b)
static byte[] toBytes(long val)
static byte[] toBytes(float f)
static byte[] toBytes(int val)
...
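If you want a feel for what such conversions produce, the following JDK-only sketch shows comparable encodings. It is not the Bytes class itself: the class name ByteConversionSketch is made up for this illustration, and the choice of UTF-8 for strings and big-endian byte order for numbers is an assumption of the sketch that you should check against the Bytes documentation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** JDK-only sketch of typical type-to-byte[] conversions; not the Bytes class. */
final class ByteConversionSketch {
  static byte[] toBytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  static byte[] toBytes(long val) {
    // ByteBuffer is big-endian by default: most significant byte first.
    return ByteBuffer.allocate(Long.BYTES).putLong(val).array();
  }

  static byte[] toBytes(int val) {
    return ByteBuffer.allocate(Integer.BYTES).putInt(val).array();
  }

  static long toLong(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong();
  }

  public static void main(String[] args) {
    System.out.println(toBytes("row1").length); // 4 ASCII characters, 4 bytes
    System.out.println(toBytes(1L).length);     // a long always takes 8 bytes
    System.out.println(toLong(toBytes(42L)));   // round trip yields 42
  }
}
```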

Once you have created the Put instance you can add data to it. This is done using these methods:

Put add(byte[] family, byte[] qualifier, byte[] value)
Put add(byte[] family, byte[] qualifier, long ts, byte[] value)
Put add(KeyValue kv) throws IOException

Each call to add() specifies exactly one column, or, in combination with an optional timestamp, one single cell. Note that if you do not specify the timestamp with the add() call, the Put instance will use the optional timestamp parameter from the constructor (also called ts). If you set neither, leave it to the region server to assign the timestamp.
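The fallback order for timestamps can be modeled in a few lines of plain Java. This is a sketch of the rule as described here, not HBase source code; the resolve() helper and the use of Long.MAX_VALUE as the "not set" marker are assumptions chosen for the illustration.

```java
/** Plain-Java model of how a cell's timestamp could be resolved; not HBase code. */
final class TimestampResolutionSketch {
  /** Marker meaning "no timestamp given" (a choice made for this sketch). */
  static final long UNSET = Long.MAX_VALUE;

  static long resolve(long addTs, long constructorTs, long serverNow) {
    if (addTs != UNSET) {
      return addTs;         // an explicit timestamp in the add() call wins
    }
    if (constructorTs != UNSET) {
      return constructorTs; // otherwise fall back to the Put-level timestamp
    }
    return serverNow;       // otherwise the server assigns its current time
  }

  public static void main(String[] args) {
    long serverNow = System.currentTimeMillis();
    System.out.println(resolve(5L, 7L, serverNow));       // 5
    System.out.println(resolve(UNSET, 7L, serverNow));    // 7
    System.out.println(resolve(UNSET, UNSET, serverNow)); // the server time
  }
}
```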

The variant that takes an existing KeyValue instance is for advanced users that have learned how to retrieve, or create, this internal class. It represents a single, unique cell; like a coordinate system used with maps, it is addressed by the row key, column family, column qualifier, and timestamp, pointing to one value in a three-dimensional, cube-like system, where time is the third dimension.

One way to come across the internal KeyValue type is by using the reverse methods to add(), aptly named get():

List<KeyValue> get(byte[] family, byte[] qualifier)
Map<byte[], List<KeyValue>> getFamilyMap()

These two calls retrieve what you have added earlier, while having converted the unique cells into KeyValue instances. You can retrieve all cells for either an entire column family, a specific column within a family, or everything. The latter is the getFamilyMap() call, which you can then iterate over to check the details contained in each available KeyValue.

Note

Every KeyValue instance contains its full address—the row key, column family, qualifier, timestamp, and so on—as well as the actual data. It is the lowest-level class in HBase with respect to the storage architecture. Storage explains this in great detail. As for the available functionality in regard to the KeyValue class from the client API, see The KeyValue class.

Instead of having to iterate to check for the existence of specific cells, you can use the following set of methods:

boolean has(byte[] family, byte[] qualifier)
boolean has(byte[] family, byte[] qualifier, long ts)
boolean has(byte[] family, byte[] qualifier, byte[] value)
boolean has(byte[] family, byte[] qualifier, long ts, byte[] value)

They increasingly ask for more specific details and return true if a match can be found. The first method simply checks for the presence of a column. The others add the option to check for a timestamp, a given value, or both.

There are more methods provided by the Put class, summarized in Table 3-1.

Note

Note that the getters listed in Table 3-1 for the Put class only retrieve what you have set beforehand. They are rarely used, and make sense only when you, for example, prepare a Put instance in a private method in your code, and inspect the values in another place.

Table 3-1. Quick overview of additional methods provided by the Put class

Method	Description
`getRow()`	Returns the row key as specified when creating the `Put` instance.
`getRowLock()`	Returns the row `RowLock` instance for the current `Put` instance.
`getLockId()`	Returns the optional lock ID handed into the constructor using the `rowLock` parameter. Will be `-1L` if not set.
`setWriteToWAL()`	Allows you to disable the default functionality of writing the data to the server-side write-ahead log.
`getWriteToWAL()`	Indicates if the data will be written to the write-ahead log.
`getTimeStamp()`	Retrieves the associated timestamp of the `Put` instance. Can be optionally set using the constructor’s `ts` parameter. If not set, may return `Long.MAX_VALUE`.
`heapSize()`	Computes the heap space required for the current `Put` instance. This includes all contained data and space needed for internal structures.
`isEmpty()`	Checks if the family map contains any `KeyValue` instances.
`numFamilies()`	Convenience method to retrieve the size of the family map, containing all `KeyValue` instances.
`size()`	Returns the number of `KeyValue` instances that will be added with this `Put`.

Example 3-2 shows how all this is put together (no pun intended) into a basic application.

Note

The examples in this chapter use a very limited, but exact, set of data. When you look at the full source code you will notice that it uses an internal class named HBaseHelper. It is used to create a test table with a very specific number of rows and columns. This makes it much easier to compare the before and after.

Feel free to run the code as-is against a standalone HBase instance on your local machine for testing—or against a fully deployed cluster. Building the Examples explains how to compile the examples. Also, be adventurous and modify them to get a good feel for the functionality they demonstrate.

The example code usually first removes all data from a previous execution by dropping the table it has created. If you run the examples against a production cluster, please make sure that you have no name collisions. Usually the table is testtable to indicate its purpose.

Example 3-2. Application inserting data into HBase

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class PutExample {

  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create(); 

    HTable table = new HTable(conf, "testtable"); 

    Put put = new Put(Bytes.toBytes("row1")); 

    put.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val1")); 
    put.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual2"),
      Bytes.toBytes("val2")); 

    table.put(put); 
  }
}

Create the required configuration.
Instantiate a new client.
Create Put with specific row.
Add a column, whose name is “colfam1:qual1”, to the Put.
Add another column, whose name is “colfam1:qual2”, to the Put.
Store the row with the column into the HBase table.

This is a (nearly) full representation of the code used and every line is explained. The following examples will omit more and more of the boilerplate code so that you can focus on the important parts.

Client Configuration introduced the configuration files used by HBase client applications. They need access to the hbase-site.xml file to learn where the cluster resides—or you need to specify this location in your code.

Either way, you need to use an HBaseConfiguration class within your code to handle the configuration properties. This is done using one of the following static methods, provided by that class:

static Configuration create()
static Configuration create(Configuration that)

Example 3-2 is using create() to retrieve a Configuration instance. The second method allows you to hand in an existing configuration to merge with the HBase-specific one.

When you call any of the static create() methods, the code behind it will attempt to load two configuration files, hbase-default.xml and hbase-site.xml, using the current Java classpath.

If you specify an existing configuration, using create(Configuration that), it will take the highest precedence over the configuration files loaded from the classpath.

The HBaseConfiguration class actually extends the Hadoop Configuration class, but is still compatible with it: you could hand in a Hadoop configuration instance and it would be merged just fine.

After you have retrieved an HBaseConfiguration instance, you will have a merged configuration composed of the default values and anything that was overridden in the hbase-site.xml configuration file—and optionally the existing configuration you have handed in. You are then free to modify this configuration in any way you like, before you use it with your HTable instances. For example, you could override the ZooKeeper quorum address, to point to a different cluster:

Configuration config = HBaseConfiguration.create();
config.set("hbase.zookeeper.quorum", "zk1.foo.com,zk2.foo.com");

In other words, you could simply omit any external, client-side configuration file by setting the quorum property in code. That way, you create a client that needs no extra configuration.
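The layering of defaults, site file, and in-code overrides can be pictured as maps merged in order, with later layers winning. The sketch below is a plain-Java model only; the merge() helper and the sample values are hypothetical, and it does not reproduce how the Hadoop Configuration class is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model of layered configuration; not the Hadoop Configuration class. */
final class ConfigLayeringSketch {
  /** Merges the given layers in order; later layers override earlier ones. */
  @SafeVarargs
  static Map<String, String> merge(Map<String, String>... layers) {
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map<String, String> layer : layers) {
      merged.putAll(layer);
    }
    return merged;
  }

  public static void main(String[] args) {
    // Sample values only, standing in for hbase-default.xml, hbase-site.xml,
    // and a property set in code.
    Map<String, String> defaults = Map.of("hbase.zookeeper.quorum", "localhost");
    Map<String, String> site = Map.of("hbase.zookeeper.quorum", "zk1.foo.com");
    Map<String, String> inCode =
        Map.of("hbase.zookeeper.quorum", "zk1.foo.com,zk2.foo.com");

    // The value set in code is the one that survives the merge.
    System.out.println(merge(defaults, site, inCode).get("hbase.zookeeper.quorum"));
  }
}
```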

You should share the configuration instance for the reasons explained in Connection Handling.

You can, once again, make use of the command-line shell (see Quick-Start Guide) to verify that our insert has succeeded:

hbase(main):001:0> list
TABLE
testtable
1 row(s) in 0.0400 seconds

hbase(main):002:0> scan 'testtable'
ROW              COLUMN+CELL
row1             column=colfam1:qual1, timestamp=1294065304642, value=val1
1 row(s) in 0.2050 seconds

Another optional parameter while creating a Put instance is called ts, or timestamp. It allows you to store a value at a particular version in the HBase table.

A special feature of HBase is the possibility to store multiple versions of each cell (the value of a particular column). This is achieved by using timestamps for each of the versions and storing them in descending order. Each timestamp is a long integer value measured in milliseconds. It records the time that has passed since midnight, January 1, 1970 UTC—also known as Unix time^[51] or Unix epoch. Most operating systems provide a timer that can be read from programming languages. In Java, for example, you could use the System.currentTimeMillis() function.

When you put a value into HBase, you have the choice of either explicitly providing a timestamp or omitting that value, which in turn is then filled in by the RegionServer when the put operation is performed.

As noted in Requirements, you must make sure your servers have the proper time and are synchronized with one another. Clients might be outside your control, and therefore have a different time, possibly different by hours or sometimes even years.

As long as you do not specify the time in the client API calls, the server time will prevail. But once you allow or have to deal with explicit timestamps, you need to make sure you are not in for unpleasant surprises. Clients could insert values at unexpected timestamps and cause seemingly unordered version histories.

While most applications never worry about versioning and rely on the built-in handling of the timestamps by HBase, you should be aware of a few peculiarities when using them explicitly.

Here is a larger example of inserting multiple versions of a cell and how to retrieve them:

hbase(main):001:0> create 'test', 'cf1'             
0 row(s) in 0.9810 seconds

hbase(main):002:0> put 'test', 'row1', 'cf1', 'val1'
0 row(s) in 0.0720 seconds

hbase(main):003:0> put 'test', 'row1', 'cf1', 'val2'
0 row(s) in 0.0520 seconds

hbase(main):004:0> scan 'test'                      
ROW              COLUMN+CELL
 row1            column=cf1:, timestamp=1297853125623, value=val2
1 row(s) in 0.0790 seconds

hbase(main):005:0> scan 'test', { VERSIONS => 3 }   
ROW              COLUMN+CELL
 row1            column=cf1:, timestamp=1297853125623, value=val2
 row1            column=cf1:, timestamp=1297853122412, value=val1
1 row(s) in 0.0640 seconds

The example creates a table named test with one column family named cf1. Then two put commands are issued with the same row and column key, but two different values: val1 and val2, respectively. Then a scan operation is used to see the full content of the table. You may not be surprised to see only val2, as you could assume you have simply replaced val1 with the second put call.

But that is not the case in HBase. By default, it keeps three versions of a value and you can use this fact to slightly modify the scan operation to get all available values (i.e., versions) instead. The last call in the example lists both versions you have saved. Note how the row key stays the same in the output; you get all cells as separate lines in the shell’s output.

For both operations, scan and get, you only get the latest (also referred to as the newest) version, because HBase saves versions in time descending order and is set to return only one version by default. Adding the maximum version parameter to the calls allows you to retrieve more than one. Set it to the aforementioned Integer.MAX_VALUE and you get all available versions.

The term maximum versions stems from the fact that you may have fewer versions in a particular cell. The example sets VERSIONS (a shortcut for MAX_VERSIONS) to “3”, but since only two are stored, that is all that is shown.

Another option to retrieve more versions is to use the time range parameter these calls expose. They let you specify a start and end time and will retrieve all versions matching the time range. More on this in Get Method and Scans.
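The versioning behavior described so far can be modeled with a sorted map that keeps the newest timestamp first. This is a plain-Java sketch, not HBase code: the VersionedCellSketch class is hypothetical, and treating the time range as including the start and excluding the end is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

/** Plain-Java model of one multi-versioned cell; not HBase code. */
final class VersionedCellSketch {
  // Timestamp -> value, sorted newest first.
  private final TreeMap<Long, String> versions =
      new TreeMap<>(Comparator.reverseOrder());

  void put(long timestamp, String value) {
    versions.put(timestamp, value);
  }

  /** Returns up to maxVersions values, newest first. */
  List<String> get(int maxVersions) {
    List<String> result = new ArrayList<>();
    for (String value : versions.values()) {
      if (result.size() == maxVersions) {
        break;
      }
      result.add(value);
    }
    return result;
  }

  /** Returns all values with minTs <= timestamp < maxTs, newest first. */
  List<String> getRange(long minTs, long maxTs) {
    // With a reversed comparator, the "from" key is the larger timestamp.
    return new ArrayList<>(versions.subMap(maxTs, false, minTs, true).values());
  }

  public static void main(String[] args) {
    VersionedCellSketch cell = new VersionedCellSketch();
    cell.put(1297853122412L, "val1");
    cell.put(1297853125623L, "val2");
    System.out.println(cell.get(1)); // [val2]
    System.out.println(cell.get(3)); // [val2, val1]
  }
}
```

Asking for three versions returns only two here, which mirrors the point that the maximum is an upper bound, not a promise.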

There are many more subtle (and not so subtle) issues with versioning and we will discuss them in Read Path, as well as revisit the advanced concepts and nonstandard behavior in Versioning.

When you do not specify the ts parameter, the timestamp is implicitly set to the current time of the RegionServer responsible for the given row at the moment it is added to the underlying storage.

The constructors of the Put class have another optional parameter, called rowLock. It gives you the ability to hand in an external row lock, something discussed in Row Locks. Suffice it to say for now that you can create your own RowLock instance that can be used to prevent other clients from accessing specific rows while you are modifying them repeatedly.

The KeyValue class

From your code you may have to deal with KeyValue instances directly. As you may recall from our discussion earlier in this book, these instances contain the data as well as the coordinates of one specific cell. The coordinates are the row key, name of the column family, column qualifier, and timestamp. The class provides a plethora of constructors that allow you to combine all of these in many variations. The fully specified constructor looks like this:

KeyValue(byte[] row, int roffset, int rlength,
  byte[] family, int foffset, int flength, byte[] qualifier, int qoffset, 
  int qlength, long timestamp, Type type, byte[] value, int voffset, 
  int vlength)

Note

Be advised that the KeyValue class, and its accompanying comparators, are designed for internal use. They are available in a few places in the client API to give you access to the raw data so that extra copy operations can be avoided. They also allow byte-level comparisons, rather than having to rely on a slower, class-level comparison.

The data as well as the coordinates are stored as a Java byte[], that is, as a byte array. The design behind this type of low-level storage is to allow for arbitrary data, but also to be able to efficiently store only the required bytes, keeping the overhead of internal data structures to a minimum. This is also the reason that there is an offset and length parameter for each byte array parameter. They allow you to pass in existing byte arrays while doing very fast byte-level operations.
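To see why offset and length parameters pay off, consider comparing a slice of one array against a slice of another without copying either. The following JDK-only sketch (Java 9 or later) is an illustration of the idea; the ByteRangeSketch class and its compare() helper are hypothetical and not part of the KeyValue API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** JDK-only sketch of in-place byte range comparison; not part of KeyValue. */
final class ByteRangeSketch {
  /** Compares two byte ranges in place, treating bytes as unsigned values. */
  static int compare(byte[] a, int aOffset, int aLength,
                     byte[] b, int bOffset, int bLength) {
    return Arrays.compareUnsigned(a, aOffset, aOffset + aLength,
                                  b, bOffset, bOffset + bLength);
  }

  public static void main(String[] args) {
    // One backing array holding a row key followed by a family name.
    byte[] buffer = "row1colfam1".getBytes(StandardCharsets.US_ASCII);
    byte[] other = "xxrow1".getBytes(StandardCharsets.US_ASCII);

    // Compare "row1" inside buffer with "row1" inside other; no copies are made.
    System.out.println(compare(buffer, 0, 4, other, 2, 4)); // 0, the ranges match
  }
}
```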

For every member of the coordinates, there is a getter that can retrieve the byte arrays and their given offset and length. This also can be accessed at the topmost level, that is, the underlying byte buffer:

byte[] getBuffer()
int getOffset()
int getLength()

They return the full byte array details backing the current KeyValue instance. There will be few occasions where you will ever have to go that far. But it is available and you can make use of it—if need be.

Two very interesting methods to know are:

byte [] getRow()
byte [] getKey()

The question you may ask yourself is: what is the difference between a row and a key? While you will learn about the difference in Storage, for now just remember that the row is what we have been referring to alternatively as the row key, that is, the row parameter of the Put constructor, and the key is what was previously introduced as the coordinates of a cell—in their raw, byte array format. In practice, you hardly ever have to use getKey() but will be more likely to use getRow().

The KeyValue class also provides a large list of internal classes implementing the Comparator interface. They can be used in your own code to do the same comparisons as done inside HBase. This is useful when retrieving KeyValue instances using the API and further sorting or processing them in order. They are listed in Table 3-2.

Table 3-2. Brief overview of comparators provided by the KeyValue class

Comparator	Description
`KeyComparator`	Compares two `KeyValue` keys, i.e., what is returned by the `getKey()` method, in their raw, byte array format.
`KVComparator`	Wraps the raw `KeyComparator`, providing the same functionality based on two given `KeyValue` instances.
`RowComparator`	Compares the row key (returned by `getRow()`) of two `KeyValue` instances.
`MetaKeyComparator`	Compares two keys of `.META.` entries in their raw, byte array format.
`MetaComparator`	Special version of the `KVComparator` class for the entries in the `.META.` catalog table. Wraps the `MetaKeyComparator`.
`RootKeyComparator`	Compares two keys of `-ROOT-` entries in their raw, byte array format.
`RootComparator`	Special version of the `KVComparator` class for the entries in the `-ROOT-` catalog table. Wraps the `RootKeyComparator`.

The KeyValue class exports most of these comparators as a static instance for each class. For example, there is a public field named KEY_COMPARATOR, giving access to a KeyComparator instance. The COMPARATOR field is pointing to an instance of the more frequently used KVComparator class. So instead of creating your own instances, you could use a provided one—for example, when creating a set holding KeyValue instances that should be sorted in the same order that HBase is using internally:

TreeSet<KeyValue> set =
  new TreeSet<KeyValue>(KeyValue.COMPARATOR);
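The same idea can be tried out without any HBase classes. The sketch below sorts plain row keys with an unsigned, byte-wise comparator from the JDK (Java 9 or later). It is a simplification: the SortedKeysSketch class is hypothetical, and it orders row keys only, while the comparators in Table 3-2 work on complete KeyValue instances.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** JDK-only sketch of a sorted set ordered by raw bytes; not KeyValue.COMPARATOR. */
final class SortedKeysSketch {
  // Lexicographic comparison of byte arrays, treating bytes as unsigned.
  static final Comparator<byte[]> BYTE_ORDER = Arrays::compareUnsigned;

  static List<String> sorted(String... keys) {
    TreeSet<byte[]> set = new TreeSet<>(BYTE_ORDER);
    for (String key : keys) {
      set.add(key.getBytes(StandardCharsets.UTF_8));
    }
    List<String> result = new ArrayList<>();
    for (byte[] key : set) {
      result.add(new String(key, StandardCharsets.UTF_8));
    }
    return result;
  }

  public static void main(String[] args) {
    // Byte-wise order is not numeric order: "row10" sorts before "row2".
    System.out.println(sorted("row2", "row10", "row1")); // [row1, row10, row2]
  }
}
```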

There is one more field per KeyValue instance that is representing an additional dimension for its unique coordinates: the type. Table 3-3 lists the possible values.

Table 3-3. The possible type values for a given KeyValue instance

Type	Description
`Put`	The `KeyValue` instance represents a normal `Put` operation.
`Delete`	This instance of `KeyValue` represents a `Delete` operation, also known as a tombstone marker.
`DeleteColumn`	This is the same as `Delete`, but more broadly deletes an entire column.
`DeleteFamily`	This is the same as `Delete`, but more broadly deletes an entire column family, including all contained columns.

You can see the type of an existing KeyValue instance by, for example, using another provided call:

String toString()

This prints out the meta information of the current KeyValue instance, and has the following format:

<row-key>/<family>:<qualifier>/<version>/<type>/<value-length>

This is used by some of the example code for this book to check if data has been set or retrieved, and what the meta information is.

The class has many more convenience methods that allow you to compare parts of the stored data, as well as check what type it is, get its computed heap size, clone or copy it, and more. There are static methods to create special instances of KeyValue that can be used for comparisons, or when manipulating data at that low a level within HBase. You should consult the provided Java documentation to learn more about them.^[52] Also see Storage for a detailed explanation of the raw, binary format.

Client-side write buffer

Each put operation is effectively an RPC^[53] that is transferring data from the client to the server and back. This is OK for a low number of operations, but not for applications that need to store thousands of values per second into a table.

Note

The importance of reducing the number of separate RPC calls is tied to the round-trip time, which is the time it takes for a client to send a request and the server to send a response over the network. This does not include the time required for the data transfer. It simply is the overhead of sending packets over the wire. On average, these take about 1 ms on a LAN, which means you can handle only 1,000 round-trips per second.

The other important factor is the message size: if you send large requests over the network, you already need a much lower number of round-trips, as most of the time is spent transferring data. But when doing, for example, counter increments, which are small in size, you will see better performance when batching updates into fewer requests.
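The arithmetic behind this note is easy to reproduce. The sketch below computes an upper bound on the rows a single client can store per second when it issues one RPC after another. It deliberately ignores the data transfer time, and the rowsPerSecond() helper is a hypothetical name used for the illustration only.

```java
/** Back-of-the-envelope model of round-trip overhead; ignores transfer time. */
final class RoundTripSketch {
  /** Upper bound on rows stored per second by one client issuing sequential RPCs. */
  static long rowsPerSecond(double roundTripMillis, int rowsPerRpc) {
    double rpcsPerSecond = 1000.0 / roundTripMillis;
    return Math.round(rpcsPerSecond * rowsPerRpc);
  }

  public static void main(String[] args) {
    System.out.println(rowsPerSecond(1.0, 1));   // 1000: one row per round-trip
    System.out.println(rowsPerSecond(1.0, 100)); // 100000: 100 rows per round-trip
  }
}
```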

The HBase API comes with a built-in client-side write buffer that collects put operations so that they are sent in one RPC call to the server(s). The global switch to control if it is used or not is represented by the following methods:

void setAutoFlush(boolean autoFlush)
boolean isAutoFlush()

By default, the client-side buffer is not enabled. You activate the buffer by setting auto-flush to false, by invoking:

table.setAutoFlush(false)

This enables the client-side buffering mechanism. You can check the state of the flag with the isAutoFlush() method: it returns true when you initially create the HTable instance, and otherwise the current state as set by your code.

Once you have activated the buffer, you can store data into HBase as shown in Single Puts. You do not cause any RPCs to occur, though, because the Put instances you stored are kept in memory in your client process. When you want to force the data to be written, you can call another API function:

void flushCommits() throws IOException

The flushCommits() method ships all the modifications to the remote server(s). The buffered Put instances can span many different rows. The client is smart enough to batch these updates accordingly and send them to the appropriate region server(s). Just as with the single put() call, you do not have to worry about where data resides, as this is handled transparently for you by the HBase client. Figure 3-1 shows how the operations are sorted and grouped before they are shipped over the network, with one single RPC per region server.

Figure 3-1. The client-side puts sorted and grouped by region server
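The grouping step can be sketched in plain Java. This is a model under stated assumptions, not the client's actual implementation: the GroupByServerSketch class, the region layout, and the server names are all hypothetical. Each region is identified by its start key, and a row belongs to the region with the greatest start key that is less than or equal to the row key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of grouping buffered rows by server; not HBase code. */
final class GroupByServerSketch {
  // Region start key -> hosting server; a hypothetical layout for this sketch.
  private final TreeMap<String, String> regions = new TreeMap<>();

  void addRegion(String startKey, String server) {
    regions.put(startKey, server);
  }

  String serverFor(String rowKey) {
    // Requires a first region with the empty start key so a match always exists.
    return regions.floorEntry(rowKey).getValue();
  }

  /** Groups row keys into one batch per server, i.e., one RPC per server. */
  Map<String, List<String>> group(List<String> rowKeys) {
    Map<String, List<String>> batches = new TreeMap<>();
    for (String row : rowKeys) {
      batches.computeIfAbsent(serverFor(row), s -> new ArrayList<>()).add(row);
    }
    return batches;
  }

  public static void main(String[] args) {
    GroupByServerSketch client = new GroupByServerSketch();
    client.addRegion("", "server1");     // rows before "row2"
    client.addRegion("row2", "server2"); // rows from "row2" onward
    System.out.println(client.group(Arrays.asList("row1", "row3", "row2", "row0")));
    // {server1=[row1, row0], server2=[row3, row2]}
  }
}
```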

While you can force a flush of the buffer, this is usually not necessary, as the API tracks how much data you are buffering by counting the required heap size of every instance you have added. This tracks the entire overhead of your data, also including necessary internal data structures. Once you go over a specific limit, the client will call the flush command for you implicitly. You can control the configured maximum allowed client-side write buffer size with these calls:

long getWriteBufferSize()
void setWriteBufferSize(long writeBufferSize) throws IOException

The default size is a moderate 2 MB (or 2,097,152 bytes) and assumes you are inserting reasonably small records into HBase, that is, each a fraction of that buffer size. If you were to store larger data, you may want to consider increasing this value to allow your client to efficiently group together a certain number of records per RPC.

Note

Setting this value for every HTable instance you create may seem cumbersome and can be avoided by adding a higher value to your local hbase-site.xml configuration file—for example, adding:

<property>
  <name>hbase.client.write.buffer</name>
  <value>20971520</value>
</property>

This will increase the limit to 20 MB.

The buffer is only ever flushed on two occasions:

Explicit flush

Use the flushCommits() call to send the data to the servers for permanent storage.

Implicit flush

This is triggered when you call put() or setWriteBufferSize(). Both calls compare the currently used buffer size with the configured limit and, if the limit is exceeded, invoke the flushCommits() method. If the buffer is disabled entirely, that is, auto-flush is set to true with setAutoFlush(true), the client calls the flush method on every invocation of put().

Another call triggering the flush implicitly and unconditionally is the close() method of HTable.
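The flush rules above can be captured in a small plain-Java model. It is not the HTable implementation: the WriteBufferSketch class borrows the method names only to make the mapping obvious, counts raw record bytes instead of the real heap size, and uses a second list as a stand-in for the remote servers.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of a client-side write buffer; not the HTable implementation. */
final class WriteBufferSketch {
  private final List<byte[]> buffer = new ArrayList<>();
  private final List<byte[]> server = new ArrayList<>(); // stands in for the remote side
  private long bufferedBytes = 0;
  private long limitBytes;
  private boolean autoFlush = true; // buffering is off by default
  int rpcCount = 0;

  WriteBufferSketch(long limitBytes) {
    this.limitBytes = limitBytes;
  }

  void setAutoFlush(boolean autoFlush) {
    this.autoFlush = autoFlush;
  }

  void put(byte[] record) {
    buffer.add(record);
    bufferedBytes += record.length;
    if (autoFlush || bufferedBytes > limitBytes) {
      flushCommits(); // implicit flush
    }
  }

  void setWriteBufferSize(long newLimit) {
    limitBytes = newLimit;
    if (bufferedBytes > limitBytes) {
      flushCommits(); // implicit flush when the new limit is already exceeded
    }
  }

  void flushCommits() {
    if (buffer.isEmpty()) {
      return;
    }
    server.addAll(buffer); // one simulated RPC ships everything buffered
    buffer.clear();
    bufferedBytes = 0;
    rpcCount++;
  }

  void close() {
    flushCommits(); // closing always flushes what is left
  }

  int storedOnServer() {
    return server.size();
  }

  public static void main(String[] args) {
    WriteBufferSketch table = new WriteBufferSketch(10); // tiny 10-byte limit
    table.setAutoFlush(false);
    table.put(new byte[4]);
    table.put(new byte[4]);
    System.out.println(table.storedOnServer()); // 0: both records still buffered
    table.put(new byte[4]);                     // 12 bytes exceed the limit
    System.out.println(table.storedOnServer()); // 3
    table.put(new byte[4]);
    table.close();
    System.out.println(table.storedOnServer()); // 4
  }
}
```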

Example 3-3 shows how the write buffer is controlled from the client API.

Example 3-3. Using the client-side write buffer

    HTable table = new HTable(conf, "testtable");
    System.out.println("Auto flush: " + table.isAutoFlush());  

    table.setAutoFlush(false); 

    Put put1 = new Put(Bytes.toBytes("row1"));
    put1.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val1"));
    table.put(put1); 

    Put put2 = new Put(Bytes.toBytes("row2"));
    put2.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val2"));
    table.put(put2);

    Put put3 = new Put(Bytes.toBytes("row3"));
    put3.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val3"));
    table.put(put3);

    Get get = new Get(Bytes.toBytes("row1"));
    Result res1 = table.get(get);
    System.out.println("Result: " + res1); 

    table.flushCommits(); 

    Result res2 = table.get(get);
    System.out.println("Result: " + res2);

Check what the auto flush flag is set to; should print “Auto flush: true”.
Set the auto flush to false to enable the client-side write buffer.
Store some rows with columns into HBase.
Try to load previously stored row. This will print “Result: keyvalues=NONE”.
Force a flush. This causes an RPC to occur.
Now the row is persisted and can be loaded.

This example also shows a specific behavior of the buffer that you may not anticipate. Let’s see what it prints out when executed:

Auto flush: true
Result: keyvalues=NONE
Result: keyvalues={row1/colfam1:qual1/1300267114099/Put/vlen=4}

While you have not seen the get() operation yet, you should still be able to correctly infer what it does, that is, reading data back from the servers. But for the first get() in the example, the API returns a NONE value—what does that mean? It is caused by the fact that the client write buffer is an in-memory structure that is literally holding back any unflushed records. Nothing was sent to the servers yet, and therefore you cannot access it.

Note

If you were ever required to access the write buffer content, you would find that ArrayList<Put> getWriteBuffer() can be used to get the internal list of buffered Put instances you have added so far by calling table.put(put).

I mentioned earlier that it is exactly that list that makes HTable not safe for multithreaded use. Be very careful with what you do to that list when accessing it directly. You are bypassing the heap size checks, or you might modify it while a flush is in progress!

Warning

Since the client buffer is a simple list retained in the local process memory, you need to be careful not to run into a problem that terminates the process mid-flight. If that were to happen, any data that has not yet been flushed will be lost! The servers will have never received that data, and therefore there will be no copy of it that can be used to recover from this situation.

Also note that a bigger buffer takes more memory, on both the client and server side, since the server instantiates the passed write buffer to process it. On the other hand, a larger buffer size reduces the number of RPCs made. For an estimate of server-side memory used, evaluate hbase.client.write.buffer * hbase.regionserver.handler.count * number of region servers.
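As a worked instance of that estimate, the following sketch multiplies the three factors. The concrete numbers are assumptions chosen for illustration (a 2 MB write buffer, 10 handlers per server, and a hypothetical cluster of 20 region servers); substitute the values from your own configuration.

```java
// Back-of-the-envelope estimate; all input values below are assumptions.
class WriteBufferMemoryEstimate {
  // Upper bound in bytes: buffer size * handlers per server * number of servers.
  static long estimateBytes(long writeBufferBytes, int handlerCount, int regionServers) {
    return writeBufferBytes * handlerCount * regionServers;
  }

  public static void main(String[] args) {
    long writeBuffer = 2L * 1024 * 1024; // assumed hbase.client.write.buffer (2 MB)
    int handlers = 10;                   // assumed hbase.regionserver.handler.count
    int servers = 20;                    // hypothetical cluster size
    long total = estimateBytes(writeBuffer, handlers, servers);
    System.out.println(total / (1024 * 1024) + " MB across the cluster"); // 400 MB
  }
}
```

With these inputs each region server may hold up to 20 MB of incoming write buffers at once, or 400 MB across the cluster.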

Referring to the round-trip time again, if you only store large cells, the local buffer is less useful, since the round trip is then dominated by the transfer time. In this case, you are better advised to not increase the client buffer size.

List of Puts

The client API has the ability to insert single Put instances as shown earlier, but it also has the advanced feature of batching operations together. This comes in the form of the following call:

void put(List<Put> puts) throws IOException

You will have to create a list of Put instances and hand it to this call. Example 3-4 updates the previous example by creating a list to hold the mutations and eventually calling the list-based put() method.

Example 3-4. Inserting data into HBase using a list

    List<Put> puts = new ArrayList<Put>(); 

    Put put1 = new Put(Bytes.toBytes("row1"));
    put1.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val1"));
    puts.add(put1); 

    Put put2 = new Put(Bytes.toBytes("row2"));
    put2.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val2"));
    puts.add(put2); 

    Put put3 = new Put(Bytes.toBytes("row2"));
    put3.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual2"),
      Bytes.toBytes("val3"));
    puts.add(put3); 

    table.put(puts);

: Create a list that holds the Put instances.
: Add a Put to the list.
: Add another Put to the list.
: Add a third Put to the list.
: Store multiple rows with columns into HBase.

A quick check with the HBase Shell reveals that the rows were stored as expected. Note that the example actually modified three columns, but in two rows only. It added two columns into the row with the key row2, using two separate qualifiers, qual1 and qual2, creating two uniquely named columns in the same row.

hbase(main):001:0> scan 'testtable'
ROW              COLUMN+CELL
 row1            column=colfam1:qual1, timestamp=1300108258094, value=val1
 row2            column=colfam1:qual1, timestamp=1300108258094, value=val2
 row2            column=colfam1:qual2, timestamp=1300108258098, value=val3
2 row(s) in 0.1590 seconds

Since you are issuing a list of row mutations to possibly many different rows, there is a chance that not all of them will succeed. This could be due to a few reasons, for example, when there is an issue with one of the region servers and the client-side retry mechanism needs to give up because the number of retries has exceeded the configured maximum. If there is a problem with any of the put calls on the remote servers, the error is reported back to you subsequently in the form of an IOException.

Example 3-5 uses a bogus column family name to insert a column. Since the client is not aware of the structure of the remote table—it could have been altered since it was created—this check is done on the server side.

Example 3-5. Inserting a faulty column family into HBase

    Put put1 = new Put(Bytes.toBytes("row1"));
    put1.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val1"));
    puts.add(put1);
    Put put2 = new Put(Bytes.toBytes("row2"));
    put2.add(Bytes.toBytes("BOGUS"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val2")); 
    puts.add(put2);
    Put put3 = new Put(Bytes.toBytes("row2"));
    put3.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual2"),
      Bytes.toBytes("val3"));
    puts.add(put3);

    table.put(puts);

: Add a Put with a nonexistent family to the list.
: Store multiple rows with columns into HBase.

The call to put() fails with the following (or similar) error message:

org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
 Failed 1 action: NoSuchColumnFamilyException: 1 time, 
 servers with issues: 10.0.0.57:51640,

You may wonder what happened to the other, nonfaulty puts in the list. Using the shell again you should see that the two correct puts have been applied:

hbase(main):001:0> scan 'testtable'
ROW              COLUMN+CELL
 row1            column=colfam1:qual1, timestamp=1300108925848, value=val1
 row2            column=colfam1:qual2, timestamp=1300108925848, value=val3
2 row(s) in 0.0640 seconds

The servers iterate over all operations and try to apply them. The failed ones are returned and the client reports the remote error using the RetriesExhaustedWithDetailsException, giving you insight into how many operations have failed, with what error, and how many times it has retried to apply the erroneous modification. It is interesting to note that, for the bogus column family, the retry is automatically set to 1 (see the NoSuchColumnFamilyException: 1 time), as this is an error from which HBase cannot recover.

Those Put instances that have failed on the server side are kept in the local write buffer. They will be retried the next time the buffer is flushed. You can also access them using the getWriteBuffer() method of HTable and take, for example, evasive actions.

Some checks are done on the client side, though, for example, to ensure that the put has at least one column specified, that is, that it is not completely empty. If such a check fails, the client throws an exception and leaves the operations preceding the faulty one in the client buffer.

Warning

The list-based put() call uses the client-side write buffer: it inserts all puts into the local buffer and then calls flushCommits() implicitly. While inserting each instance of Put, the client API performs the mentioned check. If the check fails, for example, at the third put out of five, the first two are added to the buffer, while the failing put and the last two are not. The flush is then not triggered at all.
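The control flow described in this warning can be sketched with plain Java collections. This is a simplified model under stated assumptions, not the HTable code: ListPutCheckSketch is a hypothetical class, each put is represented only by its list of column names, and the error message is borrowed from the output shown in this section.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the list-based put; NOT the HBase implementation.
class ListPutCheckSketch {
  final List<List<String>> buffer = new ArrayList<>();
  int flushes = 0;

  // Each "put" is modeled as its list of columns; an empty list is invalid.
  void put(List<List<String>> puts) {
    for (List<String> put : puts) {
      if (put.isEmpty()) {
        throw new IllegalArgumentException("No columns to insert");
      }
      buffer.add(put);
    }
    flushCommits(); // only reached when every put passed the check
  }

  void flushCommits() {
    flushes++;
    buffer.clear();
  }

  public static void main(String[] args) {
    ListPutCheckSketch table = new ListPutCheckSketch();
    List<List<String>> puts = Arrays.asList(
        Arrays.asList("colfam1:qual1"),
        Arrays.asList("colfam1:qual2"),
        new ArrayList<String>(),          // the third put is empty
        Arrays.asList("colfam1:qual3"),
        Arrays.asList("colfam1:qual4"));
    try {
      table.put(puts);
    } catch (IllegalArgumentException e) {
      System.out.println("Error: " + e.getMessage());
    }
    // Two puts remain buffered and no flush has happened.
    System.out.println("Buffered: " + table.buffer.size() + ", flushes: " + table.flushes);
  }
}
```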

You could catch the exception and flush the write buffer manually to apply those modifications. Example 3-6 shows one approach to handle this.

Example 3-6. Inserting an empty Put instance into HBase

    Put put1 = new Put(Bytes.toBytes("row1"));
    put1.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val1"));
    puts.add(put1);
    Put put2 = new Put(Bytes.toBytes("row2"));
    put2.add(Bytes.toBytes("BOGUS"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val2"));
    puts.add(put2);
    Put put3 = new Put(Bytes.toBytes("row2"));
    put3.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual2"),
      Bytes.toBytes("val3"));
    puts.add(put3);
    Put put4 = new Put(Bytes.toBytes("row2"));
    puts.add(put4); 

    try {
      table.put(puts);
    } catch (Exception e) {
      System.err.println("Error: " + e);
      table.flushCommits(); 
    }

: Add a put with no content at all to the list.
: Catch a local exception and commit queued updates.

The example code this time should give you two errors, similar to:

Error: java.lang.IllegalArgumentException: No columns to insert
Exception in thread "main" 
 org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: 
 Failed 1 action: NoSuchColumnFamilyException: 1 time, 
 servers with issues: 10.0.0.57:51640,

The first Error is the client-side check, while the second is the remote exception that now is caused by calling table.flushCommits() inside the catch block.

Warning

Since you possibly have the client-side write buffer enabled—refer to Client-side write buffer—you will find that the exception is not reported right away, but is delayed until the buffer is flushed.

You need to watch out for a peculiarity using the list-based put call: you cannot control the order in which the puts are applied on the server side, which implies that the order in which the servers are called is also not under your control. Use this call with caution if you have to guarantee a specific order. In the worst case, you need to create smaller batches and explicitly flush the client-side write buffer to enforce that they are sent to the remote servers.

Atomic compare-and-set

There is a special variation of the put calls that warrants its own section: check and put. The method signature is:

boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier,
  byte[] value, Put put) throws IOException

This call allows you to issue atomic, server-side mutations that are guarded by an accompanying check. If the check passes successfully, the put operation is executed; otherwise, it aborts the operation completely. It can be used to update data based on current, possibly related, values.

Such guarded operations are often used in systems that handle, for example, account balances, state transitions, or data processing. The basic principle is that you read data at one point in time and process it. Once you are ready to write back the result, you want to make sure that no other client has done the same already. You use the atomic check to verify that the value has not been modified in the meantime, and only then apply your new value.

Note

A special type of check can be performed using the checkAndPut() call: only update if another value is not already present. This is achieved by setting the value parameter to null. In that case, the operation would succeed when the specified column is nonexistent.

The call returns a boolean result value, indicating whether the Put has been applied or not, returning true or false, respectively. Example 3-7 shows the interactions between the client and the server, returning the expected results.

Example 3-7. Application using the atomic compare-and-set operations

    Put put1 = new Put(Bytes.toBytes("row1"));
    put1.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val1")); 

    boolean res1 = table.checkAndPut(Bytes.toBytes("row1"),
      Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), null, put1); 
    System.out.println("Put applied: " + res1); 

    boolean res2 = table.checkAndPut(Bytes.toBytes("row1"),
      Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), null, put1); 
    System.out.println("Put applied: " + res2); 

    Put put2 = new Put(Bytes.toBytes("row1"));
    put2.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual2"),
      Bytes.toBytes("val2")); 

    boolean res3 = table.checkAndPut(Bytes.toBytes("row1"),
      Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), 
      Bytes.toBytes("val1"), put2);
    System.out.println("Put applied: " + res3); 

    Put put3 = new Put(Bytes.toBytes("row2"));
    put3.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
      Bytes.toBytes("val3")); 

    boolean res4 = table.checkAndPut(Bytes.toBytes("row1"),
      Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), 
      Bytes.toBytes("val1"), put3);
    System.out.println("Put applied: " + res4);

: Create a new Put instance.
: Check if the column does not exist and perform an optional put operation.
: Print out the result; it should be “Put applied: true.”
: Attempt to store the same cell again.
: Print out the result; it should be “Put applied: false”, as the column now already exists.
: Create another Put instance, but using a different column qualifier.
: Store new data only if the previous data has been saved.
: Print out the result; it should be “Put applied: true”, as the checked column already exists.
: Create yet another Put instance, but using a different row.
: Store new data while checking a different row.
: We will not get here, as an exception is thrown beforehand!

The last call in the example will throw the following error:

Exception in thread "main" org.apache.hadoop.hbase.DoNotRetryIOException: 
 Action's getRow must match the passed row

Warning

The compare-and-set operations provided by HBase rely on checking and modifying the same row! Like the other operations, this call provides atomicity guarantees only within a single row. Trying to check and modify two different rows will throw an exception.

Compare-and-set (CAS) operations are very powerful, especially in distributed systems, with even more decoupled client processes. In providing these calls, HBase sets itself apart from other architectures that give no means to reason about concurrent updates performed by multiple, independent clients.
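The read, process, and guarded write-back pattern described in this section can be sketched in plain Java. This is an illustration only: CheckAndPutSketch is a hypothetical in-memory stand-in that borrows the checkAndPut name and its null convention, the cell name is made up, and a synchronized method takes the place of the server-side atomicity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy model of a guarded update; NOT the HBase implementation.
class CheckAndPutSketch {
  private final Map<String, String> cells = new HashMap<>();

  // Atomically store newValue only if the current value equals expected.
  // A null expected value means "the cell must not exist yet".
  synchronized boolean checkAndPut(String cell, String expected, String newValue) {
    if (!Objects.equals(cells.get(cell), expected)) {
      return false;
    }
    cells.put(cell, newValue);
    return true;
  }

  synchronized String get(String cell) {
    return cells.get(cell);
  }

  // Read, compute, and write back; retry if another writer got in first.
  void deposit(String cell, long amount) {
    while (true) {
      String current = get(cell);
      long balance = (current == null) ? 0L : Long.parseLong(current);
      if (checkAndPut(cell, current, Long.toString(balance + amount))) {
        return;
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    CheckAndPutSketch table = new CheckAndPutSketch();
    Runnable worker = () -> {
      for (int i = 0; i < 1000; i++) {
        table.deposit("row1/colfam1:balance", 1);
      }
    };
    Thread t1 = new Thread(worker);
    Thread t2 = new Thread(worker);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    // No update is lost, so the two workers add up to 2000.
    System.out.println("Balance: " + table.get("row1/colfam1:balance"));
  }
}
```

The retry loop is the design choice worth noting: a failed check is not an error, it only signals that the value changed between the read and the write, so the client reads again and recomputes.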

Get Method

The next step in a client API is to retrieve what was just saved. For that, HTable provides you with the get() call and matching classes. The operations are split into those that operate on a single row and those that retrieve multiple rows in one call.

Single Gets

First, the method that is used to retrieve specific values from an HBase table:

Result get(Get get) throws IOException

Similar to the Put class for the put() call, there is a matching Get class used by the aforementioned get() function. As another similarity, you will have to provide a row key when creating an instance of Get, using one of these constructors:

Get(byte[] row)
Get(byte[] row, RowLock rowLock)

Note

A get() operation is bound to one specific row, but can retrieve any number of columns and/or cells contained therein.

Each constructor takes a row parameter specifying the row you want to access, while the second constructor adds an optional rowLock parameter, allowing you to hand in your own locks. And, similar to the put operations, you have methods to specify rather broad criteria to find what you are looking for—or to specify everything down to exact coordinates for a single cell:

Get addFamily(byte[] family)
Get addColumn(byte[] family, byte[] qualifier)
Get setTimeRange(long minStamp, long maxStamp) throws IOException
Get setTimeStamp(long timestamp)
Get setMaxVersions()
Get setMaxVersions(int maxVersions) throws IOException

The addFamily() call narrows the request down to the given column family. It can be called multiple times to add more than one family. The same is true for the addColumn() call. Here you can add an even narrower address space: the specific column. Then there are methods that let you set the exact timestamp you are looking for—or a time range to match those cells that fall inside it.

Lastly, there are methods that allow you to specify how many versions you want to retrieve, given that you have not set an exact timestamp. By default, this is set to 1, meaning that the get() call returns the most current match only. If you are in doubt, use getMaxVersions() to check what it is set to. The setMaxVersions() without a parameter sets the number of versions to return to Integer.MAX_VALUE—which is also the maximum number of versions you can configure in the column family descriptor, and therefore tells the API to return every available version of all matching cells (in other words, up to what is set at the column family level).

The Get class provides additional calls, which are listed in Table 3-4 for your perusal.

Table 3-4. Quick overview of additional methods provided by the Get class

Method	Description
`getRow()`	Returns the row key as specified when creating the `Get` instance.
`getRowLock()`	Returns the row `RowLock` instance for the current `Get` instance.
`getLockId()`	Returns the optional lock ID handed into the constructor using the `rowLock` parameter. Will be `-1L` if not set.
`getTimeRange()`	Retrieves the associated timestamp or time range of the `Get` instance. Note that there is no `getTimeStamp()` since the API converts a value assigned with `setTimeStamp()` into a `TimeRange` instance internally, setting the minimum and maximum values to the given timestamp.
`setFilter()/getFilter()`	Special filter instances can be used to select certain columns or cells, based on a wide variety of conditions. You can get and set them with these methods. See Filters for details.
`setCacheBlocks()/getCacheBlocks()`	Each HBase region server has a block cache that efficiently retains recently accessed data for subsequent reads of contiguous information. In some events it is better to not engage the cache to avoid too much churn when doing completely random gets. These methods give you control over this feature.
`numFamilies()`	Convenience method to retrieve the size of the family map, containing the families added using the `addFamily()` or `addColumn()` calls.
`hasFamilies()`	Another helper to check if a family—or column—has been added to the current instance of the `Get` class.
`familySet()/getFamilyMap()`	These methods give you access to the column families and specific columns, as added by the `addFamily()` and/or `addColumn()` calls. The family map is a map where the key is the family name and the value a list of added column qualifiers for this particular family. The `familySet()` returns the `Set` of all stored families, i.e., a set containing only the family names.

Note

The getters listed in Table 3-4 for the Get class only retrieve what you have set beforehand. They are rarely used, and make sense only when you, for example, prepare a Get instance in a private method in your code, and inspect the values in another place.

As mentioned earlier, HBase provides us with a helper class named Bytes that has many static methods to convert Java types into byte[] arrays. It also can do the same in reverse: as you are retrieving data from HBase—for example, one of the rows stored previously—you can make use of these helper functions to convert the byte[] data back into Java types. Here is a short list of what it offers, continued from the earlier discussion:

static String toString(byte[] b)
static boolean toBoolean(byte[] b)
static long toLong(byte[] bytes)
static float toFloat(byte[] bytes)
static int toInt(byte[] bytes)
...
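To make the round trip concrete without the HBase library on the classpath, here is a small sketch using only java.nio. The helper names are hypothetical, and the chosen encodings (8-byte big-endian for long values, UTF-8 for strings) are assumptions made for this illustration; use the Bytes class itself in real client code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Standard-library analog of a byte[] round trip; helper names are hypothetical.
class ByteRoundTripSketch {
  // Encode a long as 8 bytes (ByteBuffer defaults to big-endian).
  static byte[] encodeLong(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  static long decodeLong(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong();
  }

  static byte[] encodeString(String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }

  static String decodeString(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] raw = encodeLong(1300108258094L);
    System.out.println(raw.length + " bytes -> " + decodeLong(raw));
    System.out.println(decodeString(encodeString("val1")));
  }
}
```

The point to take away is that the stored bytes carry no type information: the reader has to know which conversion was used when the value was written.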

Example 3-8 shows how this is all put together.

Example 3-8. Application retrieving data from HBase

    Configuration conf = HBaseConfiguration.create(); 
    HTable table = new HTable(conf, "testtable"); 
    Get get = new Get(Bytes.toBytes("row1")); 
    get.addColumn(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1")); 
    Result result = table.get(get); 
    byte[] val = result.getValue(Bytes.toBytes("colfam1"),
      Bytes.toBytes("qual1")); 
    System.out.println("Value: " + Bytes.toString(val));

: Create the configuration.
: Instantiate a new table reference.
: Create a Get with a specific row.
: Add a column to the Get.
: Retrieve a row with selected columns from HBase.
: Get a specific value for the given column.
: Print out the value while converting it back.

If you are running this example after, say, Example 3-2, you should get this as the output:

Value: val1

The output is not very spectacular, but it shows that the basic operation works. The example also only adds the specific column to retrieve, relying on the default for maximum versions being returned set to 1. The call to get() returns an instance of the Result class, which you will learn about next.

The Result class

When you retrieve data using the get() calls, you receive an instance of the Result class that contains all the matching cells. It provides you with the means to access everything that was returned from the server for the given row and matching the specified query, such as column family, column qualifier, timestamp, and so on.

There are utility methods you can use to ask for specific results—just as Example 3-8 used earlier—using more concrete dimensions. If you have, for example, asked the server to return all columns of one specific column family, you can now ask for specific columns within that family. In other words, you need to call get() with just enough concrete information to be able to process the matching data on the client side. The functions provided are:

byte[] getValue(byte[] family, byte[] qualifier)
byte[] value()
byte[] getRow()
int size()
boolean isEmpty()
KeyValue[] raw()
List<KeyValue> list()

The getValue() call allows you to get the data for a specific cell stored in HBase. As you cannot specify what timestamp—in other words, version—you want, you get the newest one. The value() call makes this even easier by returning the data for the newest cell in the first column found. Since columns are also sorted lexicographically on the server, this would return the value of the column with the column name (including family and qualifier) sorted first.

You saw getRow() before: it returns the row key, as specified when creating the current instance of the Get class. size() returns the number of KeyValue instances the server has returned. You may use this call, or isEmpty(), which checks whether size() returns zero, to check in your own client code if the retrieval call returned any matches.

Access to the raw, low-level KeyValue instances is provided by the raw() method, returning the array of KeyValue instances backing the current Result instance. The list() call simply converts the array returned by raw() into a List instance, giving you convenience by providing iterator access, for example. The created list is backed by the original array of KeyValue instances.

Note

The array returned by raw() is already lexicographically sorted, taking the full coordinates of the KeyValue instances into account. So it is sorted first by column family, then within each family by qualifier, then by timestamp, and finally by type.
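The ordering can be illustrated with a plain Java comparator. This is a simplified sketch, not the KeyValue comparator: Cell is a hypothetical class, names are compared as strings rather than raw bytes, the type dimension is left out, and placing newer timestamps first within a column is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Simplified ordering model; NOT the HBase KeyValue comparator.
class CellOrderSketch {
  static final class Cell {
    final String family;
    final String qualifier;
    final long timestamp;

    Cell(String family, String qualifier, long timestamp) {
      this.family = family;
      this.qualifier = qualifier;
      this.timestamp = timestamp;
    }

    @Override
    public String toString() {
      return family + ":" + qualifier + "/" + timestamp;
    }
  }

  // Family first, then qualifier, then timestamp with newer versions first (assumption).
  static final Comparator<Cell> ORDER =
      Comparator.comparing((Cell c) -> c.family)
          .thenComparing(c -> c.qualifier)
          .thenComparing((a, b) -> Long.compare(b.timestamp, a.timestamp));

  // Return a sorted copy and leave the input untouched.
  static List<Cell> sorted(List<Cell> cells) {
    List<Cell> copy = new ArrayList<>(cells);
    copy.sort(ORDER);
    return copy;
  }

  public static void main(String[] args) {
    List<Cell> cells = Arrays.asList(
        new Cell("colfam2", "a", 5L),
        new Cell("colfam1", "b", 1L),
        new Cell("colfam1", "a", 1L),
        new Cell("colfam1", "a", 9L));
    // [colfam1:a/9, colfam1:a/1, colfam1:b/1, colfam2:a/5]
    System.out.println(sorted(cells));
  }
}
```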

Another set of accessors is provided which are more column-oriented:

List<KeyValue> getColumn(byte[] family, byte[] qualifier)
KeyValue getColumnLatest(byte[] family, byte[] qualifier)
boolean containsColumn(byte[] family, byte[] qualifier)

Here you ask for multiple values of a specific column, which solves the issue pointed out earlier, that is, how to get multiple versions of a given column. The number returned obviously is bound to the maximum number of versions you have specified when configuring the Get instance, before the call to get(), with the default being set to 1. In other words, the returned list contains zero (in case the column has no value for the given row) or one entry, which is the newest version of the value. If you have specified a value greater than the default of 1 version to be returned, it could be any number, up to the specified maximum.

The getColumnLatest() method is returning the newest cell of the specified column, but in contrast to getValue(), it does not return the raw byte array of the value but the full KeyValue instance instead. This may be useful when you need more than just the data. The containsColumn() is a convenience method to check if there was any cell returned in the specified column.

Note

These methods all support the fact that the qualifier can be left unspecified—setting it to null—and therefore matching the special column with no name.

Using no qualifier means that there is no label to the column. When looking at the table from, for example, the HBase Shell, you need to know what it contains. A rare case where you might want to consider using the empty qualifier is in column families that only ever contain a single column. Then the family name might indicate its purpose.

There is a third set of methods that provide access to the returned data from the get request. These are map-oriented and look like this:

NavigableMap<byte[], NavigableMap<byte[], 
  NavigableMap<Long, byte[]>>> getMap()
NavigableMap<byte[], 
  NavigableMap<byte[], byte[]>> getNoVersionMap()
NavigableMap<byte[], byte[]> getFamilyMap(byte[] family)

The most generic call, named getMap(), returns the entire result set in a Java Map class instance that you can iterate over to access all the values. The getNoVersionMap() does the same while only including the latest cell for each column. Finally, the getFamilyMap() lets you select the data for a specific column family only; as its signature indicates, it maps each column qualifier to a single value.

Use whichever access method of Result matches your access pattern; the data has already been moved across the network from the server to your client process, so it is not incurring any other performance or resource penalties.
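The shape of these nested maps is easier to see in code. The sketch below builds a small family, qualifier, timestamp structure by hand and collapses the version dimension, roughly mirroring what the getMap() and getNoVersionMap() signatures describe. It is a model under assumptions: String keys and values replace byte[] for readability, the class and sample data are hypothetical, and ordering versions newest first is an assumption of this example.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hand-built model of the nested result maps; NOT produced by HBase.
class ResultMapSketch {
  // family -> qualifier -> timestamp -> value, newest timestamp first (assumption).
  static NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> sampleMap() {
    NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> map = new TreeMap<>();
    add(map, "colfam1", "qual1", 1L, "val1-old");
    add(map, "colfam1", "qual1", 2L, "val1-new");
    add(map, "colfam1", "qual2", 1L, "val2");
    return map;
  }

  static void add(NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> map,
      String family, String qualifier, long timestamp, String value) {
    map.computeIfAbsent(family, f -> new TreeMap<>())
        .computeIfAbsent(qualifier, q -> new TreeMap<>(Comparator.<Long>reverseOrder()))
        .put(timestamp, value);
  }

  // Collapse the version dimension: keep only the newest value per column.
  static NavigableMap<String, NavigableMap<String, String>> noVersionMap(
      NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> map) {
    NavigableMap<String, NavigableMap<String, String>> out = new TreeMap<>();
    for (Map.Entry<String, NavigableMap<String, NavigableMap<Long, String>>> fam
        : map.entrySet()) {
      NavigableMap<String, String> columns = new TreeMap<>();
      for (Map.Entry<String, NavigableMap<Long, String>> col : fam.getValue().entrySet()) {
        columns.put(col.getKey(), col.getValue().firstEntry().getValue());
      }
      out.put(fam.getKey(), columns);
    }
    return out;
  }

  public static void main(String[] args) {
    // {colfam1={qual1={2=val1-new, 1=val1-old}, qual2={1=val2}}}
    System.out.println(sampleMap());
    // {colfam1={qual1=val1-new, qual2=val2}}
    System.out.println(noVersionMap(sampleMap()));
  }
}
```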

All Java objects have a toString() method, which, when overridden by a class, can be used to convert the data of an instance into a text representation. This is not for serialization purposes, but is most often used for debugging.

The Result class has such an implementation of toString(), dumping the result of a read call as a string. The output looks like this:

keyvalues={row-2/colfam1:col-5/1300802024293/Put/vlen=7, 
           row-2/colfam2:col-33/1300802024325/Put/vlen=8}

It simply prints all contained KeyValue instances, that is, it calls KeyValue.toString() on each of them. If the Result instance is empty, the output will be:

keyvalues=NONE

This indicates that there were no KeyValue instances returned. The code examples in this book make use of the toString() method to quickly print the results of previous read operations.

List of Gets

Another similarity to the put() calls is that you can ask for more than one row using a single request. This allows you to quickly and efficiently retrieve related—but also completely random, if required—data from the remote servers.

Note

As shown in Figure 3-1, the request may actually go to more than one server, but for all intents and purposes, it looks like a single call from the client code.

The method provided by the API has the following signature:

Result[] get(List<Get> gets) throws IOException

Using this call is straightforward, with the same approach as seen earlier: you need to create a list that holds all instances of the Get class you have prepared. This list is handed into the call and you will be returned an array of equal size holding the matching Result instances. Example 3-9 brings this together, showing two different approaches to accessing the data.

Example 3-9. Retrieving data from HBase using lists of Get instances

    byte[] cf1 = Bytes.toBytes("colfam1");
    byte[] qf1 = Bytes.toBytes("qual1");
    byte[] qf2 = Bytes.toBytes("qual2"); 
    byte[] row1 = Bytes.toBytes("row1");
    byte[] row2 = Bytes.toBytes("row2");

    List<Get> gets = new ArrayList<Get>();  

    Get get1 = new Get(row1);
    get1.addColumn(cf1, qf1);
    gets.add(get1);

    Get get2 = new Get(row2);
    get2.addColumn(cf1, qf1); 
    gets.add(get2);

    Get get3 = new Get(row2);
    get3.addColumn(cf1, qf2);
    gets.add(get3);

    Result[] results = table.get(gets); 

    System.out.println("First iteration...");
    for (Result result : results) {
      String row = Bytes.toString(result.getRow());
      System.out.print("Row: " + row + " ");
      byte[] val = null;
      if (result.containsColumn(cf1, qf1)) { 
        val = result.getValue(cf1, qf1);
        System.out.println("Value: " + Bytes.toString(val));
      }
      if (result.containsColumn(cf1, qf2)) {
        val = result.getValue(cf1, qf2);
        System.out.println("Value: " + Bytes.toString(val));
      }
    }

    System.out.println("Second iteration...");
    for (Result result : results) {
      for (KeyValue kv : result.raw()) {
        System.out.println("Row: " + Bytes.toString(kv.getRow()) + 
          " Value: " + Bytes.toString(kv.getValue()));
      }
    }

: Prepare commonly used byte arrays.
: Create a list that holds the Get instances.
: Add the Get instances to the list.
: Retrieve rows with selected columns from HBase.
: Iterate over the results and check what values are available.
: Iterate over the results again, printing out all values.

Assuming that you execute Example 3-4 just before you run Example 3-9, you should see something like this on the command line:

First iteration...
Row: row1 Value: val1
Row: row2 Value: val2
Row: row2 Value: val3
Second iteration...
Row: row1 Value: val1
Row: row2 Value: val2
Row: row2 Value: val3

Both iterations return the same values, showing that you have a number of choices on how to access them, once you have received the results. What you have not yet seen is how errors are reported back to you. This differs from what you learned in List of Puts. The get() call either returns the said array, which has the same size as the list passed in the gets parameter, or throws an exception. Example 3-10 showcases this behavior.

Example 3-10. Trying to read an erroneous column family

    List<Get> gets = new ArrayList<Get>();

    Get get1 = new Get(row1);
    get1.addColumn(cf1, qf1);
    gets.add(get1);

    Get get2 = new Get(row2);
    get2.addColumn(cf1, qf1); 
    gets.add(get2);

    Get get3 = new Get(row2);
    get3.addColumn(cf1, qf2);
    gets.add(get3);

    Get get4 = new Get(row2);
    get4.addColumn(Bytes.toBytes("BOGUS"), qf2);
    gets.add(get4); 

    Result[] results = table.get(gets); 

    System.out.println("Result count: " + results.length);

: Add the Get instances to the list.
: Add the bogus column family Get.
: An exception is thrown and the process is aborted.
: This line will never be reached!

Executing this example will abort the entire get() operation, throwing the following (or similar) error, and not returning a result at all:

org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: 
 Failed 1 action: NoSuchColumnFamilyException: 1 time, 
 servers with issues: 10.0.0.57:51640,

One way to have more control over how the API handles partial faults is to use the batch() operations discussed in Batch Operations.

Related retrieval methods

There are a few more calls that you can use from your code to retrieve or check your stored data. The first is:

boolean exists(Get get) throws IOException

You can set up a Get instance, just like you do when using the get() calls of HTable. Instead of having to retrieve the data from the remote servers, using an RPC, to verify that it actually exists, you can employ this call because it only returns a boolean flag indicating that same fact.

Note

Using exists() involves the same lookup semantics on the region servers, including loading file blocks to check if a row or column actually exists. You only avoid shipping the data over the network—but that is very useful if you are checking very large columns, or do so very frequently.

Sometimes it might be necessary to find a specific row, or the one just before the requested row, when retrieving data. The following call can help you find a row using these semantics:

Result getRowOrBefore(byte[] row, byte[] family) throws IOException

You need to specify the row you are looking for, and a column family. The latter is required because, in HBase, which is a column-oriented database, there is no row if there are no columns. Specifying a family name tells the servers to check if the row searched for has any values in a column contained in the given family.

Warning

Be careful to specify an existing column family name when using the getRowOrBefore() method, or you will get a Java NullPointerException back from the server. This is caused by the server trying to access a nonexistent storage file.

The returned instance of the Result class can be used to retrieve the found row key. This should be either the exact row you were asking for, or the one preceding it. If there is no match at all, the call returns null. Example 3-11 uses the call to find the rows you created using the put examples earlier.

Example 3-11. Using a special retrieval method

    Result result1 = table.getRowOrBefore(Bytes.toBytes("row1"), 
      Bytes.toBytes("colfam1"));
    System.out.println("Found: " + Bytes.toString(result1.getRow())); 

    Result result2 = table.getRowOrBefore(Bytes.toBytes("row99"), 
      Bytes.toBytes("colfam1"));
    System.out.println("Found: " + Bytes.toString(result2.getRow())); 

    for (KeyValue kv : result2.raw()) {
      System.out.println("  Col: " + Bytes.toString(kv.getFamily()) + 
        "/" + Bytes.toString(kv.getQualifier()) +
        ", Value: " + Bytes.toString(kv.getValue()));
    }

    Result result3 = table.getRowOrBefore(Bytes.toBytes("abc"), 
      Bytes.toBytes("colfam1"));
    System.out.println("Found: " + result3);

: Attempt to find an existing row.
: Print what was found.
: Attempt to find a nonexistent row.
: Returns the row that was sorted at the end of the table.
: Print the returned values.
: Attempt to find a row before the test rows.
: Should return “null” since there is no match.

Assuming you ran Example 3-4 just before this code, you should see output similar or equal to the following:

Found: row1
Found: row2
  Col: colfam1/qual1, Value: val2
  Col: colfam1/qual2, Value: val3
Found: null

The first call tries to find a matching row and succeeds. The second call uses a large number postfix to find the last stored row starting with the prefix row. It found row2 accordingly. Lastly, the example tries to find row abc, which sorts before the rows the put example added using the row prefix, and therefore neither exists nor has any preceding row key. The returned result is then null and indicates the missed lookup.

What is interesting is the loop to print out the data that was returned along with the matching row. You can see from the preceding code that all columns of the specified column family were returned, including their latest values. You could use this call to quickly retrieve all the latest values from an entire column family (in other words, all columns contained in the given column family) based on a specific sorting pattern. For example, the put() examples use row as the prefix for all keys. Calling getRowOrBefore() with a row set to row999999999 will then return the row that is, based on the lexicographical sorting, placed at the end of the table.
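The lookup semantics can be modeled in plain Java with a sorted map. The following is only a sketch and not the HBase client API: the class RowOrBeforeModel and its methods are hypothetical stand-ins, and the model assumes ASCII row keys so that Java's String ordering agrees with the byte-wise lexicographical ordering of row keys. TreeMap.floorKey() returns the greatest key less than or equal to its argument, which corresponds to "the exact row, or the one just before it."

```java
import java.util.TreeMap;

// In-memory model of "row or the one before it" lookups; not the HBase API.
class RowOrBeforeModel {
  // Row keys are kept in sorted order, as rows in a table are sorted by key.
  private final TreeMap<String, String> rows = new TreeMap<>();

  void put(String rowKey, String value) {
    rows.put(rowKey, value);
  }

  // Returns the matching row key, the closest preceding one, or null.
  String rowOrBefore(String rowKey) {
    return rows.floorKey(rowKey);
  }

  public static void main(String[] args) {
    RowOrBeforeModel table = new RowOrBeforeModel();
    table.put("row1", "val1");
    table.put("row2", "val2");
    System.out.println("Found: " + table.rowOrBefore("row1"));
    System.out.println("Found: " + table.rowOrBefore("row99"));
    System.out.println("Found: " + table.rowOrBefore("abc"));
  }
}
```

In this model, "row99" resolves to "row2" because the character 2 sorts before 9, and "abc" resolves to null because no stored key sorts at or before it.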

Delete Method

You are now able to create, read, and update data in HBase tables. What is left is the ability to delete data from them. You may have guessed by now that HTable provides a method of exactly that name, along with a matching class aptly named Delete.

Single Deletes

The variant of the delete() call that takes a single Delete instance is:

void delete(Delete delete) throws IOException

Just as with the get() and put() calls you saw already, you will have to create a Delete instance and then add details about the data you want to remove. The constructors are:

Delete(byte[] row)
Delete(byte[] row, long timestamp, RowLock rowLock)

You need to provide the row you want to modify. With the second constructor you also provide a timestamp and a rowLock, an instance of RowLock that specifies your own lock details in case you want to modify the same row more than once subsequently. In either case, you would be wise to narrow down what you want to remove from the given row, using one of the following methods:

Delete deleteFamily(byte[] family)
Delete deleteFamily(byte[] family, long timestamp)
Delete deleteColumns(byte[] family, byte[] qualifier)
Delete deleteColumns(byte[] family, byte[] qualifier, long timestamp)
Delete deleteColumn(byte[] family, byte[] qualifier)
Delete deleteColumn(byte[] family, byte[] qualifier, long timestamp)
void setTimestamp(long timestamp)

You do have a choice to narrow in on what to remove using four types of calls. First you can use the deleteFamily() methods to remove an entire column family, including all contained columns. You have the option to specify a timestamp that triggers more specific filtering of cell versions. If specified, the timestamp matches the same and all older versions of all columns.

The next type is deleteColumns(), which operates on exactly one column and deletes either all versions of that cell when no timestamp is given, or all matching and older versions when a timestamp is specified.

The third type is similar, using deleteColumn(). It also operates on a specific, given column only, but deletes either the most current or the specified version, that is, the one with the matching timestamp.

Finally, there is setTimestamp(), which is not considered when using any of the other three types of calls. But if you do not specify either a family or a column, this call can make the difference between deleting the entire row or just all contained columns, in all column families, that match or have an older timestamp compared to the given one. Table 3-5 shows the functionality in a matrix to make the semantics more readable.

Table 3-5. Functionality matrix of the delete() calls

Method	Deletes without timestamp	Deletes with timestamp
`none`	Entire row, i.e., all columns, all versions.	All versions of all columns in all column families, whose timestamp is equal to or older than the given timestamp.
`deleteColumn()`	Only the latest version of the given column; older versions are kept.	Only exactly the specified version of the given column, with the matching timestamp. If nonexistent, nothing is deleted.
`deleteColumns()`	All versions of the given column.	Versions equal to or older than the given timestamp of the given column.
`deleteFamily()`	All columns (including all versions) of the given family.	Versions equal to or older than the given timestamp of all columns of the given family.
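The column-level rows of the matrix can be tried out with a small in-memory model. This is a sketch under stated assumptions, not the HBase API: ColumnVersionsModel and its four methods are hypothetical names, one instance stands for the versions of a single column, and the model removes entries directly without regard to how the servers store deletes internally.

```java
import java.util.TreeMap;

// In-memory model of one column's versions; not the HBase API.
class ColumnVersionsModel {
  // Maps timestamp (version) to value, sorted in ascending order.
  final TreeMap<Long, String> versions = new TreeMap<>();

  // Models deleteColumn() without a timestamp: drop only the latest version.
  void deleteLatest() {
    versions.pollLastEntry();
  }

  // Models deleteColumn() with a timestamp: drop exactly that version, if present.
  void deleteExact(long timestamp) {
    versions.remove(timestamp);
  }

  // Models deleteColumns() without a timestamp: drop every version.
  void deleteAll() {
    versions.clear();
  }

  // Models deleteColumns() with a timestamp: drop that version and all older ones.
  void deleteUpTo(long timestamp) {
    versions.headMap(timestamp, true).clear();
  }

  public static void main(String[] args) {
    ColumnVersionsModel column = new ColumnVersionsModel();
    column.versions.put(10L, "val10");
    column.versions.put(20L, "val20");

    column.deleteExact(15L);  // no version 15 exists, so nothing changes
    System.out.println(column.versions.keySet());

    column.deleteUpTo(15L);   // removes version 10, keeps version 20
    System.out.println(column.versions.keySet());
  }
}
```

The family-level and row-level calls follow the same two patterns (everything, or everything up to a timestamp) applied across more columns.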

The Delete class provides additional calls, which are listed in Table 3-6 for your reference.

Table 3-6. Quick overview of additional methods provided by the Delete class

Method	Description
`getRow()`	Returns the row key as specified when creating the `Delete` instance.
`getRowLock()`	Returns the row `RowLock` instance for the current `Delete` instance.
`getLockId()`	Returns the optional lock ID handed into the constructor using the `rowLock` parameter. Will be `-1L` if not set.
`getTimeStamp()`	Retrieves the associated timestamp of the `Delete` instance.
`isEmpty()`	Checks if the family map contains any entries. In other words, if you specified any column family, or column qualifier, that should be deleted.
`getFamilyMap()`	Gives you access to the added column families and specific columns, as added by the `deleteFamily()` and/or `deleteColumn()/deleteColumns()` calls. The returned map uses the family name as the key, and the value it points to is a list of added column qualifiers for this particular family.

Example 3-12 shows how to use the single delete() call from client code.

Example 3-12. Application deleting data from HBase

    Delete delete = new Delete(Bytes.toBytes("row1")); 

    delete.setTimestamp(1); 


    delete.deleteColumn(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), 1); 

    delete.deleteColumns(Bytes.toBytes("colfam2"), Bytes.toBytes("qual1")); 
    delete.deleteColumns(Bytes.toBytes("colfam2"), Bytes.toBytes("qual3"), 15); 

    delete.deleteFamily(Bytes.toBytes("colfam3")); 
    delete.deleteFamily(Bytes.toBytes("colfam3"), 3); 

    table.delete(delete); 

    table.close();

: Create a Delete with a specific row.
: Set a timestamp for row deletes.
: Delete a specific version in one column.
: Delete all versions in one column.
: Delete the given and all older versions in one column.
: Delete the entire family, all columns and versions.
: Delete the given and all older versions in the entire column family, that is, from all columns therein.
: Delete the data from the HBase table.

The example lists all the different calls you can use to parameterize the delete() operation. It does not make too much sense to call them all one after another like this. Feel free to comment out the various delete calls to see what is printed on the console.

Setting the timestamp for deleteColumn() has the effect of matching only the exact cell, that is, the matching column with the exact timestamp. On the other hand, not setting the timestamp forces the server to look up the latest timestamp on your behalf. This is slower than performing a delete with an explicit timestamp.

If you attempt to delete a cell with a timestamp that does not exist, nothing happens. For example, given that you have two versions of a column, one at version 10 and one at version 20, deleting from this column with version 15 will not affect either existing version.

Another note to be made about the example is that it showcases custom versioning. Instead of relying on timestamps, implicit or explicit ones, it uses sequential numbers, starting with 1. This is perfectly valid, although you are forced to always set the version yourself, since the servers do not know about your schema and would use epoch-based timestamps instead.

Warning

As of this writing, using custom versioning is not recommended. It will very likely work, but is not tested very well. Make sure you carefully evaluate your options before using this technique.

Another example of using custom versioning can be found in Search Integration.

List of Deletes

The list-based delete() call works very similarly to the list-based put(). You need to create a list of Delete instances, configure them, and call the following method:

void delete(List<Delete> deletes) throws IOException

Example 3-13 shows where three different rows are affected during the operation, deleting various details they contain. When you run this example, you will see a printout of the before and after states of the delete. The output is printing the raw KeyValue instances, using KeyValue.toString().

Note

Just as with the other list-based operation, you cannot make any assumption regarding the order in which the deletes are applied on the remote servers. The API is free to reorder them to make efficient use of the single RPC per affected region server. If you need to enforce specific orders of how operations are applied, you would need to batch those calls into smaller groups and ensure that they contain the operations in the desired order across the batches. In a worst-case scenario, you would need to send separate delete calls altogether.

Example 3-13. Application deleting a list of values

    List<Delete> deletes = new ArrayList<Delete>(); 

    Delete delete1 = new Delete(Bytes.toBytes("row1"));
    delete1.setTimestamp(4); 
    deletes.add(delete1);

    Delete delete2 = new Delete(Bytes.toBytes("row2"));
    delete2.deleteColumn(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1")); 
    delete2.deleteColumns(Bytes.toBytes("colfam2"), Bytes.toBytes("qual3"), 5); 
    deletes.add(delete2);

    Delete delete3 = new Delete(Bytes.toBytes("row3"));
    delete3.deleteFamily(Bytes.toBytes("colfam1")); 
    delete3.deleteFamily(Bytes.toBytes("colfam2"), 3); 
    deletes.add(delete3);

    table.delete(deletes); 

    table.close();

: Create a list that holds the Delete instances.
: Set a timestamp for row deletes.
: Delete the latest version only in one column.
: Delete the given and all older versions in another column.
: Delete the entire family, all columns and versions.
: Delete the given and all older versions in the entire column family, that is, from all columns therein.
: Delete the data from multiple rows in the HBase table.

The output you should see is:^[54]

Before delete call...
KV: row1/colfam1:qual1/2/Put/vlen=4, Value: val2
KV: row1/colfam1:qual1/1/Put/vlen=4, Value: val1
KV: row1/colfam1:qual2/4/Put/vlen=4, Value: val4
KV: row1/colfam1:qual2/3/Put/vlen=4, Value: val3
KV: row1/colfam1:qual3/6/Put/vlen=4, Value: val6
KV: row1/colfam1:qual3/5/Put/vlen=4, Value: val5

KV: row1/colfam2:qual1/2/Put/vlen=4, Value: val2
KV: row1/colfam2:qual1/1/Put/vlen=4, Value: val1
KV: row1/colfam2:qual2/4/Put/vlen=4, Value: val4
KV: row1/colfam2:qual2/3/Put/vlen=4, Value: val3
KV: row1/colfam2:qual3/6/Put/vlen=4, Value: val6
KV: row1/colfam2:qual3/5/Put/vlen=4, Value: val5

KV: row2/colfam1:qual1/2/Put/vlen=4, Value: val2
KV: row2/colfam1:qual1/1/Put/vlen=4, Value: val1
KV: row2/colfam1:qual2/4/Put/vlen=4, Value: val4
KV: row2/colfam1:qual2/3/Put/vlen=4, Value: val3
KV: row2/colfam1:qual3/6/Put/vlen=4, Value: val6
KV: row2/colfam1:qual3/5/Put/vlen=4, Value: val5

KV: row2/colfam2:qual1/2/Put/vlen=4, Value: val2
KV: row2/colfam2:qual1/1/Put/vlen=4, Value: val1
KV: row2/colfam2:qual2/4/Put/vlen=4, Value: val4
KV: row2/colfam2:qual2/3/Put/vlen=4, Value: val3
KV: row2/colfam2:qual3/6/Put/vlen=4, Value: val6
KV: row2/colfam2:qual3/5/Put/vlen=4, Value: val5

KV: row3/colfam1:qual1/2/Put/vlen=4, Value: val2
KV: row3/colfam1:qual1/1/Put/vlen=4, Value: val1
KV: row3/colfam1:qual2/4/Put/vlen=4, Value: val4
KV: row3/colfam1:qual2/3/Put/vlen=4, Value: val3
KV: row3/colfam1:qual3/6/Put/vlen=4, Value: val6
KV: row3/colfam1:qual3/5/Put/vlen=4, Value: val5

KV: row3/colfam2:qual1/2/Put/vlen=4, Value: val2
KV: row3/colfam2:qual1/1/Put/vlen=4, Value: val1
KV: row3/colfam2:qual2/4/Put/vlen=4, Value: val4
KV: row3/colfam2:qual2/3/Put/vlen=4, Value: val3
KV: row3/colfam2:qual3/6/Put/vlen=4, Value: val6
KV: row3/colfam2:qual3/5/Put/vlen=4, Value: val5

After delete call...
KV: row1/colfam1:qual3/6/Put/vlen=4, Value: val6
KV: row1/colfam1:qual3/5/Put/vlen=4, Value: val5

KV: row1/colfam2:qual3/6/Put/vlen=4, Value: val6
KV: row1/colfam2:qual3/5/Put/vlen=4, Value: val5

KV: row2/colfam1:qual1/1/Put/vlen=4, Value: val1
KV: row2/colfam1:qual2/4/Put/vlen=4, Value: val4
KV: row2/colfam1:qual2/3/Put/vlen=4, Value: val3
KV: row2/colfam1:qual3/6/Put/vlen=4, Value: val6
KV: row2/colfam1:qual3/5/Put/vlen=4, Value: val5

KV: row2/colfam2:qual1/2/Put/vlen=4, Value: val2
KV: row2/colfam2:qual1/1/Put/vlen=4, Value: val1
KV: row2/colfam2:qual2/4/Put/vlen=4, Value: val4
KV: row2/colfam2:qual2/3/Put/vlen=4, Value: val3
KV: row2/colfam2:qual3/6/Put/vlen=4, Value: val6

KV: row3/colfam2:qual2/4/Put/vlen=4, Value: val4
KV: row3/colfam2:qual3/6/Put/vlen=4, Value: val6
KV: row3/colfam2:qual3/5/Put/vlen=4, Value: val5

The Before delete call... block shows the original data; the cells missing from the After delete call... block are the ones that were deleted. All three rows contain the same data, composed of two column families, three columns in each family, and two versions for each column.

The example code first deletes, from the entire row1, everything up to and including version 4. This leaves the columns with versions 5 and 6 as the remainder of the row content.

It then uses the two different column-related delete calls on row2 to remove the newest cell in the column named colfam1:qual1, and subsequently every cell with a version of 5 or older (in other words, those with the same or a lower version number) from colfam2:qual3. Here only one cell matches, and it is removed as expected.

Lastly, operating on row3, the code removes the entire column family colfam1, and then everything with a version of 3 or less from colfam2. During the execution of the example code, you will see the printed KeyValue details, using something like this:

System.out.println("KV: " + kv.toString() +
  ", Value: " + Bytes.toString(kv.getValue()));

By now you are familiar with the usage of the Bytes class, which is used to print out the value of the KeyValue instance, as returned by the getValue() method. This is necessary because the KeyValue.toString() output (as explained in The KeyValue class) prints only the key part, not the actual value. The toString() method omits the value since it could be very large.

Here, the example code inserts the column values, and therefore knows that these are short and human-readable; hence it is safe to print them out on the console as shown. You could use the same mechanism in your own code for debugging purposes.

Please refer to the entire example code in the accompanying source code repository for this book. You will see how the data is inserted and retrieved to generate the discussed output.

What is left to talk about is the error handling of the list-based delete() call. The handed-in deletes parameter, that is, the list of Delete instances, is modified to only contain the failed delete instances when the call returns. In other words, when everything has succeeded, the list will be empty. The call also throws the exception—if there was one—reported from the remote servers. You will have to guard the call using a try/catch, for example, and react accordingly. Example 3-14 may serve as a starting point.

Example 3-14. Deleting faulty data from HBase

    Delete delete4 = new Delete(Bytes.toBytes("row2"));
    delete4.deleteColumn(Bytes.toBytes("BOGUS"), Bytes.toBytes("qual1")); 
    deletes.add(delete4);

    try {
      table.delete(deletes); 
    } catch (Exception e) {
      System.err.println("Error: " + e); 
    }
    table.close();

    System.out.println("Deletes length: " + deletes.size()); 
    for (Delete delete : deletes) {
      System.out.println(delete); 
    }

: Add the bogus column family to trigger an error.
: Delete the data from multiple rows in the HBase table.
: Guard against remote exceptions.
: Check the length of the list after the call.
: Print out the failed delete for debugging purposes.

Example 3-14 modifies Example 3-13 but adds an erroneous delete detail: it inserts a BOGUS column family name. The output is the same as that for Example 3-13, but has some additional details printed out in the middle part:

Before delete call...
KV: row1/colfam1:qual1/2/Put/vlen=4, Value: val2
KV: row1/colfam1:qual1/1/Put/vlen=4, Value: val1
...
KV: row3/colfam2:qual3/6/Put/vlen=4, Value: val6
KV: row3/colfam2:qual3/5/Put/vlen=4, Value: val5

Error: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: 
  Failed 1 action: NoSuchColumnFamilyException: 1 time, 
  servers with issues: 10.0.0.43:59057, 
Deletes length: 1
row=row2, ts=9223372036854775807, families={(family=BOGUS, keyvalues= 
  (row2/BOGUS:qual1/9223372036854775807/Delete/vlen=0)}

After delete call...
KV: row1/colfam1:qual3/6/Put/vlen=4, Value: val6
KV: row1/colfam1:qual3/5/Put/vlen=4, Value: val5
...
KV: row3/colfam2:qual3/6/Put/vlen=4, Value: val6
KV: row3/colfam2:qual3/5/Put/vlen=4, Value: val5

As expected, the list contains one remaining Delete instance: the one with the bogus column family. Printing out the instance—Java uses the implicit toString() method when printing an object—reveals the internal details of the failed delete. The important part is the family name being the obvious reason for the failure. You can use this technique in your own code to check why an operation has failed. Often the reasons are rather obvious indeed.

Finally, note the exception that was caught and printed out in the catch statement of the example. It is the same RetriesExhaustedWithDetailsException you saw twice already. It reports the number of failed actions, how often the client retried them, and on which server. An advanced task that you will learn about in later chapters is how to verify and monitor servers, so that the given server address can help you find the root cause of the failure.

Atomic compare-and-delete

You saw in Atomic compare-and-set how to use an atomic, conditional operation to insert data into a table. There is an equivalent call for deletes that gives you access to server-side, read-and-modify functionality:

boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier,
  byte[] value, Delete delete) throws IOException

You need to specify the row key, column family, qualifier, and value to check before the actual delete operation is performed. Should the test fail, nothing is deleted and the call returns a false. If the check is successful, the delete is applied and true is returned. Example 3-15 shows this in context.

Example 3-15. Application deleting values using the atomic compare-and-set operations

    Delete delete1 = new Delete(Bytes.toBytes("row1"));
    delete1.deleteColumns(Bytes.toBytes("colfam1"), Bytes.toBytes("qual3")); 

    boolean res1 = table.checkAndDelete(Bytes.toBytes("row1"),
      Bytes.toBytes("colfam2"), Bytes.toBytes("qual3"), null, delete1); 
    System.out.println("Delete successful: " + res1); 

    Delete delete2 = new Delete(Bytes.toBytes("row1"));
    delete2.deleteColumns(Bytes.toBytes("colfam2"), Bytes.toBytes("qual3")); 
    table.delete(delete2);

    boolean res2 = table.checkAndDelete(Bytes.toBytes("row1"),
      Bytes.toBytes("colfam2"), Bytes.toBytes("qual3"), null, delete1); 
    System.out.println("Delete successful: " + res2); 

    Delete delete3 = new Delete(Bytes.toBytes("row2"));
    delete3.deleteFamily(Bytes.toBytes("colfam1")); 

    try{
      boolean res4 = table.checkAndDelete(Bytes.toBytes("row1"),
        Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), 
        Bytes.toBytes("val1"), delete3);
      System.out.println("Delete successful: " + res4); 
    } catch (Exception e) {
      System.err.println("Error: " + e);
    }

: Create a new Delete instance.
: Check if the column does not exist and perform an optional delete operation.
: Print out the result; it should be “Delete successful: false.”
: Delete the checked column manually.
: Attempt to delete the same cell again.
: Print out the result; it should be “Delete successful: true,” as the checked column no longer exists.
: Create yet another Delete instance, but using a different row.
: Try to delete it while checking a different row.
: We will not get here, as an exception is thrown beforehand!

The entire output of the example should look like this:

Before delete call...
KV: row1/colfam1:qual1/2/Put/vlen=4, Value: val2
KV: row1/colfam1:qual1/1/Put/vlen=4, Value: val1
KV: row1/colfam1:qual2/4/Put/vlen=4, Value: val4
KV: row1/colfam1:qual2/3/Put/vlen=4, Value: val3
KV: row1/colfam1:qual3/6/Put/vlen=4, Value: val6
KV: row1/colfam1:qual3/5/Put/vlen=4, Value: val5
KV: row1/colfam2:qual1/2/Put/vlen=4, Value: val2
KV: row1/colfam2:qual1/1/Put/vlen=4, Value: val1
KV: row1/colfam2:qual2/4/Put/vlen=4, Value: val4
KV: row1/colfam2:qual2/3/Put/vlen=4, Value: val3
KV: row1/colfam2:qual3/6/Put/vlen=4, Value: val6
KV: row1/colfam2:qual3/5/Put/vlen=4, Value: val5
Delete successful: false
Delete successful: true
After delete call...
KV: row1/colfam1:qual1/2/Put/vlen=4, Value: val2
KV: row1/colfam1:qual1/1/Put/vlen=4, Value: val1
KV: row1/colfam1:qual2/4/Put/vlen=4, Value: val4
KV: row1/colfam1:qual2/3/Put/vlen=4, Value: val3
KV: row1/colfam2:qual1/2/Put/vlen=4, Value: val2
KV: row1/colfam2:qual1/1/Put/vlen=4, Value: val1
KV: row1/colfam2:qual2/4/Put/vlen=4, Value: val4
KV: row1/colfam2:qual2/3/Put/vlen=4, Value: val3
Error: org.apache.hadoop.hbase.DoNotRetryIOException: 
  org.apache.hadoop.hbase.DoNotRetryIOException: 
    Action's getRow must match the passed row
...

Using null as the value parameter triggers the nonexistence test, that is, the check is successful if the column specified does not exist. Since the example code inserts the checked column before the check is performed, the test will initially fail, returning false and aborting the delete operation.

The column is then deleted by hand and the check-and-modify call is run again. This time the check succeeds and the delete is applied, returning true as the overall result.

Just as with the put-related CAS call, you can only perform the check-and-modify on the same row. The example attempts to check on one row key while the supplied instance of Delete points to another. An exception is thrown accordingly, once the check is performed. It is allowed, though, to check across column families—for example, to have one set of columns control how the filtering is done for another set of columns.
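The check-then-delete contract, including the meaning of a null value, can be modeled in a few lines of plain Java. This is a sketch, not the HBase API: CheckAndDeleteModel is a hypothetical class standing for a single row, column names are written as "family:qualifier" strings, and a synchronized method stands in for the server-side guarantee that check and delete happen as one step.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of a single row with check-and-delete; not the HBase API.
class CheckAndDeleteModel {
  // Column name ("family:qualifier") mapped to its current value.
  private final Map<String, String> columns = new HashMap<>();

  synchronized void put(String column, String value) {
    columns.put(column, value);
  }

  synchronized boolean contains(String column) {
    return columns.containsKey(column);
  }

  // Checks and deletes under one lock so that no other writer can interleave.
  // An expected value of null means "the checked column must not exist".
  synchronized boolean checkAndDelete(String checkColumn, String expected,
      List<String> columnsToDelete) {
    String current = columns.get(checkColumn);
    boolean matches = (expected == null)
        ? current == null
        : expected.equals(current);
    if (!matches) {
      return false; // check failed, nothing is deleted
    }
    for (String column : columnsToDelete) {
      columns.remove(column);
    }
    return true;
  }

  public static void main(String[] args) {
    CheckAndDeleteModel row = new CheckAndDeleteModel();
    row.put("colfam1:qual3", "val6");
    row.put("colfam2:qual3", "val6");
    List<String> target = Arrays.asList("colfam1:qual3");

    // The checked column exists, so the nonexistence check fails.
    System.out.println("Delete successful: "
        + row.checkAndDelete("colfam2:qual3", null, target));

    // Remove the checked column, then repeat the same check.
    row.checkAndDelete("colfam2:qual3", "val6", Arrays.asList("colfam2:qual3"));
    System.out.println("Delete successful: "
        + row.checkAndDelete("colfam2:qual3", null, target));
  }
}
```

Because the model holds only one row, the same-row restriction is built in; checking one column family while deleting from another is still possible, as in the text above.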

This small example cannot do justice to the importance of the check-and-delete operation. In distributed systems, it is inherently difficult to perform such operations reliably without incurring the performance penalties of external locking approaches, that is, approaches where atomicity is guaranteed by the client taking out exclusive locks on the entire row. When the client goes away during the locked phase, the server has to rely on lease recovery mechanisms to ensure that these rows are eventually unlocked again. External locks also cause additional RPCs, which will be slower than a single, server-side operation.

Batch Operations

You have seen how you can add, retrieve, and remove data from a table using single or list-based operations. In this section, we will look at API calls to batch different operations across multiple rows.

Note

In fact, a lot of the internal functionality of the list-based calls, such as delete(List<Delete> deletes) or get(List<Get> gets), is based on the batch() call. They are more or less legacy calls and kept for convenience. If you start fresh, it is recommended that you use the batch() calls for all your operations.

The following methods of the client API represent the available batch operations. You may note the introduction of a new class type named Row, which is the ancestor, or parent class, for Put, Get, and Delete.

void batch(List<Row> actions, Object[] results) 
  throws IOException, InterruptedException
Object[] batch(List<Row> actions) 
  throws IOException, InterruptedException

Using the same parent class allows for polymorphic list items, representing any of these three operations. It is equally easy to use these calls, just like the list-based methods you saw earlier. Example 3-16 shows how you can mix the operations and then send them off as one server call.

Warning

Be aware that you should not mix a Delete and a Put operation for the same row in one batch call. The operations may be applied in a different order, chosen for best performance, which can cause unpredictable results. In some cases, you may see fluctuating results due to race conditions.

Example 3-16. Application using batch operations

    private final static byte[] ROW1 = Bytes.toBytes("row1");
    private final static byte[] ROW2 = Bytes.toBytes("row2");
    private final static byte[] COLFAM1 = Bytes.toBytes("colfam1"); 
    private final static byte[] COLFAM2 = Bytes.toBytes("colfam2");
    private final static byte[] QUAL1 = Bytes.toBytes("qual1");
    private final static byte[] QUAL2 = Bytes.toBytes("qual2");

    List<Row> batch = new ArrayList<Row>(); 

    Put put = new Put(ROW2);
    put.add(COLFAM2, QUAL1, Bytes.toBytes("val5")); 
    batch.add(put);

    Get get1 = new Get(ROW1);
    get1.addColumn(COLFAM1, QUAL1); 
    batch.add(get1);

    Delete delete = new Delete(ROW1);
    delete.deleteColumns(COLFAM1, QUAL2); 
    batch.add(delete);

    Get get2 = new Get(ROW2);
    get2.addFamily(Bytes.toBytes("BOGUS")); 
    batch.add(get2);

    Object[] results = new Object[batch.size()]; 
    try {
      table.batch(batch, results);
    } catch (Exception e) {
      System.err.println("Error: " + e); 
    }

    for (int i = 0; i < results.length; i++) {
      System.out.println("Result[" + i + "]: " + results[i]); 
    }

: Use constants for easy reuse.
: Create a list to hold all values.
: Add a Put instance.
: Add a Get instance for a different row.
: Add a Delete instance.
: Add a Get instance that will fail.
: Create a result array.
: Print an error that was caught.
: Print all results.

You should see the following output on the console:

Before batch call...
KV: row1/colfam1:qual1/1/Put/vlen=4, Value: val1
KV: row1/colfam1:qual2/2/Put/vlen=4, Value: val2
KV: row1/colfam1:qual3/3/Put/vlen=4, Value: val3

Result[0]: keyvalues=NONE
Result[1]: keyvalues={row1/colfam1:qual1/1/Put/vlen=4}
Result[2]: keyvalues=NONE
Result[3]: org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: 
  org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: 
    Column family BOGUS does not exist in ...

After batch call...
KV: row1/colfam1:qual1/1/Put/vlen=4, Value: val1
KV: row1/colfam1:qual3/3/Put/vlen=4, Value: val3
KV: row2/colfam2:qual1/1308836506340/Put/vlen=4, Value: val5

Error: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: 
 Failed 1 action: NoSuchColumnFamilyException: 1 time, 
 servers with issues: 10.0.0.43:60020,

As with the previous examples, there is some wiring behind the printed lines of code that inserts a test row before executing the batch calls. The content is printed first, then you will see the output from the example code, and finally the dump of the rows after everything else. The deleted column was indeed removed, and the new column was added to the row as expected.

Finding the result of the Get operation requires you to investigate the middle part of the output, that is, the lines printed by the example code. The lines starting with Result[n], with n ranging from 0 to 3, are where you see the outcome of the corresponding operation in the actions parameter. The first operation in the example is a Put, and the result is an empty Result instance, containing no KeyValue instances. This is the general contract of the batch calls: they return a best-match result per input action, and the possible types are listed in Table 3-7.

Table 3-7. Possible result values returned by the batch() calls

Result	Description
`null`	The operation has failed to communicate with the remote server.
Empty `Result`	Returned for successful `Put` and `Delete` operations.
`Result`	Returned for successful `Get` operations, but may also be empty when there was no matching row or column.
`Throwable`	In case the servers return an exception for the operation it is returned to the client as-is. You can use it to check what went wrong and maybe handle the problem automatically in your code.
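Client code typically walks the result array and branches on the kind of entry found at each position. The following plain-Java sketch shows one way to do that; it is an illustration only, the class BatchResultModel and its describe() method are hypothetical, and ordinary strings stand in for Result instances so that the snippet runs without an HBase cluster.

```java
import java.io.IOException;

// Classifies one entry of a batch result array; a plain-Java sketch.
class BatchResultModel {
  static String describe(Object result) {
    if (result == null) {
      return "no response from server";
    }
    if (result instanceof Throwable) {
      return "failed: " + ((Throwable) result).getMessage();
    }
    return "succeeded: " + result;
  }

  public static void main(String[] args) {
    // Stand-ins for what a batch call might hand back; the strings take
    // the place of Result instances.
    Object[] results = {
      "keyvalues=NONE",
      "keyvalues={row1/colfam1:qual1/1/Put/vlen=4}",
      new IOException("Column family BOGUS does not exist"),
      null
    };
    for (int i = 0; i < results.length; i++) {
      System.out.println("Result[" + i + "]: " + describe(results[i]));
    }
  }
}
```

Testing for null first matters, since instanceof is false for null and such an entry would otherwise be reported as a success.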

Looking further through the returned result array in the console output you can see the empty Result instances printing keyvalues=NONE. The Get call succeeded and found a match, returning the KeyValue instances accordingly. Finally, the operation with the BOGUS column family has the exception for your perusal.

Note

When you use the batch() functionality, the included Put instances will not be buffered using the client-side write buffer. The batch() calls are synchronous and send the operations directly to the servers; no delay or other intermediate processing is used. This is obviously different compared to the put() calls, so choose which one you want to use carefully.

There are two different batch calls that look very similar. The difference is that one needs to have the array handed into the call, while the other creates it for you. So why do you need both, and what semantic differences, if any, do they expose? Both throw the RetriesExhaustedWithDetailsException that you saw already, so the crucial difference is that

void batch(List<Row> actions, Object[] results) 
  throws IOException, InterruptedException

gives you access to the partial results, while

Object[] batch(List<Row> actions) 
  throws IOException, InterruptedException

does not! The latter throws the exception and nothing is returned to you since the control flow of the code is interrupted before the new result array is returned.

The former function fills your given array and then throws the exception. The code in Example 3-16 makes use of that fact and hands in the results array. Summarizing the features, you can say the following about the batch() functions:

Both calls: Support gets, puts, and deletes. If there is a problem executing any of them, a client-side exception is thrown, reporting the issues. The client-side write buffer is not used.
void batch(actions, results): Gives access to the results of all succeeded operations, and the remote exceptions for those that failed.
Object[] batch(actions): Only returns the client-side exception; no access to partial results is possible.
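The control-flow difference between the two variants can be reproduced in plain Java. The sketch below is a model and not the HBase implementation: BatchVariantsModel is a hypothetical class, actions are plain strings, and an action named "BOGUS" is assumed to fail.

```java
import java.io.IOException;

// Models why only the array-taking variant exposes partial results.
class BatchVariantsModel {
  // Fills the caller's array first, then reports any failure by throwing.
  static void batch(String[] actions, Object[] results) throws IOException {
    boolean failed = false;
    for (int i = 0; i < actions.length; i++) {
      if ("BOGUS".equals(actions[i])) {
        results[i] = new IOException("no such column family");
        failed = true;
      } else {
        results[i] = "ok";
      }
    }
    if (failed) {
      throw new IOException("Failed 1 or more actions");
    }
  }

  // Allocates the array itself; on failure the exception escapes before
  // the return statement, so the caller never sees the array.
  static Object[] batch(String[] actions) throws IOException {
    Object[] results = new Object[actions.length];
    batch(actions, results);
    return results;
  }

  public static void main(String[] args) {
    String[] actions = {"put", "get", "BOGUS"};

    Object[] mine = new Object[actions.length];
    try {
      batch(actions, mine);
    } catch (IOException e) {
      System.out.println("Partial results kept: " + mine[0] + ", " + mine[1]);
    }

    Object[] returned = null;
    try {
      returned = batch(actions);
    } catch (IOException e) {
      System.out.println("Returned array is null: " + (returned == null));
    }
  }
}
```

The caller-supplied array survives the exception because the caller holds its own reference to it; the array created inside the second variant is lost once the exception propagates.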

Note

All batch operations are executed before the results are checked: even if you receive an error for one of the actions, all the other ones have been applied. In a worst-case scenario, all actions might return faults, though.

On the other hand, the batch code is aware of transient errors, such as the NotServingRegionException (indicating, for instance, that a region has been moved), and tries to apply the action multiple times. The hbase.client.retries.number configuration property (by default set to 10) can be adjusted to increase, or reduce, the number of retries.
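The general shape of such a bounded retry loop is sketched below in plain Java. It is not the HBase client code: RetryModel, TransientException, and callWithRetries() are hypothetical names, TransientException merely stands in for an error such as NotServingRegionException, and the sketch leaves out any pause between attempts.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// A bounded retry loop for transient errors; a sketch, not the HBase client.
class RetryModel {
  // Marker for errors worth retrying, e.g. a region that has just moved.
  static class TransientException extends IOException {
    TransientException(String message) {
      super(message);
    }
  }

  // Calls the action up to maxRetries times, retrying only transient errors.
  static <T> T callWithRetries(Callable<T> action, int maxRetries)
      throws Exception {
    TransientException last = null;
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      try {
        return action.call();
      } catch (TransientException e) {
        last = e; // remember the error and try again
      }
    }
    throw new IOException(
        "Retries exhausted after " + maxRetries + " attempts", last);
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    String value = callWithRetries(() -> {
      calls[0]++;
      if (calls[0] < 3) {
        throw new TransientException("region moved");
      }
      return "done";
    }, 10);
    System.out.println(value + " after " + calls[0] + " calls");
  }
}
```

Any exception other than the transient kind leaves the loop immediately, which matches the idea that only errors expected to clear up on their own are worth repeating.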

Row Locks

Mutating operations—like put(), delete(), checkAndPut(), and so on—are executed exclusively, which means in a serial fashion, for each row, to guarantee row-level atomicity. The region servers provide a row lock feature ensuring that only a client holding the matching lock can modify a row. In practice, though, most client applications do not provide an explicit lock, but rather rely on the mechanism in place that guards each operation separately.

Warning

You should avoid using row locks whenever possible. Just as with RDBMSes, you can end up in a situation where two clients create a deadlock by waiting on a locked row, with the lock held by the other client.

While the locks wait to time out, these two blocked clients are holding on to a handler, which is a scarce resource. If this happens on a heavily used row, many other clients will lock the remaining few handlers and block access to the complete server for all other clients: the server will not be able to serve any row of any region it hosts.

To reiterate: do not use row locks if you do not have to. And if you do, use them sparingly!

When you send, for example, a put() call to the server with an instance of Put, created with the following constructor:

Put(byte[] row)

which does not take a RowLock instance parameter, the servers will create a lock on your behalf, just for the duration of the call. In fact, from the client API you cannot even retrieve this short-lived, server-side lock instance.

Instead of relying on the implicit, server-side locking to occur, clients can also acquire explicit locks and use them across multiple operations on the same row. This is done using the following calls:

RowLock lockRow(byte[] row) throws IOException
void unlockRow(RowLock rl) throws IOException

The first call, lockRow(), takes a row key and returns an instance of RowLock, which you can hand in to the constructors of Put or Delete subsequently. Once you no longer require the lock, you must release it with the accompanying unlockRow() call.

Each unique lock, whether provided by the server for you or handed in by you through the client API, guards the row it pertains to against any other lock that attempts to access the same row. In other words, a lock must be taken out against an entire row, specifying its row key, and once it has been acquired, it will protect that row against any other concurrent modification.

While a lock on a row is held by someone—whether by the server briefly or a client explicitly—all other clients trying to acquire another lock on that very same row will stall, until either the current lock has been released, or the lease on the lock has expired. The latter case is a safeguard against faulty processes holding a lock for too long—or possibly indefinitely.
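
The lease idea can be illustrated with a toy lock table. This is a sketch under stated assumptions, not the region server implementation: the class and method names are hypothetical, the current time is passed in explicitly so the behavior is deterministic, and where HBase would make the caller wait, the sketch simply returns -1.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy lease-based row lock table with an injected clock; not HBase code. */
final class LeaseLockSketch {
  private final long leaseMillis;
  private final Map<String, long[]> locks = new HashMap<>();  // row -> {lockId, expiry}
  private long nextId = 1;

  LeaseLockSketch(long leaseMillis) { this.leaseMillis = leaseMillis; }

  /** Returns a lock ID, or -1 if another unexpired lock guards the row. */
  synchronized long tryLock(String row, long nowMillis) {
    long[] held = locks.get(row);
    if (held != null && nowMillis < held[1]) return -1;  // still leased
    long id = nextId++;
    locks.put(row, new long[] {id, nowMillis + leaseMillis});
    return id;
  }

  /** True if the given ID is the current, unexpired lock for the row. */
  synchronized boolean isValid(String row, long lockId, long nowMillis) {
    long[] held = locks.get(row);
    return held != null && held[0] == lockId && nowMillis < held[1];
  }

  /** Releases the lock if the ID still matches; stale IDs are ignored. */
  synchronized void unlock(String row, long lockId) {
    long[] held = locks.get(row);
    if (held != null && held[0] == lockId) locks.remove(row);
  }

  public static void main(String[] args) {
    LeaseLockSketch table = new LeaseLockSketch(60_000);  // one-minute lease
    long first = table.tryLock("row1", 0);
    System.out.println("second client at 1s: " + table.tryLock("row1", 1_000));
    long second = table.tryLock("row1", 60_000);  // the first lease has run out
    System.out.println("first lock still valid: " + table.isValid("row1", first, 60_000));
    System.out.println("second lock valid: " + table.isValid("row1", second, 60_000));
  }
}
```

Once the lease has run out, the first lock ID is no longer recognized, which is the situation a client runs into when it holds on to an explicit lock for too long.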

Note

The default timeout on locks is one minute, but can be configured system-wide by adding the following property key to the hbase-site.xml file and setting the value to a different, millisecond-based timeout:

<property>
  <name>hbase.regionserver.lease.period</name>
  <value>120000</value>
</property>

Adding the preceding code would double the timeout to 120 seconds, or two minutes, instead. Be careful not to set this value too high, since every client trying to acquire an already locked row will have to block for up to that timeout for the lock in limbo to be recovered.

Example 3-17 shows how a user-generated lock on a row will block all concurrent writers.

Example 3-17. Using row locks explicitly

    static class UnlockedPut implements Runnable { 
      @Override
      public void run() {
        try {
          HTable table = new HTable(conf, "testtable");
          Put put = new Put(ROW1);
          put.add(COLFAM1, QUAL1, VAL3);
          long time = System.currentTimeMillis();
          System.out.println("Thread trying to put same row now...");
          table.put(put); 
          System.out.println("Wait time: " +
            (System.currentTimeMillis() - time) + "ms");
        } catch (IOException e) {
          System.err.println("Thread error: " + e);
        }
      }
    }

    System.out.println("Taking out lock...");
    RowLock lock = table.lockRow(ROW1); 
    System.out.println("Lock ID: " + lock.getLockId());

    Thread thread = new Thread(new UnlockedPut()); 
    thread.start();

    try {
      System.out.println("Sleeping 5secs in main()..."); 
      Thread.sleep(5000);
    } catch (InterruptedException e) {
      // ignore
    }

    try {
      Put put1 = new Put(ROW1, lock); 
      put1.add(COLFAM1, QUAL1, VAL1);
      table.put(put1);

      Put put2 = new Put(ROW1, lock); 
      put2.add(COLFAM1, QUAL1, VAL2);
      table.put(put2);
    } catch (Exception e) {
      System.err.println("Error: " + e);
    } finally {
      System.out.println("Releasing lock..."); 
      table.unlockRow(lock);
    }

: Use an asynchronous thread to update the same row, but without a lock.
: The put() call will block until the lock is released.
: Lock the entire row.
: Start the asynchronous thread, which will block.
: Sleep for some time to block other writers.
: Create a Put using its own lock.
: Create another Put using its own lock.
: Release the lock, which will make the thread continue.

When you run the example code, you should see the following output on the console:

Taking out lock...
Lock ID: 4751274798057238718
Sleeping 5secs in main()...
Thread trying to put same row now...
Releasing lock...
Wait time: 5007ms
After thread ended...
KV: row1/colfam1:qual1/1300775520118/Put/vlen=4, Value: val2
KV: row1/colfam1:qual1/1300775520113/Put/vlen=4, Value: val1
KV: row1/colfam1:qual1/1300775515116/Put/vlen=4, Value: val3

You can see how the explicit lock blocks the thread using a different, implicit lock. The main thread sleeps for five seconds, and once it wakes up, it calls put() twice, setting the same column to two different values, respectively.

Once the main thread releases the lock, the thread’s run() method continues to execute and applies the third put call. An interesting observation is how the puts are applied on the server side. Notice that the timestamps of the KeyValue instances show the third put having the lowest timestamp, even though the put was seemingly applied last. This is caused by the fact that the put() call in the thread was executed before the two puts in the main thread, after it had slept for five seconds. Once a put is sent to the servers, it is assigned a timestamp—assuming you have not provided your own—and then tries to acquire the implicit lock. But the example code has already taken out the lock on that row, and therefore the server-side processing stalls until the lock is released, five seconds and a tad more later. In the preceding output, you can also see that it took seven milliseconds to execute the two put calls in the main thread and to unlock the row.

It makes sense to lock rows for any row mutation, but what about retrieving data? The Get class has a constructor that lets you specify an explicit lock:

Get(byte[] row, RowLock rowLock)

This is actually legacy and not used at all on the server side. In fact, the servers do not take out any locks during the get operation. They instead apply a multiversion concurrency control-style^[55] mechanism ensuring that row-level read operations, such as get() calls, never return half-written data—for example, what is written by another thread or client.

Think of this like a small-scale transactional system: only after a mutation has been applied to the entire row can clients read the changes. While a mutation is in progress, all reading clients will be seeing the previous state of all columns.
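
The visibility rule can be demonstrated with a copy-on-write snapshot, a much simpler technique than the mechanism HBase actually uses. In this sketch, which assumes nothing about the server internals, SnapshotRowSketch is a hypothetical class: a writer prepares a complete new version of the row privately and publishes it in a single atomic step, so a reader sees either the old state or the new one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Copy-on-write row: readers see the old or the new state, never a mix. */
final class SnapshotRowSketch {
  private final AtomicReference<Map<String, String>> current =
      new AtomicReference<>(Collections.<String, String>emptyMap());

  /** Readers take no lock; they get whichever complete version is published. */
  Map<String, String> read() {
    return current.get();
  }

  /** Applies all column changes to a private copy, then publishes it in one step. */
  void mutate(Map<String, String> changes) {
    Map<String, String> before;
    Map<String, String> after;
    do {
      before = current.get();
      after = new HashMap<>(before);
      after.putAll(changes);
      after = Collections.unmodifiableMap(after);
    } while (!current.compareAndSet(before, after));  // retry if another writer won
  }

  public static void main(String[] args) {
    SnapshotRowSketch row = new SnapshotRowSketch();
    Map<String, String> v1 = new HashMap<>();
    v1.put("colfam1:qual1", "val1");
    v1.put("colfam1:qual2", "val1");
    row.mutate(v1);

    Map<String, String> snapshot = row.read();  // a reader holds the old version

    Map<String, String> v2 = new HashMap<>();
    v2.put("colfam1:qual1", "val2");
    v2.put("colfam1:qual2", "val2");
    row.mutate(v2);

    System.out.println("old snapshot: " + snapshot.get("colfam1:qual1")
        + ", " + snapshot.get("colfam1:qual2"));
    System.out.println("new read:     " + row.read().get("colfam1:qual1")
        + ", " + row.read().get("colfam1:qual2"));
  }
}
```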

When you try to use an explicit row lock that you have acquired earlier but failed to use within the lease recovery time range, you will receive an error from the servers, in the form of an UnknownRowLockException. It tells you that the server has already discarded the lock you are trying to use. Drop it in your code and acquire a new one to recover from this state.

Scans

Now that we have discussed the basic CRUD-type operations, it is time to take a look at scans, a technique akin to cursors^[56] in database systems, which make use of the underlying sequential, sorted storage layout that HBase provides.

Introduction

Use of the scan operations is very similar to the get() methods. And again, similar to all the other functions, there is also a supporting class, named Scan. But since scans are similar to iterators, you do not have a scan() call, but rather a getScanner(), which returns the actual scanner instance you need to iterate over. The available methods are:

ResultScanner getScanner(Scan scan) throws IOException
ResultScanner getScanner(byte[] family) throws IOException
ResultScanner getScanner(byte[] family, byte[] qualifier) 
  throws IOException

The latter two are for your convenience, implicitly creating an instance of Scan on your behalf, and subsequently calling the getScanner(Scan scan) method.

The Scan class has the following constructors:

Scan()
Scan(byte[] startRow, Filter filter)
Scan(byte[] startRow)
Scan(byte[] startRow, byte[] stopRow)

The difference between this and the Get class is immediately obvious: instead of specifying a single row key, you now can optionally provide a startRow parameter—defining the row key where the scan begins to read from the HBase table. The optional stopRow parameter can be used to limit the scan to a specific row key where it should conclude the reading.

Note

The start row is always inclusive, while the end row is exclusive. This is often expressed as [startRow, stopRow) in the interval notation.

A special feature that scans offer is that you do not need to have an exact match for either of these rows. Instead, the scan will match the first row key that is equal to or larger than the given start row. If no start row was specified, it will start at the beginning of the table.

It will also end its work when the current row key is equal to or greater than the optional stop row. If no stop row was specified, the scan will run to the end of the table.
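
The range semantics can be tried out on any sorted map. The sketch below uses a java.util.TreeMap with String keys as a stand-in for a table; it is not the HBase API. It assumes ASCII keys, for which Java's String ordering agrees with byte-wise ordering, and the scan() helper is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Models [startRow, stopRow) over sorted String keys; not the HBase API. */
final class RowRangeSketch {

  /** Returns the keys a scan would visit; null bounds mean "open ended". */
  static List<String> scan(NavigableMap<String, String> table, String startRow, String stopRow) {
    // Assumption of this sketch: an empty or inverted range yields no rows.
    if (startRow != null && stopRow != null && startRow.compareTo(stopRow) >= 0) {
      return new ArrayList<>();
    }
    NavigableMap<String, String> view = table;
    if (startRow != null) view = view.tailMap(startRow, true);  // inclusive start
    if (stopRow != null) view = view.headMap(stopRow, false);   // exclusive stop
    return new ArrayList<>(view.keySet());
  }

  public static void main(String[] args) {
    NavigableMap<String, String> table = new TreeMap<>();
    for (String key : new String[] {"apple", "banana", "cherry", "grape", "melon"}) {
      table.put(key, "value");
    }
    System.out.println(scan(table, "b", "grape"));    // → [banana, cherry]
    System.out.println(scan(table, "cherry", null));  // → [cherry, grape, melon]
    System.out.println(scan(table, null, "c"));       // → [apple, banana]
  }
}
```

Note that neither "b" nor "c" exists as a key: the range simply begins at the first key that sorts at or after the start value and ends before the first key that sorts at or after the stop value.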

There is another optional parameter, named filter, referring to a Filter instance. Often, though, the Scan instance is simply created using the empty constructor, as all of the optional parameters also have matching getter and setter methods that can be used instead.

Once you have created the Scan instance, you may want to add more limiting details to it—but you are also allowed to use the empty scan, which would read the entire table, including all column families and their columns. You can narrow down the read data using various methods:

Scan addFamily(byte [] family)
Scan addColumn(byte[] family, byte[] qualifier)

There is a lot of similar functionality compared to the Get class: you may limit the data returned by the scan by setting the column families to specific ones using addFamily(), or, even more constraining, to only include certain columns with the addColumn() call.

Note

If you only need subsets of the data, narrowing the scan’s scope plays to the strengths of HBase, since data is stored in column families and omitting entire families from the scan results in those storage files not being read at all. This is the power of column-oriented architecture at its best.

Scan setTimeRange(long minStamp, long maxStamp) throws IOException
Scan setTimeStamp(long timestamp)
Scan setMaxVersions()
Scan setMaxVersions(int maxVersions)

A further limiting detail you can add is to set the specific timestamp you want, using setTimeStamp(), or a wider time range with setTimeRange(). The same applies to setMaxVersions(), allowing you to have the scan only return a specific number of versions per column, or return them all.

Scan setStartRow(byte[] startRow)
Scan setStopRow(byte[] stopRow)
Scan setFilter(Filter filter)
boolean hasFilter()

Using setStartRow(), setStopRow(), and setFilter(), you can define the same parameters the constructors exposed, all of them limiting the returned data even further, as explained earlier. The additional hasFilter() can be used to check that a filter has been assigned.

There are a few more related methods, listed in Table 3-8.

Table 3-8. Quick overview of additional methods provided by the Scan class

Method	Description
`getStartRow()/getStopRow()`	Can be used to retrieve the currently assigned values.
`getTimeRange()`	Retrieves the associated timestamp or time range of the `Scan` instance. Note that there is no `getTimeStamp()` since the API converts a value assigned with `setTimeStamp()` into a `TimeRange` instance internally, setting the minimum and maximum values to the given timestamp.
`getMaxVersions()`	Returns the currently configured number of versions that should be retrieved from the table for every column.
`getFilter()`	Special filter instances can be used to select certain columns or cells, based on a wide variety of conditions. You can get the currently assigned filter using this method. It may return `null` if none was previously set. See Filters for details.
`setCacheBlocks()/getCacheBlocks()`	Each HBase region server has a block cache that efficiently retains recently accessed data for subsequent reads of contiguous information. In some cases it is better to not engage the cache to avoid too much churn when doing full table scans. These methods give you control over this feature.
`numFamilies()`	Convenience method to retrieve the size of the family map, containing the families added using the `addFamily()` or `addColumn()` calls.
`hasFamilies()`	Another helper to check if a family—or column—has been added to the current instance of the `Scan` class.
`getFamilies()/setFamilyMap()/getFamilyMap()`	These methods give you access to the column families and specific columns, as added by the `addFamily()` and/or `addColumn()` calls. The family map is a map where the key is the family name and the value is a list of added column qualifiers for this particular family. The `getFamilies()` returns an array of all stored families, i.e., containing only the family names (as `byte[]` arrays).

Once you have configured the Scan instance, you can call the HTable method, named getScanner(), to retrieve the ResultScanner instance. We will discuss this class in more detail in the next section.

The ResultScanner Class

Scans do not ship all the matching rows in one RPC to the client, but instead do this on a row basis. This obviously makes sense as rows could be very large and sending thousands, and most likely more, of them in one call would use up too many resources, and take a long time.

The ResultScanner converts the scan into a get-like operation, wrapping the Result instance for each row into an iterator functionality. It has a few methods of its own:

Result next() throws IOException
Result[] next(int nbRows) throws IOException
void close()

You have two types of next() calls at your disposal. The close() call is required to release all the resources a scan may hold explicitly.

Make sure you release a scanner instance as quickly as possible. An open scanner holds quite a few resources on the server side, which could accumulate to a large amount of heap space being occupied. When you are done with the current scan, call close(), and consider placing the call in a try/finally construct to ensure it is executed even if there are exceptions or errors during the iterations.

The example code does not follow this advice for the sake of brevity only.
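
The recommended shape looks like the following sketch. To keep it runnable without a cluster, it uses FakeScanner, a hypothetical in-memory stand-in rather than a real ResultScanner; only the try/finally structure is the point here.

```java
import java.io.Closeable;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Stand-in for a scanner, used only to show the try/finally shape. */
final class ScannerCloseSketch {

  /** Hypothetical scanner over in-memory rows that records whether it was closed. */
  static final class FakeScanner implements Closeable, Iterable<String> {
    private final List<String> rows;
    boolean closed = false;

    FakeScanner(String... rows) { this.rows = Arrays.asList(rows); }

    @Override public Iterator<String> iterator() { return rows.iterator(); }

    @Override public void close() { closed = true; }  // a real scanner frees server resources here
  }

  /** Counts rows; throws on an empty row key to simulate a processing error. */
  static int consume(FakeScanner scanner) {
    int count = 0;
    try {
      for (String row : scanner) {
        if (row.isEmpty()) throw new IllegalStateException("bad row");
        count++;
      }
    } finally {
      scanner.close();  // runs on normal completion and on exceptions alike
    }
    return count;
  }

  public static void main(String[] args) {
    FakeScanner good = new FakeScanner("row-1", "row-2");
    System.out.println(consume(good) + " rows, closed=" + good.closed);

    FakeScanner bad = new FakeScanner("row-1", "");
    try {
      consume(bad);
    } catch (IllegalStateException e) {
      System.out.println("failed, closed=" + bad.closed);
    }
  }
}
```

On Java 7 or later, any object that implements java.io.Closeable can alternatively be managed with a try-with-resources statement, which calls close() automatically.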

Like row locks, scanners are protected against stray clients blocking resources for too long, using the same lease-based mechanisms. You need to set the same configuration property to modify the timeout threshold (in milliseconds):

<property>
  <name>hbase.regionserver.lease.period</name>
  <value>120000</value>
</property>

You need to make sure that the property is set to an appropriate value that makes sense for locks and the scanner leases.

The next() call returns a single instance of Result representing the next available row. Alternatively, you can fetch a larger number of rows using the next(int nbRows) call, which returns an array of up to nbRows items, each an instance of Result, representing a unique row. The resultant array may be shorter if there were not enough rows left. This obviously can happen just before you reach the end of the table, or the stop row. Otherwise, refer to The Result class for details on how to make use of the Result instances. This works exactly like you saw in Get Method.

Example 3-18 brings together the explained functionality to scan a table, while accessing the column data stored in a row.

Example 3-18. Using a scanner to access data in a table

    Scan scan1 = new Scan(); 
    ResultScanner scanner1 = table.getScanner(scan1); 
    for (Result res : scanner1) {
      System.out.println(res); 
    }
    scanner1.close(); 

    Scan scan2 = new Scan();
    scan2.addFamily(Bytes.toBytes("colfam1")); 
    ResultScanner scanner2 = table.getScanner(scan2);
    for (Result res : scanner2) {
      System.out.println(res);
    }
    scanner2.close();

    Scan scan3 = new Scan();
    scan3.addColumn(Bytes.toBytes("colfam1"), Bytes.toBytes("col-5")).
      addColumn(Bytes.toBytes("colfam2"), Bytes.toBytes("col-33")). 
      setStartRow(Bytes.toBytes("row-10")).
      setStopRow(Bytes.toBytes("row-20"));
    ResultScanner scanner3 = table.getScanner(scan3);
    for (Result res : scanner3) {
      System.out.println(res);
    }
    scanner3.close();

: Create an empty Scan instance.
: Get a scanner to iterate over the rows.
: Print the row’s content.
: Close the scanner to free remote resources.
: Add one column family only; this will suppress the retrieval of “colfam2”.
: Use a builder pattern to add very specific details to the Scan.

The code inserts 100 rows with two column families, each containing 100 columns. The scans performed vary from the full table scan, to one that only scans one column family, and finally to a very restrictive scan, limiting the row range, and only asking for two very specific columns. The output should look like this:

Scanning table #3...
keyvalues={row-10/colfam1:col-5/1300803775078/Put/vlen=8, 
           row-10/colfam2:col-33/1300803775099/Put/vlen=9}
keyvalues={row-100/colfam1:col-5/1300803780079/Put/vlen=9, 
           row-100/colfam2:col-33/1300803780095/Put/vlen=10}
keyvalues={row-11/colfam1:col-5/1300803775152/Put/vlen=8,   
           row-11/colfam2:col-33/1300803775170/Put/vlen=9}
keyvalues={row-12/colfam1:col-5/1300803775212/Put/vlen=8,   
           row-12/colfam2:col-33/1300803775246/Put/vlen=9}
keyvalues={row-13/colfam1:col-5/1300803775345/Put/vlen=8,
           row-13/colfam2:col-33/1300803775376/Put/vlen=9}
keyvalues={row-14/colfam1:col-5/1300803775479/Put/vlen=8,
           row-14/colfam2:col-33/1300803775498/Put/vlen=9}
keyvalues={row-15/colfam1:col-5/1300803775554/Put/vlen=8,
           row-15/colfam2:col-33/1300803775582/Put/vlen=9}
keyvalues={row-16/colfam1:col-5/1300803775665/Put/vlen=8,
           row-16/colfam2:col-33/1300803775687/Put/vlen=9}
keyvalues={row-17/colfam1:col-5/1300803775734/Put/vlen=8,
           row-17/colfam2:col-33/1300803775748/Put/vlen=9}
keyvalues={row-18/colfam1:col-5/1300803775791/Put/vlen=8,
           row-18/colfam2:col-33/1300803775805/Put/vlen=9}
keyvalues={row-19/colfam1:col-5/1300803775843/Put/vlen=8,
           row-19/colfam2:col-33/1300803775859/Put/vlen=9}
keyvalues={row-2/colfam1:col-5/1300803774463/Put/vlen=7,
           row-2/colfam2:col-33/1300803774485/Put/vlen=8}

Once again, note the actual rows that have been matched. The lexicographical sorting of the keys makes for interesting results. You could simply pad the numbers with zeros, which would result in a more human-readable sort order. This is completely under your control, so choose carefully what you need.
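
The effect of padding is easy to verify with a sorted set of String keys. The following sketch does not touch HBase; the helper names and the three-digit padding width are assumptions chosen to fit the 100 rows of the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

/** Compares plain and zero-padded row keys under lexicographic ordering. */
final class PaddedKeysSketch {

  /** Builds keys row-1..row-N; digits > 0 pads the number to that width. */
  static TreeSet<String> keys(int count, int digits) {
    TreeSet<String> sorted = new TreeSet<>();
    for (int i = 1; i <= count; i++) {
      sorted.add(digits > 0
          ? String.format(Locale.ROOT, "row-%0" + digits + "d", i)
          : "row-" + i);
    }
    return sorted;
  }

  /** Keys in [start, stop), the same half-open range a scan uses. */
  static List<String> range(TreeSet<String> keys, String start, String stop) {
    return new ArrayList<>(keys.subSet(start, true, stop, false));
  }

  public static void main(String[] args) {
    // Unpadded: row-100 and row-2 fall inside the range, as in the output above.
    System.out.println(range(keys(100, 0), "row-10", "row-20"));
    // Padded to three digits: exactly rows 10 through 19.
    System.out.println(range(keys(100, 3), "row-010", "row-020"));
  }
}
```

The unpadded range contains 12 keys, matching the 12 rows printed above, while the padded range contains the 10 keys a reader would intuitively expect.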

Caching Versus Batching

So far, each call to next() will be a separate RPC for each row—even when you use the next(int nbRows) method, because it is nothing else but a client-side loop over next() calls. Obviously, this is not very good for performance when dealing with small cells (see Client-side write buffer for a discussion). Thus it would make sense to fetch more than one row per RPC if possible. This is called scanner caching and is disabled by default.

You can enable it at two different levels: on the table level, to be effective for all scan instances, or at the scan level, only affecting the current scan. You can set the table-wide scanner caching using these HTable calls:

void setScannerCaching(int scannerCaching)
int getScannerCaching()

Note

You can also change the default value of 1 for the entire HBase setup. You do this by adding the following configuration key to the hbase-site.xml configuration file:

<property>
  <name>hbase.client.scanner.caching</name>
  <value>10</value>
</property>

This would set the scanner caching to 10 for all instances of Scan. You can still override the value at the table and scan levels, but you would need to do so explicitly.

The setScannerCaching() call sets the value, while getScannerCaching() retrieves the current value. Every time you call getScanner(scan) thereafter, the API will assign the set value to the scan instance—unless you use the scan-level settings, which take highest precedence. This is done with the following methods of the Scan class:

void setCaching(int caching)
int getCaching()

They work the same way as the table-wide settings, giving you control over how many rows are retrieved with every RPC. Both types of next() calls take these settings into account.

You may need to find a sweet spot between a low number of RPCs and the memory used on the client and server. Setting the scanner caching higher will improve scanning performance most of the time, but setting it too high can have adverse effects as well: each call to next() will take longer as more data is fetched and needs to be transported to the client, and once you exceed the maximum heap the client process has available it may terminate with an OutOfMemoryError.

Warning

When the time taken to transfer the rows to the client, or to process the data on the client, exceeds the configured scanner lease threshold, you will end up receiving a lease expired error, in the form of a ScannerTimeoutException being thrown.

Example 3-19 showcases the issue with the scanner leases.

Example 3-19. Timeout while using a scanner

    Scan scan = new Scan();
    ResultScanner scanner = table.getScanner(scan);

    int scannerTimeout = (int) conf.getLong(
      HConstants.HBASE_REGIONSERVER_LEASE_PERIOD_KEY, -1); 
    try {
      Thread.sleep(scannerTimeout + 5000); 
    } catch (InterruptedException e) {
      // ignore
    }
    while (true){
      try {
        Result result = scanner.next();
        if (result == null) break;
        System.out.println(result); 
      } catch (Exception e) {
        e.printStackTrace();
        break;
      }
    }
    scanner.close();

: Get the currently configured lease timeout.
: Sleep a little longer than the lease allows.
: Print the row’s content.

The code gets the currently configured lease period value and sleeps a little longer to trigger the lease recovery on the server side. The console output (abbreviated for the sake of readability) should look similar to this:

Adding rows to table...
Current (local) lease period: 60000
Sleeping now for 65000ms...
Attempting to iterate over scanner...
Exception in thread "main" java.lang.RuntimeException: 
  org.apache.hadoop.hbase.client.ScannerTimeoutException: 65094ms passed 
    since the last invocation, timeout is currently set to 60000
	at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext
	at ScanTimeoutExample.main
Caused by: org.apache.hadoop.hbase.client.ScannerTimeoutException: 65094ms 
    passed since the last invocation, timeout is currently set to 60000
	at org.apache.hadoop.hbase.client.HTable$ClientScanner.next
	at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext
	... 1 more
Caused by: org.apache.hadoop.hbase.UnknownScannerException: 
  org.apache.hadoop.hbase.UnknownScannerException: Name: -315058406354472427
	at org.apache.hadoop.hbase.regionserver.HRegionServer.next
...

The example code prints its progress and, after sleeping for the specified time, attempts to iterate over the rows the scanner should provide. This triggers the said timeout exception, while reporting the configured values.

Note

You might be tempted to add the following into your code:

Configuration conf = HBaseConfiguration.create();
conf.setLong(HConstants.HBASE_REGIONSERVER_LEASE_PERIOD_KEY, 120000);

assuming this increases the lease threshold (in this example, to two minutes). But that is not going to work as the value is configured on the remote region servers, not your client application. Your value is not being sent to the servers, and therefore will have no effect.

If you want to change the lease period setting you need to add the appropriate configuration key to the hbase-site.xml file on the region servers—while not forgetting to restart them for the changes to take effect!

The stack trace in the console output also shows how the ScannerTimeoutException is a wrapper around an UnknownScannerException. It means that the next() call is using a scanner ID that has since expired and been removed in due course. In other words, the ID your client has memorized is now unknown to the region servers—which is the name of the exception.

So far you have learned to use client-side scanner caching to make better use of bulk transfers between your client application and the remote region servers. There is an issue, though, that was mentioned in passing earlier: very large rows. Those potentially do not fit into the memory of the client process. HBase and its client API have an answer for that: batching. You can control batching using these calls:

void setBatch(int batch)
int getBatch()

As opposed to caching, which operates on a row level, batching works on the column level instead. It controls how many columns are retrieved for every call to any of the next() functions provided by the ResultScanner instance. For example, setting the scan to use setBatch(5) would return five columns per Result instance.

Note

When a row contains more columns than the value you used for the batch, you will get the entire row piece by piece, with each next Result returned by the scanner.

The last Result may include fewer columns, when the total number of columns in that row is not divisible by whatever batch it is set to. For example, if your row has 17 columns and you set the batch to 5, you get four Result instances, with 5, 5, 5, and the remaining two columns within.
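
The arithmetic behind this can be written down in a few lines. The helper below is a hypothetical illustration, not part of the client API; it only computes how many columns each Result of a single row would carry for a given batch value.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits one row's columns into batch-sized chunks, as described above. */
final class BatchSplitSketch {

  /** Returns the column count of each Result produced for a single row. */
  static List<Integer> resultSizes(int columnsInRow, int batch) {
    if (columnsInRow < 0 || batch < 1) {
      throw new IllegalArgumentException("columnsInRow must be >= 0 and batch >= 1");
    }
    List<Integer> sizes = new ArrayList<>();
    for (int remaining = columnsInRow; remaining > 0; remaining -= batch) {
      sizes.add(Math.min(batch, remaining));  // the last chunk may be smaller
    }
    return sizes;
  }

  public static void main(String[] args) {
    System.out.println(resultSizes(17, 5));    // → [5, 5, 5, 2]
    System.out.println(resultSizes(20, 100));  // → [20]
  }
}
```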

The combination of scanner caching and batch size can be used to control the number of RPCs required to scan the row key range selected. Example 3-20 uses the two parameters to fine-tune the size of each Result instance in relation to the number of requests needed.

Example 3-20. Using caching and batch parameters for scans

  private static void scan(int caching, int batch) throws IOException {
    Logger log = Logger.getLogger("org.apache.hadoop");
    final int[] counters = {0, 0};
    Appender appender = new AppenderSkeleton() {
      @Override
      protected void append(LoggingEvent event) {
        String msg = event.getMessage().toString();
        if (msg != null && msg.contains("Call: next")) {
          counters[0]++;
        }
      }
      @Override
      public void close() {}
      @Override
      public boolean requiresLayout() {
        return false;
      }
    };
    log.removeAllAppenders();
    log.setAdditivity(false);
    log.addAppender(appender);
    log.setLevel(Level.DEBUG);


    Scan scan = new Scan();
    scan.setCaching(caching);  
    scan.setBatch(batch);
    ResultScanner scanner = table.getScanner(scan);
    for (Result result : scanner) {
      counters[1]++; 
    }
    scanner.close();
    System.out.println("Caching: " + caching + ", Batch: " + batch +
      ", Results: " + counters[1] + ", RPCs: " + counters[0]);
  }

  public static void main(String[] args) throws IOException {
    scan(1, 1);
    scan(200, 1);
    scan(2000, 100); 
    scan(2, 100);
    scan(2, 10);
    scan(5, 100);
    scan(5, 20);
    scan(10, 10);
  }

: Set caching and batch parameters.
: Count the number of Results available.
: Test various combinations.

The code prints out the values used for caching and batching, the number of results returned by the servers, and how many RPCs were needed to get them. For example:

Caching: 1, Batch: 1, Results: 200, RPCs: 201
Caching: 200, Batch: 1, Results: 200, RPCs: 2
Caching: 2000, Batch: 100, Results: 10, RPCs: 1
Caching: 2, Batch: 100, Results: 10, RPCs: 6
Caching: 2, Batch: 10, Results: 20, RPCs: 11
Caching: 5, Batch: 100, Results: 10, RPCs: 3
Caching: 5, Batch: 20, Results: 10, RPCs: 3
Caching: 10, Batch: 10, Results: 20, RPCs: 3

You can tweak the two numbers to see how they affect the outcome. Table 3-9 lists a few selected combinations. The numbers relate to Example 3-20, which creates a table with two column families, adds 10 rows, with 10 columns per family in each row. This means there are a total of 200 columns—or cells, as there is only one version for each column—with 20 columns per row.

Table 3-9. Example settings and their effects

Caching	Batch	Results	RPCs	Notes
1	1	200	201	Each column is returned as a separate `Result` instance. One more RPC is needed to realize the scan is complete.
200	1	200	2	Each column is a separate `Result`, but they are all transferred in one RPC (plus the extra check).
2	10	20	11	The batch is half the row width, so 200 divided by 10 is 20 `Results` needed. 10 RPCs (plus the check) to transfer them.
5	100	10	3	The batch is too large for each row, so all 20 columns are batched. This requires 10 `Result` instances. Caching brings the number of RPCs down to two (plus the check).
5	20	10	3	This is the same as above, but this time the batch matches the columns available. The outcome is the same.
10	10	20	3	This divides the table into smaller `Result` instances, but larger caching also means only two RPCs are needed.

Note

To compute the number of RPCs required for a scan, you need to first multiply the number of rows with the number of columns per row (at least some approximation). Then you divide that number by the smaller value of either the batch size or the columns per row. Finally, divide that number by the scanner caching value. In mathematical terms this could be expressed like so:

RPCs = (Rows * Cols per Row) / Min(Cols per Row, Batch Size) / Scanner Caching

In addition, RPCs are also required to open and close the scanner. You would need to add these two calls to get the overall total of remote calls when dealing with scanners.
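
The rule of thumb can be turned into a small calculator. This is a sketch with hypothetical method names; rounding both divisions up is an assumption added here to handle values that do not divide evenly, and the result covers only the calls that carry data.

```java
/** Estimates Results and data-transfer RPCs for a scan; names are illustrative. */
final class RpcEstimateSketch {

  /** Results per scan: every row is cut into pieces of at most `batch` columns. */
  static long results(long rows, long colsPerRow, long batch) {
    if (rows < 0 || colsPerRow < 1 || batch < 1) {
      throw new IllegalArgumentException("rows >= 0, colsPerRow >= 1, batch >= 1 required");
    }
    long perResult = Math.min(colsPerRow, batch);
    long resultsPerRow = (colsPerRow + perResult - 1) / perResult;  // round up (assumption)
    return rows * resultsPerRow;
  }

  /** RPCs that carry data: up to `caching` Results travel together in one call. */
  static long dataRpcs(long rows, long colsPerRow, long batch, long caching) {
    if (caching < 1) throw new IllegalArgumentException("caching >= 1 required");
    long results = results(rows, colsPerRow, batch);
    return (results + caching - 1) / caching;  // round up (assumption)
  }

  public static void main(String[] args) {
    long rows = 10, cols = 20;  // the table layout used for Table 3-9
    long[][] settings = {{1, 1}, {200, 1}, {2, 10}, {5, 100}, {5, 20}, {10, 10}};
    for (long[] s : settings) {
      long caching = s[0], batch = s[1];
      System.out.println("Caching: " + caching + ", Batch: " + batch
          + ", Results: " + results(rows, cols, batch)
          + ", data RPCs: " + dataRpcs(rows, cols, batch, caching));
    }
  }
}
```

For the six combinations in Table 3-9, each estimate comes out one lower than the RPC count listed there, because the table also counts the extra call that detects the end of the scan.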

Figure 3-2 shows how caching and batching work in tandem. It has a table with nine rows, each containing a number of columns. Using a scanner caching of six, and a batch set to three, you can see that three RPCs are necessary to ship the data across the network (the dashed, rounded-corner boxes).

Figure 3-2. The scanner caching and batching controlling the number of RPCs

The small batch value causes the servers to group three columns into one Result, while the scanner caching of six causes one RPC to transfer six rows—or, more precisely, results—sent in the batch. When the batch size is not specified but scanner caching is specified, the result of the call will contain complete rows, because each row will be contained in one Result instance. Only when you start to use the batch mode are you getting access to the intra-row scanning functionality.

You may not have to worry about the consequences of using scanner caching and batch mode initially, but once you try to squeeze the optimal performance out of your setup, you should keep all of this in mind and find the sweet spot for both values.

Miscellaneous Features

Before looking into more involved features that clients can use, let us first wrap up a handful of miscellaneous features and functionality provided by HBase and its client API.

The HTable Utility Methods

The client API is represented by an instance of the HTable class and gives you access to an existing HBase table. Apart from the major features we already discussed, there are a few more notable methods of this class that you should be aware of:

void close(): This method was mentioned before, but for the sake of completeness, and because of its importance, it warrants repeating. Call close() once you have completed your work with a table. It will flush any buffered write operations: the close() call implicitly invokes the flushCommits() method.
byte[] getTableName(): This is a convenience method to retrieve the table name.
Configuration getConfiguration(): This allows you to access the configuration in use by the HTable instance. Since this is handed out by reference, you can make changes that are effective immediately.
HTableDescriptor getTableDescriptor(): As explained in Tables, each table is defined using an instance of the HTableDescriptor class. You gain access to the underlying definition using getTableDescriptor().
static boolean isTableEnabled(table): There are four variants of this static helper method. Each takes a table name and, optionally, an explicit configuration; if none is provided, one is created implicitly from the default values and the configuration found on your application’s classpath. The method checks whether the table in question is marked as enabled in ZooKeeper.
byte[][] getStartKeys(), byte[][] getEndKeys(), Pair<byte[][],byte[][]> getStartEndKeys(): These calls give you access to the current physical layout of the table, which is likely to change as you add more data to it. The calls give you the start and/or end keys of all the regions of the table. They are returned as arrays of byte arrays. You can use Bytes.toStringBinary(), for example, to print out the keys.
void clearRegionCache(), HRegionLocation getRegionLocation(row), Map<HRegionInfo, HServerAddress> getRegionsInfo(): This set of methods lets you retrieve more details about where a row lives, that is, in what region, as well as the entire map of the region information. You can also clear out the cache if you wish to do so. These calls are only for advanced users who wish to make use of this information to, for example, route traffic or perform work close to where the data resides.
void prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap), static void setRegionCachePrefetch(table, enable), static boolean getRegionCachePrefetch(table): Again, this is a group of methods for advanced usage. In Implementation it was mentioned that it would make sense to prefetch region information on the client to avoid more costly lookups for every row until the local cache is stable. Using these calls, you can either warm up the region cache by providing a list of regions (you could, for example, use getRegionsInfo() to gain access to the list, and then process it) or switch on region prefetching for the entire table.
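
To illustrate what the start keys can be used for, here is a minimal sketch in plain Java that finds the region a row key falls into, given an array of start keys such as the one getStartKeys() returns. It is not the HBase client's lookup code: the class and its methods are hypothetical, and the sketch assumes that the start keys are sorted, that the first region has an empty start key, and that keys compare byte by byte as unsigned values.

```java
import java.nio.charset.StandardCharsets;

public class RegionLookup {

    // Unsigned, lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Returns the index of the last start key that is <= row.
    // Assumes sorted start keys with an empty first entry.
    static int regionIndex(byte[][] startKeys, byte[] row) {
        int lo = 0;
        int hi = startKeys.length - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (compare(startKeys[mid], row) <= 0) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical layout with three regions: ["", "g"), ["g", "p"), ["p", ...)
        byte[][] startKeys = { new byte[0], bytes("g"), bytes("p") };
        System.out.println(regionIndex(startKeys, bytes("apple"))); // prints 0
        System.out.println(regionIndex(startKeys, bytes("g")));     // prints 1
        System.out.println(regionIndex(startKeys, bytes("zebra"))); // prints 2
    }
}
```

Keep in mind that the layout may change while your client is running, so any index computed this way is only valid for the keys you retrieved.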

The Bytes Class

You saw how this class was used to convert native Java types, such as String, or long, into the raw, byte array format HBase supports natively. There are a few more notes that are worth mentioning about the class and its functionality.

Most methods come in three variations, for example:

static long toLong(byte[] bytes)
static long toLong(byte[] bytes, int offset)
static long toLong(byte[] bytes, int offset, int length)

You hand in just a byte array, or an array and an offset, or an array, an offset, and a length value. The usage depends on the originating byte array you have. If it was created by toBytes() beforehand, you can safely use the first variant, and simply hand in the array and nothing else. All the array contains is the converted value.

The API, and HBase internally, also store data in larger arrays, using, for example, the following call:

static int putLong(byte[] bytes, int offset, long val)

This call allows you to write the long value into a given byte array, at a specific offset. If you want to access the data in that larger byte array you can make use of the latter two toLong() calls instead.
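
The idea of writing a value at an offset and reading it back from the same position can be sketched without HBase on the classpath. The following stand-in uses java.nio.ByteBuffer instead of the Bytes class, so it is not how Bytes is implemented. The class name is hypothetical, and returning the offset just past the written value is an assumption about what the int result in the signature above is for.

```java
import java.nio.ByteBuffer;

public class LongAtOffset {

    // Writes val as eight big-endian bytes starting at offset and
    // returns the offset just past the written value (an assumption).
    static int putLong(byte[] bytes, int offset, long val) {
        ByteBuffer.wrap(bytes, offset, Long.BYTES).putLong(val);
        return offset + Long.BYTES;
    }

    // Reads eight bytes starting at offset back into a long.
    static long toLong(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, Long.BYTES).getLong();
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[20];          // the "larger array"
        int next = putLong(buffer, 4, 42L);    // value occupies bytes 4 to 11
        System.out.println(next);              // prints 12
        System.out.println(toLong(buffer, 4)); // prints 42
    }
}
```

Reading with the wrong offset does not fail; it simply yields a different number, which is why you need to keep track of where each value was written.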

The Bytes class has support to convert from and to the following native Java types: String, boolean, short, int, long, double, and float. Apart from that, there are some noteworthy methods, which are listed in Table 3-10.

Table 3-10. Overview of additional methods provided by the Bytes class

Method	Description
`toStringBinary()`	While working very similarly to `toString()`, this variant has an extra safeguard to convert nonprintable data into their human-readable hexadecimal numbers. Whenever you are not sure what a byte array contains, you should use this method to print its content, for example, to the console, or into a logfile.
`compareTo()/equals()`	These methods allow you to compare two `byte[]`, that is, byte arrays. The former gives you a comparison result and the latter a boolean value, indicating whether the given arrays are equal to each other.
`add()/head()/tail()`	You can use these to add two byte arrays to each other, resulting in a new, concatenated array, or to get the first, or last, few bytes of the given byte array.
`binarySearch()`	This performs a binary search in the given array of values. It operates on byte arrays for the values and the key you are searching for.
`incrementBytes()`	This increments a `long` value in its byte array representation, as if you had used `toBytes(long)` to create it. You can decrement using a negative `amount` parameter.
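
To give an impression of what printing a byte array in a human-readable form means, here is a minimal sketch in plain Java. It is not the code behind toStringBinary(): the class and method names are hypothetical, and both the \xNN notation and the exact set of characters treated as printable are assumptions made for this illustration.

```java
public class PrintableBytes {

    // Keeps printable ASCII characters as they are and writes every
    // other byte as \xNN with two uppercase hexadecimal digits.
    static String toPrintable(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int ch = b & 0xff; // treat the byte as an unsigned value
            if (ch >= 0x20 && ch <= 0x7e && ch != '\\') {
                sb.append((char) ch);
            } else {
                sb.append(String.format("\\x%02X", ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] key = { 'r', 'o', 'w', 0x00, (byte) 0xff };
        System.out.println(toPrintable(key)); // prints row\x00\xFF
    }
}
```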

There is some overlap between the Bytes class and the Java-provided ByteBuffer. The difference is that the former performs all operations without creating new class instances. In a way this is an optimization: the provided methods are called many times within HBase, and avoiding extra instances sidesteps possibly costly garbage collection issues.

For the full documentation, please consult the JavaDoc-based API documentation.^[57]

^[49]The region servers use a multiversion concurrency control mechanism, implemented internally by the ReadWriteConsistencyControl (RWCC) class, to guarantee that readers can read without having to wait for writers. Writers do need to wait for other writers to complete, though, before they can continue.

^[50]Universally Unique Identifier; see http://en.wikipedia.org/wiki/Universally_unique_identifier for details.

^[51]See “Unix time” on Wikipedia.

^[52]See the API documentation for the KeyValue class for a complete description.

^[53]See “Remote procedure call” on Wikipedia.

^[54]For easier readability, the related details were broken up into groups using blank lines.

^[55]See “MVCC” on Wikipedia.

^[56]Scans are similar to nonscrollable cursors. You need to declare, open, fetch, and eventually close a database cursor. While scans do not need the declaration step, they are otherwise used in the same way. See “Cursors” on Wikipedia.

^[57]See the Bytes documentation online.
