Redis is known as an ultrafast data store because it can both serve reads and absorb new writes into its data set at high speed. Leaving persistence aside, there is not much performance difference between read and write operations in Redis. It is therefore important to know how to feed a large set of data into Redis in a short burst of time.
The write operations in Redis broadly fall into two types: writing or updating a few keys in response to a user action, and bulk importing a huge data set. The following Ruby code, using the redis-rb client, shows the first type, with two keys set and a counter incremented in a single pipelined batch (pipelining is explained shortly):

require "redis"

redis = Redis.new(:host => "127.0.0.1", :port => 6379)
redis.pipelined do
  redis.set "user", "user1"
  redis.set "userid", 1
  redis.incr "totallogin"
end
Writing and updating multiple keys in response to a user action is a common operation. When multiple keys are to be updated, the commands are sent to Redis sequentially. Let us see how the commands are sent and how this affects performance. Assume that, as a response to some user action, we need to write two new keys and increment a counter, so there are three commands in total.
In this case, normally we need to perform three different operations:
SET user User1
OK
SET userid 1
OK
INCR totalLogin
(integer) 14
Considering a network roundtrip of 100 ms from the client to the server, and ignoring the Redis execution time, total time to execute all three commands will be:
Total time = (request sent + response from server) + (request sent + response from server) + (request sent + response from server)
Total time = 100 ms + 100 ms + 100 ms (ignoring Redis's execution time)
Total time = 300 ms
So the total time increases proportionally with the number of commands executed. This is because Redis is a TCP server using a request/response protocol: the server and the client are connected through a network socket and pay the network latency on every round trip, even if the client runs on the same machine as the server. Suppose your Redis setup can process at least 50,000 requests per second; with a network latency of 100 ms as described above, a client sending commands sequentially can complete at most 10 requests per second, no matter how fast the Redis server works. As it is not possible to reduce the travel time between the server and the client, the solution is to reduce the number of trips made between them. In short, the fewer the round trips, the more requests Redis can process per second. Redis provides us with a solution for this problem: pipelining.
Let us take the same example and see how pipelining helps. Because all the commands are sent to the server in a single flush over the wire, there is only one round trip between the server and the client, so the total time is only a little more than 100 ms instead of 300 ms, a threefold improvement for these simple commands.
One thing to be aware of with pipelining is that the server is forced to queue the responses in memory. To prevent memory spikes, send a reasonable number of commands in each pipeline: for example, send a few hundred commands, read the responses, and then send the next batch in a new pipeline to keep memory usage in check. The performance will be almost the same.
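The batching just described can be sketched in Ruby. The helper name `import_in_batches` is hypothetical, and the sketch assumes a redis-rb client whose `pipelined` method yields a pipeline object:

```ruby
BATCH_SIZE = 500  # a few hundred commands per pipeline keeps memory in check

# Sketch: write a large set of key/value pairs through fixed-size
# pipelines, consuming each batch's replies before sending the next,
# so the server never queues too many responses at once.
def import_in_batches(redis, pairs, batch_size: BATCH_SIZE)
  pairs.each_slice(batch_size) do |batch|
    redis.pipelined do |pipe|
      batch.each { |key, value| pipe.set(key, value) }
    end # replies for this batch are read here, before the next batch
  end
end
```

With a live redis-rb connection, `import_in_batches(redis, big_hash.to_a)` would issue the writes in 500-command bursts.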
By reducing the number of roundtrips between the server and the clients, pipelining provides an efficient way of writing data into Redis at faster speeds.
There might be other situations where it is necessary to import millions of records into Redis in a very short span of time.
The second type of write imports millions of records in a short span of time. In this recipe, we will take a look at how to feed Redis with a huge amount of data as fast as possible. In such bulk imports, the data to be loaded is usually huge, involving millions of writes. Using a normal Redis client for the import is not a good idea, as sending the commands sequentially is slow and we pay the round-trip cost for every command. We could use pipelining, but a regular client would still have to build millions of commands in memory and read back all of their replies.
The most recommended way to mass-import a huge data set is to generate a text file with the commands in the Redis protocol format and use the file to import the data into Redis. The redis-cli interface provides a pipe mode to perform a bulk import from a raw file with commands as per Redis protocol specifications (http://redis.io/topics/protocol).
The protocol is simple and binary safe, and its format is as follows:
*<number of arguments> CR LF
$<number of bytes of argument 1> CR LF
<argument data> CR LF
...
$<number of bytes of argument N> CR LF
<argument data> CR LF

Here, CR is \r (ASCII character 13) and LF is \n (ASCII character 10).
For example, execute the following command:
SET samplekey testvalue
This command will look like the following in the raw file:
*3          - number of arguments: SET + key + value = 3
$3          - number of bytes in SET
SET
$9          - number of bytes in samplekey
samplekey
$9          - number of bytes in testvalue
testvalue
This translates to the following:
*3
$3
SET
$9
samplekey
$9
testvalue
Redis uses the same format for both its request and response. As the protocol itself is simple, a simple program can generate a text file with all the commands in raw format.
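Such a generator can be sketched in a few lines of Ruby; the function name `gen_redis_proto` is chosen here for illustration:

```ruby
# Build one command in the Redis protocol format described above.
# Every argument is forced to binary (.b) and measured with bytesize,
# which keeps the output binary safe.
def gen_redis_proto(*args)
  proto = "*#{args.length}\r\n".b
  args.each do |arg|
    arg = arg.to_s.b
    proto << "$#{arg.bytesize}\r\n" << arg << "\r\n"
  end
  proto
end

# For example, a file of one million SET commands could be written as:
# File.open("redis-data.txt", "wb") do |f|
#   1_000_000.times { |i| f.write(gen_redis_proto("SET", "key:#{i}", i)) }
# end
```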
Once the text file is generated, the data in it, say redis-data.txt with 1 million commands, can be imported into Redis using a single command, as follows:
cat redis-data.txt | redis-cli --pipe
After the execution, the output will look like the following:
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 1000000
The pipe mode not only sends the data to the server as fast as possible, but also reads and parses the server's replies as they become available. When it finds that there is no more data to send, it sends an ECHO command with a random 20-byte payload and then listens for the server's responses. When the server echoes those same 20 bytes back, redis-cli knows it has received the reply to the last command. Thanks to this trick, redis-cli does not need to know which commands, or how many, were sent to the server; by counting the replies, it can still give us a brief report on the status of the bulk import.
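The end-of-stream handshake can be sketched as follows; `end_marker_command` is a hypothetical helper that builds the terminating ECHO in the protocol format shown earlier:

```ruby
require "securerandom"

# Build the final ECHO command that pipe mode appends: a random
# 20-byte payload whose echo from the server marks the end of the
# reply stream. Returns the encoded command and the payload to match.
def end_marker_command
  payload = SecureRandom.bytes(20)  # binary-safe random marker
  header  = "*2\r\n$4\r\nECHO\r\n$#{payload.bytesize}\r\n".b
  [header + payload + "\r\n".b, payload]
end
```

The client would scan incoming replies for `payload`; once it appears, every earlier reply has been received and counted.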