Kafka and the sendfile operation

One of the traditional inefficiencies of messaging systems is excessive byte copying. At low message rates this is not an issue, but under load the impact is significant. To avoid it, Kafka employs a standardized binary message format that is shared by the producer, the broker, and the consumer. This is an important design consideration because data chunks can then be transferred between them without modification. The Kafka broker simply maintains a message log, which is essentially a directory of files. Each file contains a sequence of messages that have been written to disk in the same format used by the producer and the consumer.

Maintaining this common format allows optimization of the most important operation: network transfer of persistent log chunks.

To understand the impact of the sendfile operation, it helps to trace the path that data takes from a file on disk to a socket, where it is sent over the network:

  • As the first step, the OS reads data from the disk into the kernel's page cache
  • The application then reads the data from kernel space into a user-space buffer
  • The application writes the data back into kernel space into a socket buffer
  • The operating system copies the data from the socket buffer to the NIC buffer where it is sent over the network
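The four steps above can be sketched as an ordinary read-and-send loop. This is only an illustration in Python, not Kafka code: the function name copy_via_user_space and the chunk_size parameter are hypothetical, and the comments mark where the data crosses between kernel space and user space.

```python
import socket


def copy_via_user_space(path, sock, chunk_size=64 * 1024):
    """Send a file over a connected socket with an explicit read/send loop.

    Hypothetical helper for illustration. Returns the number of bytes sent.
    """
    total = 0
    # buffering=0 gives raw file I/O, so each read() maps to a read system call.
    with open(path, "rb", buffering=0) as f:
        while True:
            # Kernel page cache -> user-space buffer.
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # User-space buffer -> kernel socket buffer.
            sock.sendall(chunk)
            total += len(chunk)
    return total
```

Every chunk passes through a user-space buffer even though the application never inspects or modifies it.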

This is clearly inefficient, as there are four copies and two system calls. The sendfile system call avoids this recopying by allowing the OS to send the data from the page cache to the network directly. In this optimized path, only the final copy to the NIC buffer is needed.
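A minimal sketch of the optimized path, again in Python rather than Kafka's own code. Python's socket.sendfile uses os.sendfile where the platform provides it and falls back to a plain send loop otherwise; on the JVM, where Kafka runs, the analogous call is FileChannel.transferTo. The function name send_via_sendfile is hypothetical.

```python
import socket


def send_via_sendfile(path, sock):
    """Send a whole file over a connected stream socket.

    Hypothetical helper for illustration. Where os.sendfile is available,
    the kernel moves the data from the page cache to the socket without
    passing it through a user-space buffer. Returns the bytes sent.
    """
    with open(path, "rb") as f:
        # No read() into application memory: the file object is handed
        # to the socket, which delegates the transfer to the kernel.
        return sock.sendfile(f)
```

The application code never touches the file contents, which is possible only because the bytes on disk are already in the format the consumer expects.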

We expect a common use case to be multiple consumers on a topic. With the zero-copy optimization, data is copied into the page cache exactly once and reused on each consumption, instead of being copied out to user space every time it is read. This allows messages to be consumed at a rate that approaches the limit of the network connection.
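The multi-consumer case can be sketched as one log file served to several sockets, each at its own byte offset. This is an assumption-laden illustration, not Kafka's implementation: serve_chunk, segment_path, and the (sock, offset, count) tuples are hypothetical names, and the sockets are assumed to be connected, blocking stream sockets.

```python
import socket


def serve_chunk(segment_path, consumers):
    """Send byte ranges of one log file to several consumers.

    consumers is a list of (sock, offset, count) tuples (hypothetical
    shape). Returns the number of bytes sent to each consumer, in order.
    """
    sent = []
    with open(segment_path, "rb") as segment:
        for sock, offset, count in consumers:
            # Position the file explicitly so that the fallback path
            # (plain send) also starts at the requested offset.
            segment.seek(offset)
            # Each transfer reads the same cached file pages; nothing is
            # copied into a per-consumer application buffer here.
            sent.append(sock.sendfile(segment, offset=offset, count=count))
    return sent
```

Consumers that read overlapping ranges are served from the same pages in the page cache, so the file is read from disk at most once.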

This combination of page cache and sendfile means that on a Kafka cluster where the consumers are mostly caught up, one will see no read activity on the disks whatsoever, as the brokers will be serving data entirely from the cache.
