Framing

Framing is the first step performed when network packets are received, and the last one when network messages are sent. Whether this step is needed depends on the network protocol being used. Network protocols fall into two categories:

Datagram protocols
Stream protocols

Datagram protocols operate on datagrams, that is, messages. The datagram is the base unit sent on the network: a producer writes a datagram to the network, and a receiver receives this same datagram, or nothing if an error occurred. UDP is an example of a datagram protocol.

Stream protocols operate on bytes. The byte is the base unit sent on the network: a producer writes bytes to the network link, and a receiver receives these bytes, but not necessarily split in the same way. For example, a producer may send ten bytes in one write, and the receiver may first read five bytes, then three bytes, then the remaining two bytes. TCP is an example of a stream protocol.

Framing is the transformation of datagrams to bytes, and vice versa. This step is necessary because the upper layers operate on datagrams (messages), not bytes. When a datagram protocol is used, framing is not needed because the network can send the datagram directly. With a stream protocol, however, each datagram must be encoded (framed) before being sent as bytes, so that the receiver can decode (unframe) the received bytes back into the original datagram. The following figure shows the principle of framing:

Figure 12.13: Framing

On the application side, a dataframe must be sent on the network link. This dataframe is composed of four bytes: one byte containing the foo information, and three bytes containing the bar information. When such a dataframe is framed, a header is first sent on the network. This header (one byte in this example) gives information about the data that follows it. Then the dataframe payload is sent on the network link, possibly byte after byte. When such a dataframe is received, the first byte to arrive is the header. Then the four payload bytes are received, either all at once or in parts (up to four parts of one byte each). The header contains the information the receiver needs to know when the dataframe is complete.
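The example above can be sketched in a few lines of Python. This is only an illustration under one assumption: the one-byte header is taken to hold the payload length, which is one possible way for a header to say when the dataframe is complete. The names frame_one_byte and OneByteUnframer are hypothetical and chosen for this sketch.

```python
def frame_one_byte(payload: bytes) -> bytes:
    """Prefix the payload with a one-byte header holding its length (assumption)."""
    if len(payload) > 255:
        raise ValueError("payload does not fit a one-byte length header")
    return bytes([len(payload)]) + payload


class OneByteUnframer:
    """Rebuilds dataframes from received chunks of arbitrary size."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Store the received bytes and return every dataframe now complete."""
        self._buffer.extend(chunk)
        frames = []
        while self._buffer:
            length = self._buffer[0]  # the header is the first byte
            if len(self._buffer) < 1 + length:
                break  # payload not fully received yet
            frames.append(bytes(self._buffer[1:1 + length]))
            del self._buffer[:1 + length]
        return frames


# foo is one byte, bar is three bytes, as in the figure.
wire = frame_one_byte(b"\x01" + b"\x02\x03\x04")

# Worst case on a stream protocol: bytes arrive one at a time.
unframer = OneByteUnframer()
received = []
for i in range(len(wire)):
    received.extend(unframer.feed(wire[i:i + 1]))
```

The receiver returns nothing until the last payload byte has arrived, at which point the original four-byte dataframe is handed to the upper layer in one piece.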

There are many possible ways to implement framing. They vary in complexity depending on the guarantees provided by the transport layer. The transport protocols considered here provide strong guarantees (on data ordering and validity), so simple framing algorithms can be used. Two families of framing algorithms are widely used:

Line-based framing
Length-prefixed framing

Line-based framing is suited to simple situations where dataframes are composed of text. This is typically the case when dataframes are JSON strings. With line-based framing, a newline character is used to split or recompose dataframes: the sender simply writes a newline character on the network link at the end of each dataframe. The advantage of this framing is that it is very simple. The drawback is that it is limited to text data, and special care is needed if this data contains newlines.
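A minimal sketch of line-based framing in Python follows. The names frame_line and LineUnframer are hypothetical, and UTF-8 is assumed as the text encoding. The sketch relies on the fact that json.dumps, when called without an indent argument, escapes newlines inside strings and so produces single-line output.

```python
import json


def frame_line(text: str) -> bytes:
    """Encode one text dataframe and terminate it with a newline."""
    if "\n" in text:
        raise ValueError("dataframe must not contain a raw newline")
    return text.encode("utf-8") + b"\n"


class LineUnframer:
    """Rebuilds text dataframes from received chunks of arbitrary size."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk: bytes) -> list:
        """Store the received bytes and return every complete line."""
        self._buffer += chunk
        # Everything after the last newline is an incomplete dataframe.
        *lines, self._buffer = self._buffer.split(b"\n")
        return [line.decode("utf-8") for line in lines]


# Two JSON dataframes sent back to back; the second contains a newline,
# which json.dumps escapes so it does not break the framing.
wire = frame_line(json.dumps({"foo": 1})) + frame_line(json.dumps({"bar": "a\nb"}))

# The receiver gets the bytes split at an arbitrary position.
unframer = LineUnframer()
lines = unframer.feed(wire[:7]) + unframer.feed(wire[7:])
messages = [json.loads(line) for line in lines]
```

Decoding is done only on complete lines, so a multi-byte character split across two chunks does not cause a decoding error.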

Length-prefixed framing works with any kind of payload. With length-prefixed framing, a header is added before the dataframe. This header is a number giving the count of bytes in the dataframe that follows. The advantage of this framing is that it works with both text and binary data. However, it is slightly more complex than line-based framing, because the header must be written and read with a fixed endianness so that it works across all CPU architectures (little-endian and big-endian). Sometimes, a magic pattern is also added to the header to ensure that framing does not go out of sync in case of data loss on the network link.
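A minimal sketch of length-prefixed framing in Python follows. The header size (four bytes) and the names frame_with_length and unframe_all are assumptions made for this illustration, and no magic pattern is included. The struct format "!I" encodes an unsigned 32-bit integer in network (big-endian) byte order, whatever the endianness of the host CPU.

```python
import struct

# Assumption for this sketch: 4-byte unsigned length, network byte order.
HEADER = struct.Struct("!I")


def frame_with_length(payload: bytes) -> bytes:
    """Prefix the payload with its length."""
    return HEADER.pack(len(payload)) + payload


def unframe_all(buffer: bytes):
    """Return (complete dataframes, remaining bytes to keep for later)."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if len(buffer) < end:
            break  # payload not fully received yet
        frames.append(buffer[offset + HEADER.size:end])
        offset = end
    return frames, buffer[offset:]


# One text payload and one binary payload on the same link.
wire = frame_with_length(b"hello") + frame_with_length(b"\x00\xff")

# The last byte arrives late: only the first dataframe is complete.
first, rest = unframe_all(wire[:-1])
second, rest = unframe_all(rest + wire[-1:])
```

The caller keeps the returned remainder and prepends it to the next bytes read from the link, which is what makes the function usable with partial reads.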
