So far, we have looked at establishing a secure communications channel between an embedded device and a server. To build an IoT platform, we need two further things: a communication protocol to transfer data and an encoding format to provide a message schema. This chapter will look at the currently dominant communication protocol used by IoT devices: Message Queuing Telemetry Transport (MQTT), originally developed by IBM and now an ISO standard. MQTT has been widely adopted by commercial IoT services and has become one of the key protocols used by constrained IoT devices to transfer data to a cloud server and other IoT devices. We will also look at some commonly used data encoding formats that allow you to develop a flexible message schema.
IoT networking; Data formats; Message queued telemetry transport; Quality of service; PC broker; Client
We will begin with a broad overview of MQTT and then work through some hands-on examples to take a deeper look at how the protocol works. Developing network applications can be complicated because many different elements are involved: the embedded node, the communications protocol, and the remote server. It is not always clear why a system is not working or even where the fault lies. It can often feel like debugging squared. Through this chapter, we will build up the system in incremental steps as a way of testing that each new layer works. This helps to build confidence in the overall system and makes it easy to trap errors early. In the hands-on exercises, we will start with PC-based tools to send and receive unencrypted messages. We can then add an embedded node and check that it works correctly. To create a fully functional system, we can then add TLS encryption and use the PC client to test our X.509 certificates. This will ensure we have the network credentials correct before adding encryption to the embedded node. By the end of the chapter, we will have set up an IoT system using components on a local network. This can then be used as a platform for future experiments.
MQTT is a communications protocol originally developed by IBM that has now been adopted as an ISO/OASIS standard. Currently, MQTT is the most widely used and supported communications protocol for small IoT devices. MQTT is designed to be a light-weight protocol for small, constrained devices with a low software footprint. It is widely supported with free-to-use tools and software libraries for both PCs and embedded nodes. MQTT is royalty-free for use within commercial and noncommercial products.
The MQTT protocol is a client-server architecture. A typical system will consist of a cloud-based server, which is called a Broker (Fig. 6.1). Each IoT device is a client of the Broker and can send and receive messages to and from the Broker.
MQTT uses a publish and subscribe communications model. Each client can connect to the Broker and subscribe to a particular message known as a Topic. Any client may connect to the Broker and publish new data to any Topic. When the data within a Topic is updated, a message with the new data will be broadcast to any and all clients that are subscribed to that Topic. In this case, clients may be other IoT devices. Additionally, a typical Broker will also provide an HTTPS socket interface for services and REST-based applications. MQTT Brokers are designed to support several thousand IoT devices allowing you to build a large scale system very easily. Most commercial IoT services will support MQTT with regional servers to provide a global network infrastructure.
When an MQTT client connects to a Broker, it can publish a message Topic. The Broker does not need to be preconfigured with a list of available Topic messages. A new Topic will automatically be added to the Broker database when first published by a client (Fig. 6.2). A message Topic will also be created if any client subscribes to it and it is not currently in the Broker database.
A Topic consists of a message header followed by the message data. A Topic is a form of addressing that creates a message hierarchy similar to the folder structure within a file system. A typical Topic header is shown below:
It is also worth noting that adding a final forward slash creates a different Topic:
Each Topic header is a UTF-8 string that consists of a number of user-defined levels delimited by Topic separators in the form of a forward slash (/). Our message data will then follow the Topic header. The message data can be in any arbitrary format, but it is common to use a standard data interchange format such as JavaScript Object Notation (JSON) and, more recently, Concise Binary Object Representation (CBOR).
Any client can subscribe to any topic, whether a publishing node has created it or not. If you subscribe to a nonexisting Topic, the Broker will be forced to create it. When a client subscribes to a topic, it is possible to use a range of wildcards to receive groups of messages. The hash (#) character can be used to receive all messages below the topic level elements in the header.
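The wildcard behavior can be sketched in a few lines of C. This is a minimal illustration rather than the matching code used by any particular Broker. The function name topic_matches is hypothetical, and the single-level '+' wildcard comes from the MQTT specification rather than from the text above. The sketch assumes a well-formed filter and ignores special cases such as Topics beginning with '$'.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when `topic` matches the subscription
 * `filter`. '#' matches the remainder of the topic and '+' matches exactly
 * one level. Assumes the filter itself is well formed. */
bool topic_matches(const char *filter, const char *topic)
{
    while (*filter != '\0') {
        if (*filter == '#') {
            return true;                 /* multi-level wildcard: accept the rest */
        }
        if (*filter == '+') {
            while (*topic != '\0' && *topic != '/') {
                topic++;                 /* skip one whole topic level */
            }
            filter++;
            continue;
        }
        if (*filter != *topic) {
            /* "a/#" also matches the parent level "a" itself */
            return *topic == '\0' && filter[0] == '/' &&
                   filter[1] == '#' && filter[2] == '\0';
        }
        filter++;
        topic++;
    }
    return *topic == '\0';               /* both strings must end together */
}
```

With this sketch, the filter "MDK/sample/#" accepts "MDK/sample/a", while the filter "MDK/sample/a" rejects "MDK/sample/a/", which reflects the point that a trailing slash creates a different Topic.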
Each message Topic will also be assigned Quality of Service (QoS). The QoS defines the delivery effort between the client and the Broker. The MQTT protocol defines three QoS levels. Level zero relies on the network transport layer to make a best-effort attempt to deliver the message but does not guarantee that it will be delivered. If you select QoS level one, the MQTT protocol does guarantee delivery, but the message may be received more than once, while QoS level two guarantees that a message will be delivered exactly once with no duplicates (Table 6.1).
QoS level | Message delivery | Delivery semantics | Delivery guarantees |
---|---|---|---|
0 | ≤ 1 | At most once | Best effort. No guarantee |
1 | ≥ 1 | At least once | Guarantees delivery. Duplicates possible |
2 | = 1 | Exactly once | Guarantees delivery. No duplicates |
When you are defining message QoS Levels, it is important to realize that the QoS Level is defined by both the publishing client and the subscribing client (Fig. 6.3).
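Thus a client could publish a message at QoS Level 1, where the message may be delivered to the Broker with possible duplicates, while a separate client subscribes to the same Topic at QoS Level 2. The subscription QoS is the highest level the Broker will use for that subscriber: under MQTT 3.1.1, each message is forwarded at the lower of its published QoS and the subscribed QoS, so in this case the subscriber would receive the message at QoS Level 1.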
It is also possible for a client to publish a message topic to the Broker and define it as a “retained message.” This will cause the Broker to store the message along with its QoS. When a new client subscribes to the message topic or an existing subscribed client reconnects to the Broker, the stored message will be immediately sent to the subscribing client. This is very useful when communicating with low power nodes that may frequently connect to and disconnect from a network.
When a client connects to a server, it defines a "keep-alive period" in seconds. Then during normal operation, the client must send at least one message during the keep-alive period. If the node is subscriber-only, it must send a PINGREQ message and receive a PINGRESP back from the Broker. If the Broker does not receive any communication from a client in one and a half times the keep-alive period, it must disconnect the client. Similarly, the client must disconnect from the Broker if it does not receive any messages and the keep-alive period is exceeded.
When a client connects to a Broker it can publish a “last will and testament” message. This is a normal MQTT message that any other client can subscribe to. However, the published message is stored until the Broker determines that it has lost communication with the client. When this happens the last will and testament message will be sent to all the subscribed clients. The Broker will determine it has lost communication with the client if the keep-alive period is exceeded and it has not received a disconnect message.
The MQTT protocol is encapsulated within a network transport. This is normally a TCP message which provides for guaranteed delivery. Since the MQTT protocol is designed to support resource constrained devices, a relatively small number of methods support the connect/disconnect, publish, and subscribe model (Table 6.2).
MQTT message | 4 bit code | Description |
---|---|---|
CONNECT | 1 | Client request to connect to server |
CONNACK | 2 | Connect acknowledge |
PUBLISH | 3 | Publish message |
PUBACK | 4 | Publish acknowledge |
PUBREC | 5 | Publish received (Assured delivery part 1) |
PUBREL | 6 | Publish release (Assured delivery part 2) |
PUBCOMP | 7 | Publish complete |
SUBSCRIBE | 8 | Client subscribe request |
SUBACK | 9 | Subscribe acknowledge |
UNSUBSCRIBE | 10 | Client unsubscribe request |
UNSUBACK | 11 | Unsubscribe acknowledge |
PINGREQ | 12 | PING request |
PINGRESP | 13 | PING response |
DISCONNECT | 14 | Client disconnect |
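As a rough sketch of how the 4-bit code is used on the wire: in MQTT 3.1.1 the first byte of every packet carries the message type in its upper four bits and a set of flags in the lower four, and for a PUBLISH packet the QoS occupies bits 2 and 1. The enum and function names below are illustrative only and are not taken from the Paho client.

```c
#include <stdint.h>

/* Control packet types; the values are the 4-bit codes from Table 6.2. */
typedef enum {
    MQTT_CONNECT = 1,
    MQTT_CONNACK,
    MQTT_PUBLISH,
    MQTT_PUBACK,
    MQTT_PUBREC,
    MQTT_PUBREL,
    MQTT_PUBCOMP,
    MQTT_SUBSCRIBE,
    MQTT_SUBACK,
    MQTT_UNSUBSCRIBE,
    MQTT_UNSUBACK,
    MQTT_PINGREQ,
    MQTT_PINGRESP,
    MQTT_DISCONNECT
} mqtt_packet_type_t;

/* The packet type sits in bits 7..4 of the first byte of the packet. */
static inline mqtt_packet_type_t mqtt_packet_type(uint8_t first_byte)
{
    return (mqtt_packet_type_t)(first_byte >> 4);
}

/* For a PUBLISH packet, the QoS level is held in bits 2..1 of the same byte. */
static inline unsigned mqtt_publish_qos(uint8_t first_byte)
{
    return (first_byte >> 1) & 0x03u;
}
```

For example, a first byte of 0x32 decodes as a PUBLISH packet sent at QoS level 1.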
In this section, we will install an MQTT Broker on a development PC along with a test client. We can then check both the Broker and client's operation before we move to design an embedded node. This also allows us to generate and test X.509 certificates before using them with an IoT device. We will use the popular Eclipse Mosquitto Broker and a widely used client called MQTT.fx for these examples.
Download both applications from the links in Table 6.3.
Application | URL | Description |
---|---|---|
Mosquitto | https://mosquitto.org/download/ | MQTT broker |
MQTT.fx | http://mqttfx.jensd.de/index.php/download | MQTT PC client |
This tutorial assumes you have installed the Mosquitto Broker as a Windows service by following the default installation settings. Depending on the version of the Mosquitto server, you may also need to install OpenSSL to provide the supporting DLL files.
This exercise contains an embedded MQTT project that we will use in the next section along with a directory called utilities, which contains some X.509 certificates and a configuration file for the Mosquitto broker.
You must use the c:\certs directory or change the paths in the Mosquitto configuration file.
If the Mosquitto Broker does not restart, the most likely cause is an error in the configuration file, probably a failure to load the certificates (Fig. 6.5).
This configuration file will enable three ports on the server, which we can use for testing (Table 6.4).
Port | Service |
---|---|
1883 | MQTT unencrypted communication |
8883 | MQTT with TLS protocol. Broker authentication |
8884 | MQTT with TLS protocol. Broker and device authentication |
Once the Mosquitto Broker is running with the new configuration file, we can connect to it with the MQTT.fx client.
This will set the broker address as the local host (127.0.0.1) and the plain text port (1883).
The settings button (2) will allow you to define your own profile.
This will connect you to the Broker. When this happens, the gray circle (3) will turn green to show the connection and an open padlock will show that the connection is unencrypted.
Here, I have used test/var/param1, but you could use anything. Just avoid a slash at the start or end of the topic.
This will publish the message to the Broker where it will be echoed to any client which has subscribed to the same Topic (Fig. 6.10).
Now that we have a working broker and client, we can add an MQTT client to our WiFi example and connect to the local broker. The Keil pack system provides a set of cloud connectors for commercial IoT cloud systems, including a plain MQTT broker.
The MQTT Paho client is part of the Eclipse project and is designed to integrate with any network stack that provides a socket-type interface.
In this example, we will connect an embedded client to our local Mosquitto Broker running on your PC and then to a cloud server. For the local connection, you will need to create a rule or exception within your PC firewall to allow a remote client to connect.
This is a multiproject workspace. The exercise project is our WiFi example and the second project is a working solution.
Now add the Paho MQTT client using the µVision RTE.
Once the device is connected to the WiFi network, it will invoke a function called MQTTEcho_Test(). This code is provided as a standard template.
As its name implies, this adds example code for a simple echo test similar to the one we just performed with the PC client.
The address mqtt.eclipse.org is the address of a public test server.
Once we have tested a local connection to the PC, we can set the server address to use the eclipse server and test against a public cloud server:
#define SERVER_PORT 1883

MQTTClient client;
Network network;
MQTTPacket_connectData connectData = MQTTPacket_connectData_initializer;

NetworkInit(&network);
MQTTClientInit(&client, &network, 30000, sendbuf, sizeof(sendbuf), readbuf, sizeof(readbuf));
NetworkConnect(&network, SERVER_NAME, SERVER_PORT);
connectData.MQTTVersion = 3;
connectData.clientID.cstring = "MDK_sample";
MQTTConnect(&client, &connectData);
Once a WiFi connection is established, the echo code will create a network and client instance and then connect to the Broker using the unencrypted port 1883.
Once connected, it will subscribe to a collection of topics using the # wildcard.
The subscription function defines a callback function void messageArrived (MessageData* data) which will be triggered when a matching message is received:
MQTTSubscribe(&client, "MDK/sample/#", QOS2, messageArrived);
The messageArrived callback function is passed a pointer to a message data structure:
This in turn contains structures for the message topic and the message payload:
typedef struct MQTTMessage
{
    enum QoS qos;
    unsigned char retained;
    unsigned char dup;
    unsigned short id;
    void *payload;
    size_t payloadlen;
} MQTTMessage;

typedef struct
{
    char* cstring;
    MQTTLenString lenstring;
} MQTTString;
Once we have subscribed to the message topics, we can create a message payload:
MQTTMessage message;
char payload[30];

/* payload must hold a NUL-terminated string before strlen() is called */
message.qos = QOS1;
message.retained = 0;
message.payload = payload;
message.payloadlen = strlen(payload);
And then publish the message as the Topic “MDK/sample/a”
MQTTPublish(&client, "MDK/sample/a", &message);
The example code publishes ten messages and then disconnects. We can provide a delay between each message by using a dedicated delay function provided by the MQTT client function:
Unlike an osDelay function, this will suspend the thread for 1000 ms while still allowing MQTT messages to be received.
When all of the messages have been published, we will disconnect from the Broker and network:
You should see the arrival of each message in the debugger and the MQTT.fx client (Fig. 6.13).
If not, check your PC firewall settings.
When we publish a message, it is possible to set the retained option. This will keep the last topic payload within the Broker. When a client subscribes to the topic, the retained payload will immediately be sent.
You will be immediately sent the last message published to this Topic (Fig. 6.14).
When we established a connection to the Broker the connection parameters were passed in the variable connectData which was initialized with its default settings.
MQTTPacket_connectData connectData = MQTTPacket_connectData_initializer;
MQTTPacket_connectData is a structure that contains the elements shown in Table 6.5.
Variable | Type | Description |
---|---|---|
struct_id[4] | char | Must be set to M,Q,T,C |
struct_version | int | Version of this structure. Must be “0” |
MQTTVersion | unsigned char | Version of MQTT to be used. 3 = 3.1, 4 = 3.1.1 |
clientID | MQTTString | A custom ASCII string for your device |
keepAliveInterval | unsigned short | Keep alive interval in seconds |
cleansession | unsigned char | A nonpersistent connection where no subscriptions are stored |
willFlag | unsigned char | Last will and testament available |
will | MQTTPacket_willOptions | Last will and testament message |
username | MQTTString | Broker user name if required |
password | MQTTString | Broker password if required |
The connection object allows us to set a keep-alive interval in seconds. The client must send a message to the Broker within this period; if the Broker receives nothing for one and a half times the period, it will disconnect the client. If the client does not have a valid message to send, it can send a PINGREQ message to the Broker. In this case, the client should monitor the Broker's PINGRESP reply. If this does not arrive in a timely fashion, the client should disconnect from the network. This mechanism can be disabled by setting the keep-alive interval to zero. If the keep-alive interval is enabled, it can be used in conjunction with the client's last will and testament.
When we connect to a server it is possible to define the Last Will and Testament (LWT) of a client device.
This message will be sent to all clients that are subscribed to the LWT topic if the publishing client does not disconnect gracefully. For example, it may run out of battery power and shutdown without performing a network disconnect. This will leave the Broker connection open, but the keep-alive will fail. If this or a similar failure occurs, the Broker will send the LWT message.
We can define an LWT message in the connection object. The willFlag value should be set to “1,” and the message can be defined in the “will” object (Table 6.6).
Variable | Type | Description |
---|---|---|
struct_id | char | Eyecatcher string "MQTW" |
struct_version | int | Structure version |
topicName | MQTTString | Topic to use for LWT message |
message | MQTTString | LWT message data |
retained | unsigned char | Set to “1” if message is retained |
qos | unsigned char | Set required QoS |
Enter the following code on line 12 of mqtt_echo.c after the definition of connectData.
connectData.keepAliveInterval = 10;
connectData.willFlag = 1;
MQTTString topic = MQTTString_initializer;
topic.cstring = "MDK/LWT";
connectData.will.topicName = topic;
MQTTString message = MQTTString_initializer;
message.cstring = "Bye";
connectData.will.message = message;
The client will connect to the Broker and define the LWT and a keep-alive interval of ten seconds. It will then subscribe and publish a message as before. When it hits the breakpoint, no further messages will be sent. When one and a half times the keep-alive interval (fifteen seconds) has passed, the LWT will be sent to the subscribed client.
This time the code runs to completion and the LWT message is sent when the networkDisconnect function is called. We have disconnected from the network without disconnecting from the Broker.
This time the embedded code will run to completion without sending the LWT.
In this example, we have added an MQTT client, connected to a Broker and been able to publish and subscribe to MQTT messages. We have also explored the keep-alive interval and the LWT message. However, all of this communication has been in plain text. In the next section, we will look at how to add the TLS protocol.
We are going to add TLS security in two stages. First, we will set up a system that authenticates the server but lets any client connect. Next, we will set up TLS security to authenticate both the server and the client.
In the Mosquitto program directory, locate and open the mosquitto.conf file.
The configuration file allows you to define multiple active MQTT ports. Each port is defined as a listener.
Our first encrypted port is defined in the Broker configuration file as port 8883, and this is the default setting in the mosquitto config file.
In the security section we define the CA certificate, the Broker certificate and the Broker private key.
We can first test that the certificates are set up correctly with the MQTT.fx client.
This should now negotiate a secure connection to the Broker.
You can further test the connection by subscribing and publishing Topics as before.
Now that we have established that the secure port on the Broker is working and the CA certificate is correct, we can add the mbedTLS library to the embedded client and establish a secure embedded connection.
In the project directory there is a preconfigured version of the mbedTLS_Conf.h file.
This is set up with the necessary options for the TLS protocol.
The mqtt_echo.c template code is designed to support a secure connection if the following define is declared at a project level.
The define will cause the embedded client to connect to the encrypted port with a secure version of the network connect function:
Here, we need to provide the Certificate Authority certificate. The default is an array called tlscert[].
A template header file is provided for you to add the Broker CA cert. In this example, a preconfigured version is located in the project directory:
The Broker CA certificate is held in a const char array and formatted as shown below:
static const char CA_Cert[] =
"-----BEGIN CERTIFICATE-----\n"
"MIIElDCCA3ygAwIBAgIBATANBgkqhkiG9w0BAQsFADCBkjELMAkGA1UEBhMCVUsx"
……………
"1Do2MazGz + kKBHKdOYtsWa0xVhLOkOjS\n"
"-----END CERTIFICATE-----\n";
The embedded client will now connect to the encrypted TLS port and send the same set of MQTT messages, but this time they will be sent over an encrypted channel.
So far, we have authenticated the server while allowing any client to connect. The Broker has an additional listening channel on port 8884. This port has the same security options as 8883, but we have enabled the additional option “require_certificate true.” The security settings in the Mosquitto configuration file are as follows:
cafile C:\certs\iot_ca.crt
certfile C:\certs\iot_broker.crt
keyfile C:\certs\iot_broker.pem
require_certificate true
This will force the Broker to authenticate the client. As before, we can test the certificates in MQTT.fx by configuring a new profile for client authentication:
This will now establish a secure connection which authenticates both the client and Broker.
The certificate.h is preconfigured with the client certificate and key.
These are defined in two additional arrays:
We first need to switch the port to 8884 so that we connect to the mutual authentication port:
Line 14: #define SERVER_PORT 8884
Line 30: TLScert tlscert = {(char *)CA_Cert, (char *)ClientCert, (char *)ClientKey};
This will now establish a mutually authenticated secure connection with the Broker and allow us to transfer data between the IoT device and the Broker.
So far, we have used ASCII strings as the payload within our MQTT message. In a practical system, we will need to exchange more complex data between IoT devices and the Broker. This is often done by creating a protocol schema that defines the content and representation of the payload data to be transferred. Both ends of the communication pipe have to agree on and implement this protocol. Any changes or extensions to the message must be carefully managed to prevent breaking the schema. However, there are a couple of widely used data interchange formats that make this process much easier and have parser components that are available for a wide range of platforms and programming languages. The two formats that we will look at in this section are JavaScript Object Notation (JSON) and Concise Binary Object Representation (CBOR).
Put simply, JSON is a syntax for storing and exchanging data between different computing systems. There are three big advantages to using JSON. First, it is easy to adopt and use. Second, JSON is text-based, making it human-readable and intuitive to understand. Third, it is schema-less, in that the parser will format the message as a set of labels and data which the receiver can interpret without a detailed knowledge of the message structure. This allows you to develop reliable communication packets where the content is evolving over time. The JSON format has achieved widespread adoption, particularly in IoT networks and system configuration files.
A JSON object is an ASCII string enclosed by curly braces that contains a set of key/value pairs. A key is a label for the data that follows it and is encoded as an ASCII string enclosed by double quotes. Various data types can be used as values, and we will look at these below. For example, a string enclosed in double quotes is a valid value:
{"key_0": "value_0"}
We can add other key value pairs by using a comma (,) as a delimiter.
The following data types are available within a JSON Object (Table 6.7).
Type | Description | Example |
---|---|---|
String | Unicode characters enclosed by double quotes (") | "location" : "office", |
Number | Double precision floating point in JavaScript format | "number" : 210.3, |
Boolean | true or false | "alarm" : true , |
Null | No value to assign | "zone" : null, |
Object | An unordered set of name value pairs | { "temperature" : 20, "humidity" : 40 } |
Array | An ordered collection of values | "record" : ["1","2","3"], |
This example will encode and decode some sample data into a JSON string and send this as the payload of our MQTT message. As we are subscribed to the same message, we can receive and decode the JSON data back to usable variables.
cJSON is a lightweight JSON parser that can be used to encode and decode JSON objects.
This module contains functions to serialize and deserialize some example application data.
The serialize function first creates some dummy data values for typical process values of temperature, pressure and humidity. We then create the JSON object and JSON variables for each data value. The cJSON type is a structure with elements to store each JSON type and a linked list to chain successive elements together. We also create a char pointer, which is used to store the resulting JSON serialized string:
char *serialize()
{
    int temperature = 23, pressure = 1000, humidity = 40;
    cJSON *Jobject, *Jtemperature, *Jpressure, *Jhumidity;
    static char *J_string = NULL;
    .................
Next we can create the JSON object then create and initialize each JSON value:
Jobject = cJSON_CreateObject();
Jtemperature = cJSON_CreateNumber(temperature);
Jpressure = cJSON_CreateNumber(pressure);
Jhumidity = cJSON_CreateNumber(humidity);
Then we can assemble the JSON object and serialize it to the string:
cJSON_AddItemToObject(Jobject, "temperature", Jtemperature);
cJSON_AddItemToObject(Jobject, "pressure", Jpressure);
cJSON_AddItemToObject(Jobject, "humidity", Jhumidity);
J_string = cJSON_PrintUnformatted(Jobject);
cJSON_Delete(Jobject);
The function returns the formatted string, which is used as the payload for the MQTT message.
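To make the wire format concrete, the sketch below builds the same kind of compact payload by hand with snprintf instead of cJSON. It is only an illustration of what the serialized string looks like for the dummy values; the function name format_payload is hypothetical, and a real project would keep using the cJSON calls shown above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a compact JSON object holding the three
 * process values into buf. Returns the string length on success, or -1
 * if the buffer is too small or formatting fails. */
int format_payload(char *buf, size_t len,
                   int temperature, int pressure, int humidity)
{
    int n = snprintf(buf, len,
                     "{\"temperature\":%d,\"pressure\":%d,\"humidity\":%d}",
                     temperature, pressure, humidity);

    if (n < 0 || (size_t)n >= len) {
        return -1;                       /* output error or truncated */
    }
    return n;
}
```

Called with the values 23, 1000, and 40, this fills the buffer with {"temperature":23,"pressure":1000,"humidity":40}. Hand-built strings like this become fragile as soon as values need escaping or nesting, which is the main reason to use a parser library.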
We can now deserialize the string back to usable data.
In the deserialize function, we create a JSON object and then parse the received string to a JSON object.
int deserialize(const char * const Jstring)
{
    const cJSON *jnum = NULL;
    cJSON *Jobject = cJSON_Parse(Jstring);
We can then extract the value for each element into a cJSON number:
jnum = cJSON_GetObjectItemCaseSensitive(Jobject, "temperature");
We can then test for the type of data and if a value is present. If all is good we can extract the value into a variable:
JSON also allows us to declare arrays of JSON types. An array obeys the same rules as a key value pair while the value is a collection of comma delimited JSON types enclosed by square brackets:
{"locations":[ "office", "Meeting_1", "Kitchen" ]}
We can extend our serialize() function to add an array.
This function contains code to create and populate a JSON array:
Next we can create some JSON values:
cJSON *location_0 = cJSON_CreateString("Office");
cJSON *location_1 = cJSON_CreateString("Meeting_1");
cJSON *location_2 = cJSON_CreateString("Kitchen");
and then add them to the array:
cJSON_AddItemToArray(Jarray, location_0);
cJSON_AddItemToArray(Jarray, location_1);
cJSON_AddItemToArray(Jarray, location_2);
In the deserialize function, we can create cJSON variables for the JSON array and the locations:
We can now read the JSON array within the JSON object and then loop through the stored values and extract the location strings:
/* The key lookup is case sensitive, so it must match the "locations" key in the JSON object */
Jarray = cJSON_GetObjectItemCaseSensitive(Jobject, "locations");
for (index = 0; index < 3; index++)
{
    location[index] = cJSON_GetArrayItem(Jarray, index);
    /* Only print items that are strings and actually hold a value */
    if (cJSON_IsString(location[index]) && (location[index]->valuestring != NULL))
    {
        printf("Location - %s\n", location[index]->valuestring);
    }
}
You can observe the received message in MQTT.fx and the decoded values in the debugger console window.
It is also possible to have a JSON object within a JSON object. This follows the same rules as a key-value pair, but the value is a self-contained JSON object. While this may look a bit like an array, we are storing key: value pairs. This allows us to search within a secondary JSON object without having to know much about its actual structure. In a real-world system, this becomes a powerful technique.
{"office":{ "temperature":23, "pressure":1000, "humidity":40}}
We create our JSON object as before and then add a further JSON object:
We can then populate the nested object with key/value pairs:
cJSON_AddItemToObject(Jsub_object, "temperature", Jtemperature);
cJSON_AddItemToObject(Jsub_object, "pressure", Jpressure);
cJSON_AddItemToObject(Jsub_object, "humidity", Jhumidity);
If we receive a nested object, it is possible to search for the sub object and then extract values from it:
Jsub_object = cJSON_GetObjectItemCaseSensitive(Jobject, "office");
cJSON *Jtemp = cJSON_GetObjectItemCaseSensitive(Jsub_object, "temperature");
/* cJSON_IsNumber() is false for a missing item, so no further check is needed */
if (cJSON_IsNumber(Jtemp))
{
    printf("Office Temperature - %i\n", Jtemp->valueint);
}
Rebuild and rerun the code.
You can observe the received message in MQTT.fx and the decoded values in the debugger console window.
While JSON is widely used to encode IoT data, it is designed to transport values encoded as ASCII strings. In many cases, we want to transport large amounts of binary data such as encryption keys and even firmware updates. The only way to do this with JSON is to Base64 encode the data. This significantly increases the complexity and bulk of our data packets. To efficiently transfer data in a binary encoded form, we can use a different encoding format called Concise Binary Object Representation (CBOR). While there are other encoding schemes available such as ASN.1, BSON, and MessagePack, CBOR is becoming more widely used for IoT because a typical encoder/decoder requires minimal resources and will run on very limited devices. CBOR is also based on the JSON data model and, like JSON, allows you to adapt and extend the message schema without risking breaking the whole communication scheme.
There are a number of open-source CBOR encoders available. The one currently used as part of the Platform Security Architecture firmware is QCBOR. This implementation is optimized for speed and code size.
The CBOR data format describes how to encode different data types as a serialized byte string. Each data item is preceded by a header byte which defines the encoded data type (Fig. 6.20).
The header byte is split into two fields: the top three bits are the major type field, while the lower five bits are the additional information field.
The major type field describes the data item following the header byte. The current CBOR standard defines seven data types which are encoded as follows (Table 6.8).
CBOR major type | Encoded value |
---|---|
0 | Unsigned integer |
1 | Negative integer |
2 | Byte string |
3 | Text string |
4 | Array of data items |
5 | Map of key/value pairs |
6 | Semantic tag |
7 | Floating point and simple values |
Each major type uses the additional information field to fully describe the data item, while the data value is held in the adjacent bytes. Table 6.9 shows how the field is interpreted for major type 7.
Additional data value | Description |
---|---|
0–23 | Simple value (value 0..23) |
24 | Simple value (value 32..255 in following byte) |
25 | IEEE 754 Half-Precision Float (16 bits follow) |
26 | IEEE 754 Single-Precision Float (32 bits follow) |
27 | IEEE 754 Double-Precision Float (64 bits follow) |
28–30 | Unassigned |
31 | "break" stop code for indefinite-length items |
To encode an unsigned integer, the major type field is set to zero. If the integer value is in the range 0–23, it is encoded directly in the additional information field. If the value is greater than 23, the additional information values 24–27 specify the number of bytes (1, 2, 4 or 8) following the header byte that hold the unsigned integer value, most significant byte first.
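The rule above can be sketched as a small hand-written encoder. This is a minimal illustration and not how QCBOR is implemented internally; the names `cbor_put_head` and `cbor_encode_uint` are hypothetical, and the caller is assumed to supply a buffer of at least nine bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a QCBOR function): write a header for the given
 * major type and value using the shortest form. The buffer must have room
 * for nine bytes. Returns the number of bytes written. */
static size_t cbor_put_head(uint8_t *out, uint8_t major, uint64_t value)
{
    uint8_t mt = (uint8_t)(major << 5);
    uint8_t info;
    size_t extra;                       /* bytes that follow the header */

    if (value <= 23) {
        out[0] = (uint8_t)(mt | value); /* value held in the header itself */
        return 1;
    } else if (value <= 0xFF) {
        info = 24; extra = 1;
    } else if (value <= 0xFFFF) {
        info = 25; extra = 2;
    } else if (value <= 0xFFFFFFFF) {
        info = 26; extra = 4;
    } else {
        info = 27; extra = 8;
    }

    out[0] = (uint8_t)(mt | info);
    for (size_t i = 0; i < extra; i++) {
        /* most significant byte first */
        out[1 + i] = (uint8_t)(value >> (8 * (extra - 1 - i)));
    }
    return 1 + extra;
}

/* An unsigned integer is simply a header with major type 0. */
static size_t cbor_encode_uint(uint8_t *out, uint64_t value)
{
    return cbor_put_head(out, 0, value);
}
```

With this sketch, a value of 10 occupies a single byte, 100 needs two bytes (0x18 0x64) and 1000 needs three (0x19 0x03 0xE8).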
This file contains examples of how to encode/decode the CBOR major types.
We can use QCBOR to encode three unsigned integers as shown below:
Next create a context for this CBOR serialization and a structure which contains a pointer to the serialized object and its length:
Now create some input data items and a buffer to hold the final serialized string:
uint8_t data_A = 0x10, data_B = 0x55, pBuf[300];
uint16_t data_C = 0xAA55;
In the code, we must first initialize the encoder with the context and output buffer:
QCBOREncode_Init(&EC, UsefulBuf_FROM_BYTE_ARRAY(pBuf));
Now we can encode each unsigned integer. Although the encoder function takes a uint64_t argument, it produces the shortest encoding for the value passed:
QCBOREncode_AddUInt64(&EC, data_A);
QCBOREncode_AddUInt64(&EC, data_B);
QCBOREncode_AddUInt64(&EC, data_C);
Finally, the encoder can be exited. The Encoded structure contains a pointer to the serialized pBuf array and the length of the CBOR string:
QCBOREncode_Finish(&EC, &Encoded);
This will generate the serialized string shown in Table 6.10.
Encoding | Description |
---|---|
0x10 | Header uint8_t directly encoded |
0x18 | Header uint8_t with one byte following |
0x55 | Data |
0x19 | Header uint16_t with two bytes following |
0xAA | Data |
0x55 | Data |
The CBOR string can now be decoded.
First create our data items and the CBOR context:
Then create a CBOR item which is used to hold the initial decoded value and initialize the decoder with the previously encoded byte string:
We can now read each item in turn:
QCBORDecode_GetNext(&DC, &Item);
Once an item has been read, we can query the data type and then read the data value:
if (Item.uDataType == QCBOR_TYPE_UINT64) {
    decode_A = Item.val.uint64;
}
QCBORDecode_GetNext(&DC, &Item);
if (Item.uDataType == QCBOR_TYPE_UINT64) {
    decode_B = Item.val.uint64;
}
QCBORDecode_GetNext(&DC, &Item);
if (Item.uDataType == QCBOR_TYPE_UINT64) {
    decode_C = Item.val.uint64;
}
Once the full string has been parsed, we can exit the decoder and release the context:
It is also possible to encode signed integers using a mix of major type 0 and 1 items:
int8_t data_A = -0x10, data_B = 0x55;
int32_t data_C = -0xAA55; /* -43605 does not fit in an int16_t */
QCBOREncode_AddInt64(&EC, data_A);
QCBOREncode_AddInt64(&EC, data_B);
QCBOREncode_AddInt64(&EC, data_C);
This encodes to the bytes shown in Table 6.11.
Byte value | Decoding |
---|---|
0x2F | Major type one, value encoded directly (-1 - 15 = -16) |
0x18 | Major type zero with one byte to follow |
0x55 | Data |
0x39 | Major type one with two bytes to follow |
0xAA | Data |
0x54 | Data |
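A negative integer is stored under major type 1 as the value -1 - n, where n is the unsigned number held in the item. The decoder below is a minimal sketch of that rule and is not the QCBOR implementation; `cbor_decode_int` is a hypothetical name, and for brevity it only handles items of up to four following bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a QCBOR function): decode one integer item of
 * major type 0 or 1. Returns the number of bytes consumed, or 0 if the
 * item is not an integer, is truncated, or uses a form not handled here. */
static size_t cbor_decode_int(const uint8_t *in, size_t len, int64_t *value)
{
    if (len == 0) {
        return 0;
    }

    uint8_t major = (uint8_t)(in[0] >> 5);
    uint8_t info = (uint8_t)(in[0] & 0x1F);
    uint64_t arg = 0;
    size_t extra;

    if (major > 1) {
        return 0;                       /* not an integer item */
    }
    if (info <= 23) {
        extra = 0;
        arg = info;                     /* value held in the header */
    } else if (info == 24) {
        extra = 1;
    } else if (info == 25) {
        extra = 2;
    } else if (info == 26) {
        extra = 4;
    } else {
        return 0;                       /* 64-bit form omitted for brevity */
    }

    if (len < 1 + extra) {
        return 0;                       /* truncated input */
    }
    for (size_t i = 0; i < extra; i++) {
        arg = (arg << 8) | in[1 + i];   /* most significant byte first */
    }

    *value = (major == 0) ? (int64_t)arg : -1 - (int64_t)arg;
    return 1 + extra;
}
```

Applied to the six bytes 0x2F 0x18 0x55 0x39 0xAA 0x54, three successive calls yield -16, 85 and -43605 (that is, -0xAA55).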
We can decode the serialized data the same way, but this time, we can test for a signed value and read the signed integer tag:
It is also possible to store text strings, byte strings and arrays in a similar fashion:
QCBOREncode_OpenArray(&EC);
QCBOREncode_AddUInt64(&EC, 451);
QCBOREncode_AddUInt64(&EC, 331);
QCBOREncode_CloseArray(&EC);
This encodes as the bytes of each item preceded by an initial header, which has the major type set to array and holds the number of elements in the additional information field (Table 6.12).
Encoding | Description |
---|---|
0x82 | Header declares an array with two items |
0x19 | Header for first item |
0x01 | Data |
0xC3 | Data |
0x19 | Header for second item |
0x01 | Data |
0x4B | Data |
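To make the layout concrete, the sketch below builds a definite-length array by hand. It is an illustration only, not QCBOR code; `cbor_put_u16_array` is a hypothetical name, and the sketch assumes at most 23 elements so that the count fits in the header byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a QCBOR function): encode an array of unsigned
 * 16-bit values. The header holds the number of elements, not the number
 * of bytes. Returns the bytes written, or 0 if count is above 23. */
static size_t cbor_put_u16_array(uint8_t *out, const uint16_t *values,
                                 size_t count)
{
    size_t n = 0;

    if (count > 23) {
        return 0;
    }
    out[n++] = (uint8_t)(0x80 | count);     /* major type 4 + element count */

    for (size_t i = 0; i < count; i++) {
        uint16_t v = values[i];
        if (v <= 23) {
            out[n++] = (uint8_t)v;          /* direct encoding */
        } else if (v <= 0xFF) {
            out[n++] = 0x18;                /* one byte follows */
            out[n++] = (uint8_t)v;
        } else {
            out[n++] = 0x19;                /* two bytes follow */
            out[n++] = (uint8_t)(v >> 8);
            out[n++] = (uint8_t)(v & 0xFF);
        }
    }
    return n;
}
```

Passing the values 451 and 331 produces the seven bytes 0x82 0x19 0x01 0xC3 0x19 0x01 0x4B.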
We can also encode text and byte strings:
QCBOREncode_Init(&EC, UsefulBuf_FROM_BYTE_ARRAY(nBuf));
QCBOREncode_AddText(&EC, UsefulBuf_FROM_SZ_LITERAL("bar bar foo bar"));
QCBOREncode_AddSZString(&EC, "oof ");
When decoded, the item structure provides a pointer to the string data and its length:
CBOR is also able to store strings of data and text. For fixed-length strings, the length of the string is stored as an integer at the beginning of the string using the Type 0 integer rules (Table 6.13).
CBOR value | Type | Additional information | Description |
---|---|---|---|
0x83 | Type 4 | 3 | Array of three elements |
0x19 0x01 0x2C | Type 0 | 25 (two bytes follow) | uint16_t = 300 |
0x18 0x64 | Type 0 | 24 (one byte follows) | uint8_t = 100 |
0x15 | Type 0 | 21 (direct encoding) | uint8_t = 21 |
Two types of string are defined: a byte string (Type 2), which contains binary data, and a text string (Type 3), which holds human-readable text encoded as UTF-8.
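The length prefix for both string types can be sketched as follows. This is an illustrative helper and not a QCBOR function; `cbor_put_string` is a hypothetical name, and for brevity it only supports lengths of up to 255 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a QCBOR function): write a definite-length
 * string item. major is 2 for a byte string or 3 for a text string.
 * Returns the bytes written, or 0 if len is above 255. */
static size_t cbor_put_string(uint8_t *out, uint8_t major,
                              const void *data, size_t len)
{
    size_t n = 0;

    if (len <= 23) {
        out[n++] = (uint8_t)((major << 5) | len);  /* length in the header */
    } else if (len <= 0xFF) {
        out[n++] = (uint8_t)((major << 5) | 24);   /* one length byte follows */
        out[n++] = (uint8_t)len;
    } else {
        return 0;                                  /* longer forms omitted */
    }
    memcpy(out + n, data, len);                    /* the string data itself */
    return n + len;
}
```

A four-character text string such as "name" therefore starts with the header 0x64, while a three-byte binary value starts with 0x43.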
We can also store arrays of integers using the same approach as byte arrays. However, the array size is defined as the number of objects, not the number of bytes.
The CBOR data format also supports the storage of maps. Here, maps are tables of key-value pairs, which equate to objects in JSON. In each pair, the first item is the key and the second is the value. Like a data item array, the map's size is the number of key-value pairs, not the overall number of bytes (Table 6.14).
JSON encoding | CBOR encoding | CBOR description |
---|---|---|
{ | A3 | Map with three pairs |
"name": | 64 6E616D65 | Text with 4 characters "name" |
"map1", | 64 6D617031 | Text with 4 characters "map1" |
"type": | 64 74797065 | Text with 4 characters "type" |
"number", | 66 6E756D626572 | Text with 6 characters "number" |
"value": | 65 76616C7565 | Text with 5 characters "value" |
1 | 01 | Unsigned int 1 |
} | | |
Encoding = 41 bytes (no whitespace) | Encoding = 30 bytes | |
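The CBOR column of the table can be reproduced with a short hand-written encoder. This is a sketch for illustration rather than QCBOR code; `put_short_text` and `encode_example_map` are hypothetical names, and the text helper assumes strings of at most 23 characters.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a text string of at most 23 characters,
 * so the length fits in the header byte (0x60 is major type 3). */
static size_t put_short_text(uint8_t *out, const char *s)
{
    size_t len = strlen(s);
    out[0] = (uint8_t)(0x60 | len);
    memcpy(out + 1, s, len);
    return 1 + len;
}

/* Encode {"name":"map1","type":"number","value":1} as a CBOR map.
 * The buffer must hold at least 30 bytes. Returns the bytes written. */
static size_t encode_example_map(uint8_t *out)
{
    size_t n = 0;

    out[n++] = 0xA3;                        /* major type 5, three pairs */
    n += put_short_text(out + n, "name");   /* key   */
    n += put_short_text(out + n, "map1");   /* value */
    n += put_short_text(out + n, "type");   /* key   */
    n += put_short_text(out + n, "number"); /* value */
    n += put_short_text(out + n, "value");  /* key   */
    out[n++] = 0x01;                        /* value: unsigned integer 1 */
    return n;
}
```

The function returns 30, matching the CBOR size given in the table.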
CBOR can also handle arrays, strings and maps of indefinite length. In this case, an array is opened without defining the number of data items, and data items are added as normal. When all of the current data has been written, the array is closed by writing the "break" stop code: a byte with major type 7 and an additional information value of 31 (Table 6.15).
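As a rough sketch, an array of this kind can be written by hand as follows. The opening header uses major type 4 with an additional information value of 31 (0x9F) and the closing break byte is 0xFF. The function name is hypothetical and the code is not taken from QCBOR; it only encodes 8-bit unsigned values to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a QCBOR function): encode 8-bit unsigned values
 * as an array whose length is not declared in the header. The buffer needs
 * up to 2 * count + 2 bytes. Returns the bytes written. */
static size_t cbor_put_open_ended_array(uint8_t *out, const uint8_t *values,
                                        size_t count)
{
    size_t n = 0;

    out[n++] = 0x9F;            /* major type 4, additional information 31 */
    for (size_t i = 0; i < count; i++) {
        if (values[i] > 23) {
            out[n++] = 0x18;    /* unsigned integer, one byte follows */
        }
        out[n++] = values[i];   /* direct value or the data byte */
    }
    out[n++] = 0xFF;            /* major type 7, additional information 31 */
    return n;
}
```

Because the element count never appears in the stream, a sender can start transmitting before it knows how many items it will produce.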
In addition to providing the header for standard data types, CBOR provides support for semantic tags. A tag is header type 6 and contains a single value. This is used as a hint to the decoder to describe a custom format for the following encoded bytes.
This allows us to use tags to extend CBOR encoding types in a number of useful ways. To start with, we can use tags to create additional data items. The tag is placed immediately before the custom data to provide a self-described blob of binary data. The decoder may understand the tag, or it can be passed to a higher layer for decoding. The CBOR standard defines a number of tags and unassigned ranges for extension; a small selection is shown in Table 6.16.
Tag value | Tag semantics | Data encoded as |
---|---|---|
0 | Standard date/time string, e.g. 2021-11-12T23:20:50.52Z | UTF-8 string |
1 | Epoch-based date/time, number of seconds from 1970-01-01T00:00Z | Integer or float |
2 | Positive bignum | Byte string |
3 | Negative bignum | Byte string |
4 | Decimal fraction | Array |
5 | Bigfloat | Array |
To encode a positive bignum value, the encoder first writes the header byte for tag 2 (0xC2), followed by a byte string that holds the value of the bignum (Table 6.17).
Bignum value | CBOR encoding | Description |
---|---|---|
0x111111111111111111111111111111 | 0xC2 | Semantic tag 2, positive bignum |
 | 0x4F | Byte string header, length of 15 bytes |
 | 0x111111111111111111111111111111 | Byte string value for the bignum |
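The same layout can be produced by hand, as in the sketch below. It is an illustration under stated assumptions and not QCBOR code: `cbor_put_pos_bignum` is a hypothetical name, the magnitude is supplied most significant byte first, and only magnitudes of up to 23 bytes are handled so that the length fits in the byte string header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a QCBOR function): encode a positive bignum as
 * tag 2 followed by a byte string. Returns the bytes written, or 0 if the
 * magnitude is longer than 23 bytes. */
static size_t cbor_put_pos_bignum(uint8_t *out, const uint8_t *magnitude,
                                  size_t len)
{
    if (len > 23) {
        return 0;                       /* longer forms omitted for brevity */
    }
    out[0] = 0xC2;                      /* major type 6, tag value 2 */
    out[1] = (uint8_t)(0x40 | len);     /* major type 2, length in header */
    memcpy(out + 2, magnitude, len);    /* the bignum bytes */
    return 2 + len;
}
```

For a 15-byte magnitude the output is 17 bytes long and begins 0xC2 0x4F.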
The decoder will then return the tag and a byte array, which can then be used to initialize an MPI value.
In addition to the tags defined in the CBOR specification (RFC 7049), the range of tags has been extended and maintained in an IANA registry. The draft list of tags is available from:
We can use these tags to improve an encoded CBOR stream's efficiency. For example, suppose we need to encode an array of 32-bit floating-point values. In a standard encoding, each float is stored as a header byte followed by 4 bytes. By using a tag, we can define a much terser encoding: the tag declares a float array and is followed by the raw bytes of the IEEE 754 floating-point numbers. In the IANA registry, a 32-bit float array is defined by tag 81 followed by a byte string. This allows us to store our array as a single tag followed by a binary blob in which each float value occupies four bytes.
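The saving can be checked with a small comparison. The sketch below is illustrative only: the function names are hypothetical, it assumes that `float` is an IEEE 754 32-bit value on the target, that tag 81 expects the floats in big-endian byte order, and that there are at most 23 floats. Because 81 is greater than 23, the tag itself needs two bytes (0xD8 0x51).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a float into out as four big-endian bytes (assumes IEEE 754 binary32). */
static void put_float_be(uint8_t *out, float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    out[0] = (uint8_t)(bits >> 24);
    out[1] = (uint8_t)(bits >> 16);
    out[2] = (uint8_t)(bits >> 8);
    out[3] = (uint8_t)bits;
}

/* Standard encoding: array header, then a 0xFA header plus 4 bytes per float. */
static size_t encode_floats_plain(uint8_t *out, const float *v, size_t count)
{
    size_t n = 0;
    if (count > 23) {
        return 0;
    }
    out[n++] = (uint8_t)(0x80 | count);     /* major type 4 + element count */
    for (size_t i = 0; i < count; i++) {
        out[n++] = 0xFA;                    /* major type 7, info 26 */
        put_float_be(out + n, v[i]);
        n += 4;
    }
    return n;
}

/* Tagged encoding: tag 81, then one byte string holding all of the floats. */
static size_t encode_floats_tagged(uint8_t *out, const float *v, size_t count)
{
    size_t n = 0;
    size_t len = 4 * count;
    if (count > 23) {
        return 0;
    }
    out[n++] = 0xD8;                        /* major type 6, tag in next byte */
    out[n++] = 81;
    if (len <= 23) {
        out[n++] = (uint8_t)(0x40 | len);   /* byte string, short length */
    } else {
        out[n++] = 0x58;                    /* byte string, one length byte */
        out[n++] = (uint8_t)len;
    }
    for (size_t i = 0; i < count; i++) {
        put_float_be(out + n, v[i]);
        n += 4;
    }
    return n;
}
```

For eight floats the standard form takes 41 bytes and the tagged form 36. The fixed overhead of the tag means the two forms are the same size for two floats, so the benefit only appears as the array grows.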
In this chapter, we have looked at using JSON and CBOR to encode application data that will ride within an MQTT packet. However, the uses for both of these data formats are much more widespread. As we will see in the next chapter, JSON is often used to hold configuration information and is used to define the options within an IoT cloud platform. Additional standards such as JSON Object Signing and Encryption (JOSE) and CBOR Object Signing and Encryption (COSE) are used to create web tokens that form the basis of the Entity Attestation Token (EAT), which is used to prove the identity and capabilities of an IoT device.