© Alasdair Gilchrist 2016

Alasdair Gilchrist, Industry 4.0, 10.1007/978-1-4842-2047-4_8

8. Middleware Software Patterns

Alasdair Gilchrist

(1)Bangken, Nonthaburi, Thailand

The Industrial Internet requires real-time detection and reaction if it is going to work in time-critical environments, such as controlling machinery on a production line or monitoring the status of medical equipment connected to patients in a hospital. Therefore, the applications in the operations and management domain must receive notification of any change in status—crossing a predetermined threshold—immediately. This is also desirable in the world of the consumer IoT; however, with so many sensors to monitor, how can we do this?

There are two ways that applications can detect changes in an edge device’s status. The traditional method is to poll each device and request its status, or simply read a value from a pin on an integrated circuit. Either way, it amounts to the same thing; we are running a predetermined batch job to poll each device.

The problem with this method is that it is complicated to program, and physically wiring all of these point-to-point device-to-server relationships soon becomes very messy. Furthermore, it is up to the programmer or the system designer to set the polling interval. As an example, if we wish to manage an HVAC system in a building, we need to monitor the temperature in each location, but how often do we need to poll the sensors to read the current temperature? Similarly, if we are monitoring a pressure pad in the building’s security system, how often do we need to poll for a change of status?

Presumably, with temperature we would only need to check every minute or so, as there is probably a reasonable amount of acceptable drift before a threshold is breached. With the security pad, I guess we would want to know pretty much immediately that an event has occurred that has forced a change in status. In both these cases, the designer will have to decide on the correct polling time and configure just about every sensor—and there could be thousands of them, monitoring all sorts of things such as smoke, movement, light levels, etc.—to be polled at a predetermined time. This is not very efficient; in fact, it is fraught with difficulties in maintaining the system, let alone expanding it by adding new sensors and circuits.
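To make the polling approach concrete, here is a minimal sketch, assuming hypothetical read_temperature and read_pressure_pad helpers that wrap whatever driver or register access the hardware provides; the intervals and threshold are the designer's arbitrary choices.

import time

POLL_INTERVAL_TEMP = 60.0    # seconds between temperature polls
POLL_INTERVAL_PAD = 0.1      # pressure pads need much tighter polling
TEMP_THRESHOLD = 27.0        # example threshold in degrees C

def read_temperature(sensor_id):
    """Placeholder: read one temperature sensor (driver/register access)."""
    return 21.0

def read_pressure_pad(pad_id):
    """Placeholder: read one pressure pad (True = triggered)."""
    return False

def poll_loop(sensor_ids, pad_ids):
    """Batch job: the designer picks a polling interval per sensor type."""
    last_temp_poll = 0.0
    while True:
        now = time.time()
        if now - last_temp_poll >= POLL_INTERVAL_TEMP:
            for sid in sensor_ids:
                if read_temperature(sid) > TEMP_THRESHOLD:
                    print("temperature threshold breached on", sid)
            last_temp_poll = now
        for pid in pad_ids:              # pads are polled on every cycle
            if read_pressure_pad(pid):
                print("pressure pad event on", pid)
        time.sleep(POLL_INTERVAL_PAD)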

The second option is to have a change of status at the sensor/device level trigger an event. In this event-driven model, the application need not concern itself with monitoring the sensors, as it will be told when an event has occurred. Furthermore, it will be told immediately and this provides for real-time detection and reaction.

In the scenarios of the temperature sensor and the pressure pad, the designer in an event-driven model no longer has to decide how often to poll each sensor. The sensor’s management device will tell the operations and management application when an event—a notable change in condition—has occurred.
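By contrast, a minimal event-driven sketch: the sensor’s management layer invokes a callback only when a notable change occurs, and the application registers its interest once. The names register_handler, sensor_changed, and on_event are illustrative, not from any particular library.

# Illustrative event-driven sketch: handlers run only when a change occurs.
_handlers = []

def register_handler(handler):
    """The application registers interest once; no polling loop is needed."""
    _handlers.append(handler)

def sensor_changed(sensor_id, value):
    """Called by the sensor's management device when a threshold is crossed."""
    for handler in _handlers:
        handler(sensor_id, value)

def on_event(sensor_id, value):
    print("event from", sensor_id, "value:", value)

register_handler(on_event)
sensor_changed("pressure-pad-7", True)   # simulate a pad being triggered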

In the world of the Industrial Internet, event-driven messaging is required as it can facilitate real-time reaction in time-critical use-cases. In addition, event-driven messaging simplifies the communication software and hardware by replacing all those dedicated point-to-point links between applications and sensors with a software message bus by using multicast to replicate and send messages only to registered subscribers.

A message bus is a software concept, although it can also apply to the physical infrastructure if there is a ring or bus configuration. More likely, the physical network will be a hub-and-spoke hierarchy or a set of remote point-to-point links, but that does not affect the publish/subscribe model: a broker server is deployed to manage subscriber registration and to handle the distribution of messages between publishers and subscribers, regardless of their location.

Therefore, publish/subscribe works regardless of the physical network structure, and as a result publish/subscribe greatly reduces the amount of spaghetti code and network connections, thus making the whole system more manageable. Furthermore, using a message bus model allows for easy upgrading of the system, as adding new sensors/devices becomes a trivial process.

The reason upgrading becomes trivial is the nature of the event-driven messaging system. For it to be effective, all sensors/devices—the publishers—connect via a broker, a server designated to manage the publish/subscribe process and distribute traffic across the message bus to the registered subscribers. Similarly, all applications—the subscribers—connect to the message bus.

Publish/Subscribe Pattern

By using a publish/subscribe protocol, applications can individually subscribe to published services that they are interested in, such as the status of the temperature sensor and pressure pad in the earlier smart building scenario. The publish/subscribe protocol can monitor individual sensors or groups of sensors and publish any change in their status to registered subscribers. In this way, applications learn about a change in a service immediately.
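As a toy illustration of the pattern (not any particular product), a minimal in-process broker might look like the following: publishers push updates for a named topic, and the broker replicates each message only to the applications that registered an interest in that topic.

from collections import defaultdict

class Broker:
    """Toy publish/subscribe broker: topic -> list of subscriber callbacks."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Replicate the message only to the registered subscribers.
        for callback in self._subscribers[topic]:
            callback(topic, message)

broker = Broker()
broker.subscribe("building/floor2/temperature",
                 lambda t, m: print("HVAC app:", t, m))
broker.subscribe("building/entrance/pressure-pad",
                 lambda t, m: print("Security app:", t, m))

broker.publish("building/floor2/temperature", {"celsius": 26.5})
broker.publish("building/entrance/pressure-pad", {"triggered": True})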

In addition, there is the added benefit that applications using a publish/subscribe broker need only subscribe to the services they wish to know about. They can ignore irrelevant services, which reduces unnecessary traffic and I/O demands on servers.

Indeed, the designer will not even have to worry about programmatically informing applications of new sensors/services, as the publish/subscribe broker service will announce new services and remove retired ones.

Additionally, and very importantly for many IIoT applications (such as monitoring remote vending machines), the broker does not require constant connectivity, as it can cache updates from a publisher and deliver them when subscribers come online. A vending machine—a publisher—in a remote area does not need 24/7 connectivity to the broker; it can perhaps connect once a day to upload reports on its local inventory, cash box holdings, and other maintenance data. On the other hand, if the vending machine’s cash box reports that it is full and can no longer accept cash transactions, the publish/subscribe protocol can alert the broker immediately. Similarly, the subscriber need not be online, as the broker will store and forward messages from publishers to subscribers when possible.

Publish and subscribe has several distribution models, but what is common to all of them is the decoupling of the sensor/service from the application, a very powerful software concept called late binding. Late binding allows new services to be added without having to reconfigure the network or reprogram devices or applications, which makes the system very agile. It is also possible, using publish/subscribe protocols, to reuse services in order to create new services or add value to existing ones. For example, a new service could be produced by averaging the output of a group of three other services, and this could then be published to any interested subscriber, as sketched below.
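Continuing the toy broker sketch above, a derived service could subscribe to three hypothetical temperature topics, average their output, and republish the result as a new topic for any interested subscriber, without touching the original publishers.

class AveragingService:
    """Subscribes to several source topics and republishes their average."""
    def __init__(self, broker, source_topics, out_topic):
        self.latest = {}
        self.broker = broker
        self.out_topic = out_topic
        self.sources = source_topics
        for topic in source_topics:
            broker.subscribe(topic, self.on_update)

    def on_update(self, topic, message):
        self.latest[topic] = message["celsius"]
        if len(self.latest) == len(self.sources):      # all sources seen
            avg = sum(self.latest.values()) / len(self.latest)
            self.broker.publish(self.out_topic, {"celsius": avg})

svc = AveragingService(broker,
                       ["zone/a/temperature", "zone/b/temperature",
                        "zone/c/temperature"],
                       "zone/average-temperature")
broker.subscribe("zone/average-temperature",
                 lambda t, m: print("average:", m))
broker.publish("zone/a/temperature", {"celsius": 21.0})
broker.publish("zone/b/temperature", {"celsius": 23.0})
broker.publish("zone/c/temperature", {"celsius": 25.0})   # prints average: 23.0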

A publish/subscribe protocol may seem a perfect solution for many IIoT use-cases, but it does have its own constraints, which we need to be aware of. For example, in the vending machine scenario, it was not critical that we knew immediately, in real time, that a particular machine had run out of a brand of fizzy drink. It may well be annoying, but the world is not going to end. However, what if the publish/subscribe model is tracking stock market values?

Now we have a problem. A vending machine that is out of stock, or has a full cash box, can no longer trade and is going to lose money if it does not alert the operations and management domain in a timely manner. However, if we consider fast-changing criteria such as stock valuations, we need to know in real time—a second’s delay could result in disaster.

And here is the problem with the centralized publish/subscribe model: placing a broker server inline between publishers and subscribers adds latency. After all, the broker service has to handle messages received from publishers and then determine which subscribers require each message. The broker must look up and replicate the message for each subscriber, which takes time—albeit microseconds—and in the case of the stock market that is not acceptable.

Additionally, the broker performs these functions serially, so some subscribers may receive notice of a shift in stock price before others. Therefore, we need to find a way to distribute subscriber messages in real time, and there are other ways of deploying the publish/subscribe model that do not require a broker service. One simple method is to broadcast the published message to all hosts on the network using UDP, but that is wasteful—most hosts will simply drop the packets—and it is unreliable. The problem is that it is better for a message to arrive milliseconds late than never to arrive at all, and UDP is fire and forget: a lost message is never retransmitted. It is, however, close to real time, which makes it very tempting for real-time applications. So, once again, we need to consider how we match protocols with applications, as applications can require different levels of service.
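For illustration only, a brokerless UDP broadcast of a published update takes just a few lines with the standard socket API; every host on the segment receives the datagram whether it cares about it or not, and a lost datagram is never retransmitted. The price string and port number are made-up examples.

import socket

# Publisher side: one datagram, fired at every host on the local segment.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
sock.sendto(b"ACME 101.25", ("255.255.255.255", 5007))   # fire and forget

# Subscriber side (on any interested host): bind the port and read what arrives.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", 5007))
data, sender = listener.recvfrom(1024)    # blocks; a lost datagram never shows up
print(sender, data)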

Another benefit of deploying a publish/subscribe protocol is that it not only connects all applications (subscribers) to all sensors (publishers) via the “bus,” but in some cases it can, importantly, connect every device on the bus to every other device. What this means in the context of the IIoT is that devices can interconnect and communicate with one another over the publish/subscribe protocol, and can therefore potentially cooperate and share resources. For example, if every device had to carry its own long-distance communication technology—say a 3G modem and SIM—devices would be bigger, heavier, and more expensive. If, however, smaller, cheaper, and dumber devices could collaborate with their heavy-duty neighbors, they could share resources, allowing the smaller device to communicate through its neighbor’s communication channels.

Publish and subscribe and event-driven messaging services greatly enhance the efficiency of IIoT systems, and they can be deployed using a number of publish/subscribe models, depending on the application. As with all IIoT protocols, there appear to be many to choose from, so we will discuss each of the publish/subscribe protocols in common use today, including their individual pros and cons and how they compare to one another in the context of the IIoT. The most commonly deployed are:

  • MQTT

  • XMPP

  • AMQP

  • DDS

All of these protocols claim to be genuine publish/subscribe protocols that operate in real time and can handle tens of thousands of devices. However, they are very different, and some work at different levels. For example, if we categorize these protocols as working device-to-device, device-to-server, or server-to-server, we have a better basis for setting expectations of each protocol’s performance, measured in time.

Figure 8-1 illustrates the performance expectations of each category.

Figure 8-1. Real-time performance expectations

MQTT

MQTT, the Message Queue Telemetry Transport, is a publish/subscribe protocol focused on device data collection. MQTT’s main purpose is telemetry, or remote monitoring; it is therefore designed to interconnect and retrieve data from thousands of edge devices and transport the aggregated traffic back to the operations and management domain. In the general classification, we would consider MQTT to be a device-to-server protocol.

  • “Message Queue Telemetry Transport, is an open message protocol designed for M2M communications that facilitated the transfer of telemetry-style data in the form of messages from pervasive devices, along high latency or constrained networks, to a server or small message broker.”

MQTT is therefore typically deployed in a hub-and-spoke topology, as it is designed to collect data from edge transducers and send the collected data back to a collection server in the operations and management domain. Because of this design, MQTT does not facilitate device-to-device connections; it works in a point-to-point relationship between devices and the collection server. As this is a clear design objective of MQTT, it has few configuration options and does not really require any. MQTT’s job specification is simply to collect data from devices and transport it reliably back to the collection server.

However, that reliable transport requires a reliable transport protocol, so MQTT works over TCP/IP. The implication is that a full TCP/IP connection is required between the device and the collection server. This is problematic for a number of reasons: first, it requires the device to support a full TCP/IP stack; second, it requires a reasonable network connection, as the TCP/IP connection must be maintained at all times; third, while TCP/IP may provide the reliability that MQTT requires, it adds unnecessary packet overhead and hurts performance in situations such as a LAN or other non-WAN infrastructure where the network is already reliable.

As a result of these design specifications, MQTT is suited to external and remote device monitoring, such as monitoring the condition of an oil or gas pipeline, or similar applications where thousands of non-constrained, TCP/IP-capable sensors/devices must send data to a common server/application. In applications such as remote telemetry monitoring in harsh environments, TCP/IP’s reliability is a boon and performance is secondary. Consequently, MQTT is not designed for high performance; expected device-to-server times are counted in seconds.
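As a concrete sketch, assuming the Eclipse paho-mqtt Python client (1.x API) and a broker reachable at broker.example.com—both assumptions, not part of the text—a telemetry device publishes a reading to a topic and a collection-side application subscribes to the whole device tree.

import paho.mqtt.client as mqtt   # assumes the Eclipse paho-mqtt package

BROKER = "broker.example.com"     # hypothetical collection server/broker

# Device side: publish one telemetry reading over the TCP/IP connection.
device = mqtt.Client()
device.connect(BROKER, 1883, keepalive=60)
device.loop_start()                                   # background network thread
info = device.publish("pipeline/pump-17/pressure", payload="4.2", qos=1)
info.wait_for_publish()                               # block until the broker has it
device.loop_stop()
device.disconnect()

# Collection side: subscribe to every device under the pipeline/ tree.
def on_message(client, userdata, msg):
    print(msg.topic, msg.payload.decode())

collector = mqtt.Client()
collector.on_message = on_message
collector.connect(BROKER, 1883, keepalive=60)
collector.subscribe("pipeline/#", qos=1)   # '#' is the MQTT multi-level wildcard
collector.loop_forever()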

XMPP (Jabber)

XMPP stands for the Extensible Messaging and Presence Protocol, an open technology for real-time communication developed to support a wide range of applications. XMPP grew up around technologies such as instant messaging, presence, and video conferencing—a space dominated by the likes of Skype and WhatsApp—with further possibilities in collaboration, lightweight middleware, and content syndication.

However, to understand XMPP, we need to understand why it was originally designed: for use in instant messaging, to detect presence, and to allow people to connect to other people and exchange text messages. Consequently, XMPP was designed to use protocols that make human-style communication easier. For example, it uses XML as its native format and an addressing scheme that is intuitive to humans. The addressing format is user@domain, which facilitates people-to-people communications, as it is an easily recognizable format common to e-mail and IM programs.

In the context of the IIoT, XMPP may have some useful features such as its user-friendly addressing of devices. It will be easy for a human controller to identify and address devices using a smartphone and a simple URL.
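To illustrate that addressing style, the following sketch builds an XMPP-style message stanza with the Python standard library; the JIDs shown are made-up examples, and a real deployment would send the stanza over an XMPP library rather than print it.

import xml.etree.ElementTree as ET

# Build an XMPP-style <message> stanza addressed with human-readable JIDs.
stanza = ET.Element("message", {
    "from": "operator@plant.example.com",
    "to": "press-line-3@plant.example.com",   # a device addressed like a person
    "type": "chat",
})
ET.SubElement(stanza, "body").text = "status?"
print(ET.tostring(stanza, encoding="unicode"))
# prints something like:
# <message from="operator@plant.example.com"
#          to="press-line-3@plant.example.com" type="chat"><body>status?</body></message>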

However, we have to recognize that XMPP was initially designed for human usage; therefore, it was not designed to be fast. Indeed, most deployments of XMPP use polling, or check for presence only on demand. XMPP’s performance is therefore based on a human perception of real time, which is measured in seconds rather than microseconds.

In addition, XMPP communicates over HTTP riding on TCP/IP, which, combined with the XML payload, makes this protocol suitable for smartphones and intelligent devices where the protocol overhead is not a problem. Small low-power devices will not have the processing ability to handle a cocktail of TCP/IP, HTTP, and XML. Consequently, XMPP is best suited to industrial processes that have human-managed interfaces and that favor security, addressability, and scalability over real-time performance.

AMQP

The Advanced Message Queuing Protocol (AMQP) is not strictly a publish/subscribe protocol but rather, as its name suggests, a queuing protocol. AMQP does not come from the IoT; it has its roots in the financial and banking industry. As a result, AMQP can be considered a battle-hardened queuing protocol that delivers high levels of reliability even when managing queues of thousands of messages, such as banking transactions.

Due to its roots in the banking industry, AMQP is designed to be highly reliable and capable of tracking every message or transaction that it receives. As a result, AMQP operates over TCP/IP but also requires strict acknowledgment of message receipts from the recipient.

However, despite not being a true publish/subscribe protocol, AMQP is highly complementary to other publish/subscribe protocols, as it delivers highly reliable message queuing and tracking, which is a requirement in some IIoT use-cases. AMQP is also deployed at the server level in the operations and management domain to help with analytics and data management.
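A sketch using the pika Python client against a RabbitMQ-style AMQP broker on localhost (the library choice, queue name, and broker location are all assumptions): the publisher marks the message persistent and the consumer explicitly acknowledges each delivery, reflecting AMQP's transaction-tracking heritage.

import pika   # assumes the pika AMQP client and a local RabbitMQ-style broker

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="transactions", durable=True)

# Producer: mark the message persistent so the broker writes it to disk.
channel.basic_publish(
    exchange="",
    routing_key="transactions",
    body=b"debit account 4711 by 12.50",
    properties=pika.BasicProperties(delivery_mode=2),
)

# Consumer: every message must be explicitly acknowledged once processed.
def on_message(ch, method, properties, body):
    print("processed:", body.decode())
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="transactions", on_message_callback=on_message)
channel.start_consuming()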

DDS

The Data Distribution Service (DDS) targets, in contrast to the other publish/subscribe models, devices that directly use device data. Rather than being a device-to-server protocol, in which servers harvest data from devices in a point-to-point or star topology as with MQTT, DDS distributes data to other devices on a bus, so it is considered a device-to-device protocol.

While interfacing with other devices is the design purpose of DDS—which means fast communication and collaboration between devices on the same segment—the protocol also supports device-to-server interaction.

DDS’s main purpose is to connect devices to other devices. It is a data-centric middleware standard with roots in high-performance technologies, space exploration, defense, and industrial embedded applications. DDS can be remotely accessible and efficiently publish millions of messages per second to many simultaneous subscribers. Furthermore, DDS can store and forward messages, if a subscriber happens to be offline.

The concept behind how DDS handles the publish/subscribe model is that devices require data in real time, because devices are fast; in this context, “real time” is often measured in microseconds. In IIoT scenarios, devices will often need to communicate with many other devices in complex ways to fulfill time-critical applications, so TCP/IP’s slow, reliable, point-to-point, connection-oriented data streams are far too slow and restrictive. Instead, DDS offers detailed quality-of-service (QoS) control, multicast, configurable reliability, and pervasive redundancy. DDS also provides ways to filter messages and publish to specific subscribers—there can be thousands of simultaneous destinations—without the time delay of the broker model. Lightweight versions of DDS are also available that run in constrained environments, such as on low-power devices.

DDS is capable of meeting the demands of high-performance IIoT systems because it implements direct device-to-device “bus” communication with a relational data model. This shared, relational view of the data is called a “data bus,” and it is similar to the way a database controls access to stored data, in that it efficiently controls access to, and ensures the integrity of, the shared data even though the data is accessible to many simultaneous users. This exact level of data control in a multi-device network is what many high-performance devices need in order to collaborate as a single system.
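The following is a conceptual sketch of that data-centric idea, not the DDS API itself: the bus holds the latest sample per topic and per key, much as a table holds the latest row per primary key, and readers see shared state rather than private point-to-point messages. The topic and field names are invented for illustration.

class DataBus:
    """Conceptual data-centric bus: latest sample per (topic, key),
    much like rows in a table keyed by a primary key."""
    def __init__(self):
        self._instances = {}    # (topic, key) -> latest sample
        self._readers = []      # (topic, callback)

    def write(self, topic, key, sample):
        self._instances[(topic, key)] = sample
        for t, callback in self._readers:
            if t == topic:
                callback(key, sample)

    def read(self, topic, key):
        return self._instances.get((topic, key))

    def add_reader(self, topic, callback):
        self._readers.append((topic, callback))

bus = DataBus()
bus.add_reader("TurbineState", lambda k, s: print("update", k, s))
bus.write("TurbineState", key="turbine-12", sample={"rpm": 1480, "temp": 61.0})
print(bus.read("TurbineState", "turbine-12"))   # a late joiner still sees the latest state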

DDS is designed for high-performance systems so it’s a perfect match for IIoT applications such as manufacturing systems, wind farms, hospital integration, medical imaging, asset-tracking systems, and automotive test and safety.

Other publish/subscribe protocols use TCP/IP for reliability, but to provide reliable communication TCP/IP needs an end-to-end session to be established before any data can be transmitted. UDP, on the other hand, is connectionless: it fires and forgets. UDP is quick but inherently unreliable.

However, what if you need reliable data transfer but a connection is just not feasible all of the time? UDP might seem a good prospect, as it is fast and can broadcast to many receivers without much overhead, but you do not want to lose valuable data when communications are not possible—remember that UDP sends once and that is it. Similarly, under conditions where establishing a connection is not possible, TCP/IP will be unable to transmit data; only after it establishes a session will it transmit the current data. Therefore, the ideal protocol for these IIoT scenarios is one that has store and forward capabilities.

Delay Tolerant Networks (DTN)

This was the problem that NASA encountered when launching space probes. TCP/IP was not acceptable, as there would not always be a means to form a connection with the probe, and UDP was also useless, as valuable data would simply be lost on a fire-and-forget basis. The problem NASA faced was how to ensure that, even though there was no radio connection with the probe—for instance, when it disappeared behind a planet—the data generated by the probe, which could be stored locally, would be delivered to ground control in its entirety once the probe came back into radio contact.

The result of their research was the adoption of DTN, a protocol that has, among its other attributes, store and forward capabilities. DTN is the preferred solution for very remote and mobile IIoT devices and applications, and it forms the basis of the middleware protocols that we use today in real-time Industrial Internet situations.

DTN turns out to be perfect for many IoT applications, as it works on a store, carry, and forward basis (see Figure 8-2). In IoT, the sending node is the sensor: it collects and holds the data (store), keeps it until it comes into contact with another node (carry), and then passes on the information (forward). The recipient node then stores the data and carries it on its travels until it can forward the data to another node. In this way the data gets passed up through the network until it reaches its destination. The point is that large, variable delays are tolerated, as the primary concern is delivering the data to its destination over an intermittent and unreliable network.

Figure 8-2. DTN works on a store, carry, and forward basis

DTN has many applications in the IoT world, for example the interplanetary Internet, wildlife monitoring, battlefield communications, and Internet communication in rural areas. When rural villages have no communication infrastructure, communication can still be possible, albeit slowly. The village builds a booth that contains a PC and communication equipment running the DTN protocol, and the villagers' messages are stored locally on the DTN device. The local bus that services the region and interconnects the rural villages carries Wi-Fi routers that connect/peer with the DTN equipment when they come into range.

As a result, the messages are forwarded from the DTN village terminal to the data equipment on the bus. The bus then carries the data with it on its journey from village to village, collecting and storing more data as it goes. Eventually, when the bus returns to the town or city depot, it forwards the data it has collected on its journey to an Internet-connected router, and the data is delivered to its eventual destination over the Internet. The return Internet traffic is handled in a similar manner, with the bus delivering the return traffic (messages) back to each village on its next journey. This is simply a digital version of how mail was once delivered.

DTN can store, carry, and forward data across a long distance without any direct connection between source and destination, although it could be said that in the previous example the bus provided a regular, non-random path between the source and the destination. When handling communication in random networks—for example, when performing wildlife tracking—there is no regular bus passing by, so we have to make use of other relay nodes to store, carry, and forward the data. When we use these mobile ad-hoc networks, there are several tactics that can be deployed to increase efficiency. One method is called the epidemic technique, where the holder of the data passes it on to any other relay node that it comes across. This method can be effective but is wasteful of resources and high on overhead, as there could be multiple copies of the data being carried and forwarded throughout the network.

One solution to the epidemic technique’s inefficiency is called Prophet, which stands for Probabilistic Routing Protocol using History of Encounters and Transitivity. Prophet mitigates some of epidemic’s inefficiency by using an algorithm that tries to exploit the non-randomness of travelling-node encounters. It does this by maintaining a set of delivery probabilities and handing a message over only when the encountered node has a higher likelihood of being able to deliver it than the current carrier, as sketched below.
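A sketch of the delivery-predictability bookkeeping Prophet describes; the constants P_INIT, GAMMA, and BETA are typical values from the PRoPHET specification, used here purely for illustration.

P_INIT, GAMMA, BETA = 0.75, 0.98, 0.25   # typical values from the PRoPHET draft

class ProphetNode:
    def __init__(self, name):
        self.name = name
        self.p = {}                       # delivery predictability per destination

    def age(self, elapsed_units):
        """Predictabilities decay while no encounters happen."""
        for dest in self.p:
            self.p[dest] *= GAMMA ** elapsed_units

    def encounter(self, other):
        """Update on meeting another node: direct plus transitive predictability."""
        old_direct = self.p.get(other.name, 0.0)
        self.p[other.name] = old_direct + (1.0 - old_direct) * P_INIT
        for dest, p_bc in other.p.items():
            if dest == self.name:
                continue
            old = self.p.get(dest, 0.0)
            self.p[dest] = old + (1.0 - old) * self.p[other.name] * p_bc * BETA

    def should_forward(self, other, destination):
        """Hand the message over only if the other node is a better carrier."""
        return other.p.get(destination, 0.0) > self.p.get(destination, 0.0)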

Spray and Wait is another delivery method, one that sets a strict limit on the number of replications allowed on the ad-hoc network. By doing so, it increases the probability of successfully delivering the message while restricting the waste of resources. When the message is originally created on the source node, it is assigned a fixed number of replications that may co-exist in the network, and the source node is permitted to forward the message only to that exact number of relay nodes. Once the relay nodes accept the message, they go into a wait phase in which they carry the message—they will not forward it to other relays—until they directly encounter the intended destination. Only then will they come out of the wait state and forward the message to the intended recipient.
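A sketch of that spray-and-wait logic, in the “source spray” form described here: only the source hands out its fixed budget of copies, keeping the last one itself, and each relay then waits, delivering only on direct contact with the destination. Node and message names are invented.

class SprayAndWaitNode:
    def __init__(self, name):
        self.name = name
        self.messages = {}   # msg_id -> (payload, destination, copies_left)

    def create(self, msg_id, payload, destination, copies=6):
        """The source fixes the replication budget when the message is created."""
        self.messages[msg_id] = (payload, destination, copies)

    def on_contact(self, other):
        for msg_id, (payload, dest, copies) in list(self.messages.items()):
            if other.name == dest:
                print(self.name, "delivers", msg_id, "to", dest)  # direct delivery
                del self.messages[msg_id]
            elif copies > 1:
                # Spray phase: hand one copy to the encountered relay node.
                other.messages[msg_id] = (payload, dest, 1)
                self.messages[msg_id] = (payload, dest, copies - 1)
            # copies == 1: wait phase, carry until the destination is met

a, b, d = SprayAndWaitNode("a"), SprayAndWaitNode("b"), SprayAndWaitNode("d")
a.create("m1", b"telemetry", destination="d", copies=2)
a.on_contact(b)   # spray: b now carries one copy and waits
b.on_contact(d)   # wait phase ends: b meets the destination and delivers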

As we have just seen with DTN, IoT can work even in the most remote areas and inhospitable environments. Because it does not need constant connectivity, it can handle intermittent ad-hoc network connections and long delays in delivery. However, that is in remote areas transmitting small amounts of data within specialist applications—wildlife tracking and battlefield conditions, for example. In most IoT applications, we have another issue, and that is handling lots of data—extremely large amounts of data.

Therefore, for most Industrial Internet use-cases, the emphasis will be on reliable scalable systems using deterministic, reliable protocols and architectures such as the ones currently deployed in M2M environments in manufacturing settings.
