9. Async APIs for Eventing and Streaming

The key to safety lies in the encapsulation. The key to scalability lies in how messaging is actually done.

Alan Kay


Figure 9.1 The Design Phase offers several options for API styles. This chapter covers asynchronous API design.

Most discussions that surround web-based APIs center on synchronous, request/response interaction styles common with REST, query-based APIs, and RPC. They are easy to understand and approachable for developers and non-developers with minimal experience working with HTTP.

Yet, synchronous APIs have their limitations. The API server is unable to inform interested parties about changes in the representation of a resource or to notify them when a workflow between multiple parties has completed. Instead, the client must initiate the interaction with the API server before receiving any notifications.

Asynchronous APIs, or async APIs, unlock the full potential of a digital product or platform. They extend the API conversation from client-originated to server-originated, allowing clients to react to an event rather than start a conversation. New capabilities may be built based upon a single type of event notification. And all of this may be done without the involvement of the team that owns an API.

Including async API design as part of an overall API design effort empowers teams to craft new solutions based on notifications or data streams. But it takes a few considerations to unlock the full potential of an async API. This chapter presents some of the challenges and design patterns around designing async APIs. It also demonstrates how to design and document an async API by building upon the previous API modeling steps outlined in Chapter 6.

The Problem with API Polling

If an API client wishes to know when new data is available, it must periodically check with the server to see if any new resources have been added or existing resources have been modified. This pattern is known as API polling and is a common solution for clients that need to become aware of new resources or modifications to existing resources.

API polling is flexible and may be implemented by the client on top of just about any API that uses a request-response style. However, API polling isn’t an ideal solution. Coding the logic necessary to detect and track modifications is complex, wasteful, and can result in a poor user experience. The API client must send a GET request to a resource collection to fetch the latest list of resources, compare the list to the last one retrieved, and determine whether anything new has been added. Some APIs offer an operation that provides recent changes based on a timestamp since the last request, but it is still up to the API client to continue polling to determine when changes have been made.

Yet, many developers are forced to build API polling code to constantly check for changes in server-side state. Building polling code includes additional challenges to the developer:

■ The API sends responses back with default, non-optimal sorting, e.g., oldest-to-newest. The consumer must then request all entries to find out if anything new is available, often keeping a list of the IDs already seen to determine what is new

■ Rate limiting may prevent making requests at the desired intervals to detect change in a timely fashion

■ The data offered by the API doesn’t provide enough details for the client to determine if a specific event has occurred, such as a resource modification
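The change-detection bookkeeping described above might be sketched as follows, where `fetch_tasks` is a hypothetical stand-in for a GET request against a resource collection:

```python
def detect_new_resources(fetch_tasks, seen_ids):
    """Poll a collection and return resources not seen before.

    fetch_tasks: callable returning the full list of resources (dicts with an 'id').
    seen_ids: set of IDs already processed by this client.
    """
    current = fetch_tasks()  # one GET per polling interval, even when nothing changed
    new_items = [r for r in current if r["id"] not in seen_ids]
    seen_ids.update(r["id"] for r in current)
    return new_items

# Two polling cycles against a hypothetical API: the client re-fetches
# everything and keeps a list of IDs already seen just to find one new item.
seen = set()
first = detect_new_resources(lambda: [{"id": 1}, {"id": 2}], seen)
second = detect_new_resources(lambda: [{"id": 1}, {"id": 2}, {"id": 3}], seen)
```

Even this simplified sketch must retain state between polls and re-process the entire collection each cycle, illustrating why polling wastes client and server resources.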

The ideal situation is to have servers inform any interested API consumers about new data or recent events. However, this isn't possible with traditional request-response API styles common with HTTP as API clients are required to submit a request before the API server can communicate any changes.

Async APIs help address this need. Rather than constantly polling and implementing change detection rules on the API client, API servers send asynchronous push notifications to interested API clients when something on the server has changed. This opens a whole new series of possibilities compared to traditional, synchronous web-based APIs that are rooted in HTTP request/response.

Async APIs Create New Possibilities

As discussed in Chapter 1, APIs provide interfaces to data and behavior to deliver digital capabilities, typically over HTTP. Digital capability examples include a customer profile search, customer registration, and attaching a customer profile to an account. These digital capabilities are combined to create API products and API platforms that empower business units within an organization, partners, and customers to create new outcomes.

Async APIs are digital capabilities as well. They go beyond traditional REST-based web APIs to open new possibilities for digital business:

Reacting to business events in real-time: Solutions can react to internal state changes and critical business events when they happen.

Extending the value of solutions with message streams: Additional value is unlocked from existing solutions and APIs. New opportunities emerge to take advantage of internal events by surfacing them alongside the capabilities offered by their APIs. New solutions are built on top of existing APIs through an event-driven interaction style.

Improving API Efficiency: Constant API polling is no longer needed to check for state changes. This reduces the resources required to support an API by pushing state change events to those interested, thereby reducing infrastructure costs.

Case Study: GitHub Webhooks Created a New CI/CD Marketplace

GitHub Webhooks have been around for some time, allowing teams to be notified when new code has been pushed to a GitHub-hosted repository. While Git supports writing scripts to react to these kinds of events within a source code repository, GitHub was one of the first vendors to turn these script-based hooks into Webhooks. Any individual or organization hosting their code with GitHub could be notified, via an HTTP-based POST, when new code was available and trigger a new build process.

Over time, continuous integration and delivery (CI/CD) tools that were previously restricted to on-premises installation could now be offered via a software-as-a-service (SaaS) model. These solutions would be granted permission to receive the Webhook-based notification and start a new build process.

This one async API notification ultimately created an entire SaaS market of hosted CI/CD tools. That is the power of async APIs.

Before the full potential of async APIs can be unlocked, it is important to understand messaging fundamentals.

A Review of Messaging Fundamentals

Messages contain data that are published by a message producer to a message receiver. Receivers may be a local function or method, another process on the same host, a process on a remote server, or middleware such as a message broker.

There are three common types of messages: commands, replies, and events:

■ A command message requests that work be done immediately or in the future. Command messages are often imperative: CreateOrder, RegisterPayment, etc. Command messages are sometimes referred to as request messages.

■ A reply message provides the result, or outcome, of a command message. Reply messages often add the suffix "Result" or "Reply" to differentiate them from their command counterparts: CreateOrderReply, RegisterPaymentResult, etc. Reply messages are also referred to as response messages. Not all command messages result in a reply message.

■ An event message tells the receiver about something that happened in the past. A good event name represents an action in the past: OrderCreated, PaymentSubmitted, etc. Event messages are typically used when a business event has occurred, a workflow state has changed, or data has been created or modified.
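As an illustration, the three message types might be represented as JSON payloads like these hypothetical sketches (the names and identifiers are invented):

```python
import json

# A command message: imperative, requesting work to be done.
command = {"messageType": "CreateOrder", "messageId": "cmd-100",
           "customerId": "330001003", "items": [{"sku": "A1", "quantity": 2}]}

# A reply message: carries the outcome, correlated back to the command.
reply = {"messageType": "CreateOrderReply", "correlationId": "cmd-100",
         "result": "success", "orderId": "ord-500"}

# An event message: past tense, describing something that already happened.
event = {"messageType": "OrderCreated", "messageId": "evt-200",
         "orderId": "ord-500", "occurredAt": "2020-01-14T02:56:45Z"}

wire_format = json.dumps(event)  # messages are serialized before publishing
```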

Messages Are Immutable

It is important to note that messages are immutable. Once they are published, they may not be modified. Therefore, messages that require modification must be republished as a new message. If necessary, include a correlation identifier to map the new message to the original message.

Figure 9.2 shows an example of each kind of message and the context that it provides.


Figure 9.2 Examples of the three primary types of messages.

Messaging Styles and Locality

An application or service may choose from one or more styles of messaging:

Synchronous messaging involves the message producer sending a message and waiting while the receiver processes it and returns a reply.

Asynchronous messaging allows the message producer and receiver to operate in their own time, rather than waiting upon one another. The message producer sends the message to the receiver, but the receiver may not be able to process it immediately. The message producer is free to perform other tasks while waiting for a reply from the message receiver.

Additionally, messages may be exchanged across different localities:

Local messaging assumes that messages are sent and received within the same process. As such, the programming language and host will be the same as well. The Smalltalk programming language was built to support sending and receiving messages between objects. Actor-based frameworks, such as Vlingo, also support this kind of messaging. A "mailbox" sits between the code that produces the message and the code that will process the message. The consumer code processes each message as soon as possible, sometimes using threads or dedicated CPU cores to process multiple messages in parallel.

Inter-process messaging exchanges messages between separate processes on the same host. Examples include Unix sockets or dynamic data exchange (DDE).

Distributed messaging involves two or more hosts for messaging. Messages are transmitted over a network using the desired protocol. Examples of distributed messaging include message brokers using AMQP, MQTT, SOAP-based web services, REST-based APIs, etc.

The combination of synchronous or asynchronous messaging styles, along with the locality of the messaging determines the possibilities of a message-based solution.

The Elements of a Message

When a discussion around message design emerges, most of the focus is on the message body. The message body is usually in a structured format, such as JSON or XML, though binary or plain text are also valid. Some organizations choose to wrap the message body within a message envelope that contains useful metadata about the message contents and the message publisher.

There is more to a message than just the message body, however. Messages may also include transport protocol semantics. Network protocols such as HTTP, MQTT, or AMQP include message headers with details such as: creation timestamps, time-to-live, priority/quality of service, etc. A message is not fully described unless it includes all necessary information to process the message over the protocol. Figure 9.3 demonstrates the elements of each message exchanged between an API client and API server for a REST-based API.


Figure 9.3 A REST API example that shows the elements of the request and response messages exchanged between the API client and the API server.

Understanding Messaging Brokers

Message brokers act as an intermediary between message producers and message receivers. The result is a more loosely coupled design, as producers are only aware of the message broker but not the components ultimately receiving the messages. Examples of message brokers include RabbitMQ, ActiveMQ, and Jmqtt.

Message brokers also offer additional features such as:

■ Transactional boundaries that ensure messages are only published or marked as delivered if the transaction is committed

■ Durable subscriptions that store messages prior to dispatching to message receivers. Undeliverable messages, perhaps due to the message receiver being offline, are stored on the client’s behalf until it reconnects (i.e., the store-and-forward pattern)

■ Client acknowledgement mode, which specifies how a message is considered acknowledged by the client to provide flexibility in balancing performance with failure recovery. A message is considered dispatched successfully either 1) automatically upon delivery or 2) upon client acknowledgement that the message was processed successfully

■ Detection of message processing failures by dispatching messages to a different receiver in the event of a failure or outage of a message receiver, based on the client acknowledgement mode configuration

■ A dead letter queue (DLQ) that stores messages that could not be processed by message receivers due to unrecoverable errors, allowing automated or manual review and processing of failed message deliveries

■ Message priority and time-to-live (TTL) that assist the message broker in prioritizing the delivery of messages and removing unprocessed messages if they exceed a specific period of time without being processed

■ Standards-based connectivity through the AMQP protocol, along with optimized protocols for Java via JMS and other language bindings
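The client acknowledgement and dead letter queue behaviors above can be illustrated with a small simulation. This is a sketch of broker-side logic, not a real broker client; the retry limit and handler are invented for the example:

```python
from collections import deque

def dispatch_with_dlq(messages, handler, max_attempts=3):
    """Simulate broker redelivery: a message is redelivered until the handler
    acknowledges it (returns True); after max_attempts it moves to the DLQ."""
    queue = deque((msg, 0) for msg in messages)
    dead_letter_queue = []
    delivered = []
    while queue:
        msg, attempts = queue.popleft()
        if handler(msg):  # client acknowledgement: processed successfully
            delivered.append(msg)
        elif attempts + 1 >= max_attempts:
            dead_letter_queue.append(msg)  # unrecoverable: park for manual review
        else:
            queue.append((msg, attempts + 1))  # redeliver later
    return delivered, dead_letter_queue

ok, dlq = dispatch_with_dlq(["good", "bad"], lambda m: m == "good")
```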

Message brokers offer two methods of message distribution: point-to-point and fanout.

Point-to-point message distribution (queues)

Point-to-point messaging allows a publisher to send a message to a single subscriber selected from a pool of registered subscribers. The broker is responsible for selecting the subscriber that will receive the published message for processing via a round robin or similar selection process. Only one subscriber will receive a message published to the queue. If the subscriber fails to process the message within a given timeout period, the broker will select a new subscriber for message processing. Figure 9.4 demonstrates an example of a point-to-point queue.


Figure 9.4 A point-to-point queue that dispatches each message to a single message receiver subscribed to the queue.

Point-to-point queues are useful for publishing command messages that should have only one consumer processing a message at a time to ensure consistency, predictability, and to avoid duplicate message processing. This is a common pattern for background job processing, where each job should only be processed once by a pool of workers.
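The round-robin selection described above can be sketched in a few lines; the worker names are hypothetical:

```python
from itertools import cycle

def point_to_point_dispatch(messages, subscribers):
    """Each message goes to exactly one subscriber, selected round-robin,
    so no message is processed more than once."""
    assignments = {name: [] for name in subscribers}
    chooser = cycle(subscribers)
    for msg in messages:
        assignments[next(chooser)].append(msg)
    return assignments

jobs = point_to_point_dispatch(["job-1", "job-2", "job-3"],
                               ["worker-a", "worker-b"])
```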

Fanout message distribution (topics)

Fanout messaging distributes every message published to a topic to every subscriber currently registered. The broker doesn’t track whether the message was processed by all subscribers, or just a subset of them. Unlike point-to-point queues, a message is delivered to all subscribers rather than to a single one.


Figure 9.5 A fanout topic that dispatches each message to all subscribed message receivers.

All topic subscribers will receive a copy of each published message. This supports a variety of processing logic for each published event message. Subscribers are not aware of each other or the publisher, only that a new message has been sent to them for processing.
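Fanout delivery can be sketched the same way, with each hypothetical subscriber receiving its own copy of every message:

```python
def fanout_dispatch(messages, subscribers):
    """Every subscriber receives a copy of every message published to the topic."""
    return {name: list(messages) for name in subscribers}

# Three independent subscribers each process the same event in their own way.
copies = fanout_dispatch(["OrderCreated"], ["billing", "shipping", "analytics"])
```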

A Note About Message Broker Terminology

The use of queues and topics within this chapter is commonly found in resources about distributed messaging. Keep in mind that some vendors, such as RabbitMQ, offer more distinct options for topics. Options range from general broadcasting of messages, termed fanout, to selective broadcasting, which they term topics. Be sure to read the vendor documentation carefully to understand the terminology they prefer to achieve the desired goals.

Message Streaming Fundamentals

Message brokers are most often transactional and are designed to manage the state of durable subscriptions for failure recovery of offline receivers. While useful for a number of application and integration solutions, transactional support and other characteristics limit the scalability of traditional message brokers.

Message streaming builds upon the decades of message broker expertise but shifts some responsibilities away from the server while adding new capabilities to address today's complex data and messaging needs. It uses a fanout pattern for push notification of new messages to one or more subscribers, much like message broker topics. Examples of streaming servers include Apache Kafka, Apache Pulsar, and Amazon Kinesis.

Unlike message brokers, subscribers may request messages from any point in the topic's available message history. This allows for the replay of messages or for simply picking up where processing previously left off. In addition, most streaming servers shift state management from the server to the client. The client is responsible for tracking the last message seen, and error recovery is also pushed to the client, which must resume processing at the last known message.

Support for this style of interaction is accomplished by shifting message management from a traditional queue or topic to an append-only log. These logs may store all messages or limit the history of messages to a specified retention period. A topic using a distributed log with two consumers is shown in Figure 9.6.


Figure 9.6 A streaming topic comprised of a distributed log of recorded messages, consumed by two separate consumers who have two separate offsets to reflect their current message.
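The append-only log and per-consumer offsets shown in Figure 9.6 can be modeled in a few lines. This is a minimal sketch; real streaming servers partition, replicate, and persist the log:

```python
class StreamTopic:
    """Minimal append-only log: the server stores messages; each consumer
    tracks its own offset and may replay from any point in the history."""
    def __init__(self):
        self.log = []

    def publish(self, message):
        self.log.append(message)  # messages are only ever appended

    def read_from(self, offset):
        return self.log[offset:]  # consumer resumes (or replays) at its own offset

topic = StreamTopic()
for event in ["evt-0", "evt-1", "evt-2"]:
    topic.publish(event)

full_replay = topic.read_from(0)   # consumer A replays the full history
resumed = topic.read_from(2)       # consumer B picks up where it left off
```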

Since clients are able to specify the offset of where they wish to start, they are able to solve new kinds of problems that were not possible with message brokers:

■ Near real-time data processing and data analytics as soon as incoming data is received from other systems or third-parties due to the highly scalable and low latency design of message streaming servers

■ Use historical messages to verify the results of code changes prior to pushing new code to production

■ Execute experimental data analytics against historical messages

■ Remove the need to subscribe to all message broker queues and topics in an effort to store all messages processed by a message broker for auditing purposes

■ Push data into a data warehouse or data lake for consumption by other systems, without the need for traditional extract-transform-load (ETL) processes

The higher scalability of message streaming lends itself to a shift in the way data is managed and shared. Rather than sharing access to a data store or replicating the data store, each new or modified data event message is pushed to a topic stream. Any consumers are then able to process the data change, including storing it locally for caching or for further analysis.

Message Streaming Considerations

There may be certain circumstances when message streaming may not be the best option:

■ Duplicate message processing. Subscribers must keep track of their current location in the stream. Therefore, duplicate message processing must be expected and handled. This may be the case if the current location was not able to be stored prior to a failure.

■ No message filtering. Message brokers support filtering messages on a queue or topic based on specific values. Message streaming does not support this "out of the box". Instead, it requires receivers to process all messages from a given offset, or to apply a third-party solution, such as Apache Spark.

■ Authorization is limited. Since message streaming is relatively new, fine-grained authorization control and filtering is limited or non-existent in today's solutions. Be sure to verify authorization needs are satisfied by the chosen vendor before proceeding. Solutions are beginning to emerge that bridge streams with REST, which may allow API gateways to apply more rigorous authorization strategies.
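The duplicate message processing consideration above is typically handled by making consumers idempotent. A minimal sketch, keyed on a unique event identifier (the field name is hypothetical):

```python
def make_idempotent(handler):
    """Wrap a message handler so redelivered messages are processed only once,
    keyed on a unique message ID. Returns False for skipped duplicates."""
    processed_ids = set()
    def wrapper(message):
        if message["eventId"] in processed_ids:
            return False  # duplicate delivery: skip
        processed_ids.add(message["eventId"])
        handler(message)
        return True
    return wrapper

results = []
handle = make_idempotent(lambda m: results.append(m["eventId"]))
handle({"eventId": "evt-1"})
handle({"eventId": "evt-1"})  # redelivered after a failure; ignored
```

In production, the set of processed IDs would need to be persisted so the guarantee survives consumer restarts.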

Async API Styles

Async APIs are an API interaction style that allows the server to inform the consumer when something has changed. There are a variety of API styles that support asynchronous APIs: webhooks, Server-Sent Events (SSE), and WebSockets are the most common.

Server Notification Using Webhooks

Webhooks allow API servers to publish notifications to other interested servers when an event has occurred. Unlike traditional callbacks, which occur within the same codebase, webhooks occur over the web, using an HTTP POST. The term webhooks was coined by Jeff Lindsay in 2007. Since then, the REST Hooks patterns have been developed to offer a standard way to manage and secure webhook subscriptions and notifications.

Webhooks are dispatched when the API server sends a POST request to a URL that is provided by the system wishing to receive the callbacks. For example, an interested subscriber may register to receive new task event notifications on a specific URL they provide, e.g. https://myapp/callbacks/new-tasks. The API server then sends a POST request to each subscriber's callback URL with a message containing the event details. The full sequence is shown in Figure 9.7.


Figure 9.7 An API server’s webhook dispatcher that sends a message to each registered URL that wishes to receive the callback using HTTP POST.

Webhook recipients must be network accessible to the API server and must host an HTTP server of their own to receive the POST requests. As such, webhooks are a good fit for server-to-server communication between systems but are not useful for browser and mobile applications.

Implementing Webhooks Effectively

Webhooks require a variety of considerations, including handling delivery failures, securing communications between client and server, and callbacks that take too long to acknowledge the notification. Refer to the REST Hooks documentation for tips on implementing webhook servers effectively.
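One common practice for securing webhook communications is signing each payload with a shared secret so the receiver can verify authenticity. A minimal sketch using Python's standard library (the sha256= prefix follows the convention popularized by GitHub's webhook signature headers):

```python
import hashlib
import hmac
import json

def sign_webhook(payload: bytes, secret: bytes) -> str:
    """Compute an HMAC-SHA256 signature for an outgoing webhook payload.
    The receiver recomputes it with the shared secret to verify the sender."""
    return "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()

def verify_webhook(payload: bytes, secret: bytes, signature_header: str) -> bool:
    expected = sign_webhook(payload, secret)
    return hmac.compare_digest(expected, signature_header)  # constant-time compare

body = json.dumps({"eventType": "task.created", "taskId": "42"}).encode()
sig = sign_webhook(body, b"shared-secret")
```

The signature would typically travel in an HTTP header alongside the POST body, and the receiver rejects any request whose signature fails verification.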

Server Push Using Server-Sent Events

Server-sent events, or SSE for short, is based on the EventSource browser interface standardized as part of HTML5 by the W3C. It defines the use of HTTP to support longer-lived connections to allow servers to push data back to the client. These incoming messages contain event details that are useful to the client.

SSE is a simple solution that supports server-push notification while avoiding the challenges of API polling. While SSE was originally designed to support pushing data to a browser, it is becoming a more popular way to push data to a mixture of browsers and server-side subscribers.

SSE uses a standard HTTP connection but holds onto the connection for a longer period of time rather than disconnecting immediately. This connection allows API servers to push data back to the client when it becomes available.

The specification outlines a few options for the format of the data coming back, allowing for event names, comments, single or multi-line text-based data, and event identifiers.

Subscribers submit a request to the SSE operation using a GET with the media type of text/event-stream. Through content negotiation, existing operations are therefore able to support both standard request-response interactions, using JSON, XML, and other supported media types, and SSE-based streaming. Clients interested in using SSE may do so by specifying the SSE media type instead of JSON or XML in the Accept request header.

Once connected, the server will then push new events, separated by a newline. If the connection is lost for any reason, the client is able to re-connect to start receiving new events. Clients may provide the Last-Event-ID HTTP header to recover any missed events since the last event ID seen by the client. This is useful for failure recovery.


Figure 9.8 Using server-sent events (SSE) to allow API servers to push events to the client over a long-lived connection. Connections may be resumed using the Last-Event-ID request header.

The format for the data field may be any text-based content, from simple data points to single-line JSON payloads. Multiple lines may be provided using multiple data prefixed lines.
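A minimal parser for the event-stream format described above might look like the following sketch, which handles event names, identifiers, comment lines, and multi-line data fields:

```python
def parse_sse_stream(raw: str):
    """Parse a text/event-stream body into a list of events. Each event is a
    dict of its fields (event, data, id); events are separated by a blank
    line, and multiple 'data:' lines are joined with newlines."""
    events = []
    current = {}
    for line in raw.splitlines():
        if not line:  # blank line: dispatch the accumulated event
            if current:
                events.append(current)
                current = {}
            continue
        if line.startswith(":"):  # comment line, often sent as a keep-alive
            continue
        field, _, value = line.partition(":")
        value = value.lstrip(" ")
        if field == "data":
            existing = current.get("data", "")
            current["data"] = existing + ("\n" if existing else "") + value
        else:
            current[field] = value
    if current:
        events.append(current)
    return events

stream = 'event: task.created\nid: 42\ndata: {"taskId": "42"}\n\n'
events = parse_sse_stream(stream)
```

A client would remember the last `id` seen and send it back in the Last-Event-ID header when reconnecting.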

SSE supports several use cases:

■ State change notifications to a front-end application, such as a browser or mobile app, to keep a user interface in sync with the latest server-side state

■ Receiving business events over HTTP, without requiring access to an internal message broker or streaming platform such as RabbitMQ or Kafka

■ Enabling clients to process data incrementally, rather than all-at-once, by streaming the results of long-running queries or complex aggregations as they become available

SSE does have a few cases where it may not be a fit:

■ The API gateway isn’t capable of handling long-running connections or has a brief timeout period (e.g. less than 30 seconds). While this isn’t a showstopper, it will require the client to reconnect more often.

■ Some browsers do not support SSE. Refer to Mozilla’s list of compatible browsers for more information

■ Bi-directional communication between client and server is required. In this case, WebSockets may be a better option as SSE is server push only

The W3C SSE specification is easy to read and offers additional specifications and examples.

Bi-Directional Notification via WebSockets

WebSockets support the tunneling of a full-duplex protocol, called a sub-protocol, within a single TCP connection that is initiated using HTTP. Because they are full-duplex, bi-directional communication becomes possible between API clients and servers. Clients are able to push requests to the server over a WebSocket, all while the server is able to push events and responses back to the client.

WebSockets is a standardized protocol maintained by the Internet Engineering Task Force as RFC 6455. Most browsers support WebSockets, making it easy to use for browser-to-server, server-to-browser, and server-to-server scenarios. Since WebSockets are tunneled through HTTP connections, they can also overcome proxy restrictions found in some organizations.

An important factor to keep in mind is that WebSockets don't behave like HTTP, even though they use HTTP to initiate the connection. Instead, a sub-protocol must be selected. There are many subprotocols officially registered with IANA. WebSockets support both text and binary format sub-protocols. Figure 9.9 shows an example WebSocket interaction using a plain text sub-protocol.


Figure 9.9 An example interaction between an API client and server using WebSockets and a plain text subprotocol to create a chat application.

WebSockets are more complex to implement but support bi-directional communication, meaning that clients can send data to the server as well as receive data pushed from the server over the same connection. While SSE is easier to implement, clients are not able to send requests on the same connection, making WebSockets a better option when full-duplex communication is necessary. Keep this in mind when choosing an async API style.

gRPC Streaming

The TCP protocol is optimized for long-lived, bi-directional communication. HTTP/1.1 was built on top of TCP but required multiple connections for clients to achieve concurrency. While this multi-connection requirement is easy to load balance, and therefore scale, it has a considerable impact, as each connection requires establishing a new TCP socket and negotiating the protocol.

HTTP/2 is a newer standard, built upon the work of Google's SPDY protocol, that optimizes portions of HTTP/1.1. Part of this optimization includes request and response multiplexing, in which a single HTTP/2 connection is used for one or more simultaneous requests. This avoids the overhead of creating new connections for each request, similar to how HTTP/1.1 supports keep-alive connections. However, HTTP/2 multiplexing allows all requests to be sent at once, rather than sequentially over a keep-alive connection.

Additionally, HTTP/2 servers may push resources to the client, rather than requiring the client to initiate the request. This is a considerable shift from the traditional web-based request/response interaction style of HTTP/1.1.

gRPC takes advantage of HTTP/2’s bi-directional communication support, removing the need to separately support request/response alongside WebSockets, SSE, or other push-based approaches on top of HTTP/1.1. Since gRPC supports bi-directional communication, async APIs can be designed and integrated alongside traditional request/response RPC methods using the same gRPC-based protocol.

There are three options for gRPC streaming: client-to-server, server-to-client, and bi-directional. This is illustrated in Figure 9.10.


Figure 9.10 The three gRPC-based streaming options available: client-to-server, server-to-client, and bi-directional.

Like WebSockets, gRPC can send and receive messages and events across a single, full-duplex connection. Unlike WebSockets, there are no subprotocol decisions to be made and supported as gRPC uses Protocol Buffers by default. However, browsers have no built-in gRPC support. The grpc-web project is working on bridging gRPC to browsers, but with limitations. Therefore, gRPC streaming is often limited to service-to-service interactions.

Selecting an Async API Style

While there are several choices available for async APIs, it is important to note that some choices may be a better option than others, depending on the circumstances and constraints of the solution. Below are some considerations for each async API style to help teams determine which style(s) may be the best options for an API:

Webhooks: Webhooks are the only async API style that may be server-originated, i.e., it doesn’t require the client to initiate a connection first. Since subscriptions require being able to receive a POST-based callback, use Webhooks when server-to-server notification is needed. Web browsers and mobile apps are unable to take advantage of Webhooks as they cannot establish an HTTP server to receive the callback. Subscribers that have inbound communication restricted by a firewall will not be able to receive the callback as there won’t be a network path to the callback server.

Server-sent Events (SSE): SSE is typically the easiest to implement on the server and client side, but has limited browser support. It also lacks bi-directional communication between client and server. Use SSE when there is a need for the server-push of events that follows RESTful API design.

WebSockets: WebSockets are more complex to implement due to the need to support one or more sub-protocols, but they support bi-directional communication. They are also more broadly supported across browsers.

gRPC streaming: gRPC takes full advantage of HTTP/2, so all infrastructure and subscribers must be able to support this newer protocol to take full advantage of gRPC streaming. Like WebSockets, it offers bi-directional communication. gRPC isn't supported fully by browsers, so gRPC streaming is best suited for service-to-service communication or for APIs that manage and configure infrastructure.

Designing Async APIs

Designing async APIs is similar to the process used to design traditional request/response APIs using a REST, RPC, or query-based style. Begin with the resources identified during the API modeling step, as outlined in Chapter 6. Revisit the events identified while capturing the operation details of each API profile. Then, determine what commands and events would be beneficial for API consumers.

Command Messages

Command messages incorporate all of the details necessary to request another component to perform a unit of work. When designing commands for async APIs, it is important to design the command message with sufficient details to process the request. It may also include a target location where the result message may be published. This target location may be a URL to POST the results, a URI to a message broker topic, or perhaps a URL to a shared object store such as Amazon S3.

When designing commands, it may be easy to use built-in language mechanisms such as object serialization to simplify the development of the command producer and consumer. However, this will limit the systems that will be able to consume and process these commands. Instead, seek to use a language-agnostic message format, such as the UBER hypermedia format, Apache Avro, Protocol Buffers, JSON, or XML.

The following is an example JSON-based command message to request a customer’s billing address to be updated asynchronously:

{
  "messageType": "customerAddress.update",
  "requestId": "123f4567",
  "updatedAt": "2020-01-14T02:56:45Z",
  "customerId": "330001003",
  "newBillingAddress": {
      "addressLine1": "...",
      "addressLine2": "...",
      "addressCity": "...",
      "addressState": "...",
      "addressRegionProvince": "...",
      "addressPostalCode": "..."
   }
}

An additional replyTo field could be provided with the URL for callback, or other subscribers could listen for a customerAddress.updated event to react to the change, perhaps updating the billing address in a third-party system.
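The command message above can be assembled and serialized in code. The following is a minimal sketch in Python; the function name, the reply_to parameter, and the callback URL are illustrative assumptions, while the field names match the example message:

```python
import json
from datetime import datetime, timezone

def build_update_address_command(request_id, customer_id, new_address, reply_to=None):
    """Build a customerAddress.update command message like the example above.

    The optional reply_to argument is a hypothetical callback URL where
    the result message may be published.
    """
    command = {
        "messageType": "customerAddress.update",
        "requestId": request_id,
        "updatedAt": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "customerId": customer_id,
        "newBillingAddress": new_address,
    }
    if reply_to is not None:
        command["replyTo"] = reply_to
    # Serialize to a language-agnostic format (JSON) rather than using
    # language-specific object serialization, so any consumer can process it.
    return json.dumps(command)
```

Because the producer emits plain JSON, consumers written in any language can parse and process the command without sharing code with the producer.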

Event Notifications

Event notifications, sometimes referred to as “thin events,” notify subscribers that a state change or business event has occurred. They provide only enough information for the subscriber to determine whether the event is of interest.

The event subscriber is responsible for fetching the latest representation of the details via an API to avoid using stale data. Providing hypermedia links as part of a thin event helps integrate API operations for retrieving the latest resource representation(s) with async APIs such as events. This is shown in the following example event payload:

{
  "eventType": "customerAddress.updated",
  "eventId": "123e4567",
  "updatedAt": "2020-01-14T03:56:45Z",
  "customerId": "330001003",
  "_links": [
    { "rel": "self", "href":"/events/123e4567" },
    { "rel": "customer", "href":"/customers/330001003" }
  ]
}

Thin events are used for events related to resources that change frequently, forcing the event subscriber to retrieve the latest resource representation to avoid working with stale data. While not necessary, thin events may also include details about the specific properties that changed when an update occurred to help consumers determine if the event is of interest.
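A thin-event subscriber typically follows the event's hypermedia links to retrieve the current resource representation. The sketch below assumes the payload shape shown above; the fetch callable is a hypothetical HTTP client injected by the caller:

```python
def link_for(event, rel):
    """Find the href for a given link relation in a thin event's _links array."""
    for link in event.get("_links", []):
        if link.get("rel") == rel:
            return link["href"]
    return None

def handle_customer_address_updated(event, fetch):
    """Handle a thin event by fetching the latest customer representation.

    `fetch` is a hypothetical callable (e.g. wrapping an HTTP GET) so the
    subscriber works with current data rather than potentially stale
    event details.
    """
    # Ignore events that are not of interest to this subscriber.
    if event.get("eventType") != "customerAddress.updated":
        return None
    customer_url = link_for(event, "customer")
    return fetch(customer_url) if customer_url else None
```

Injecting the fetch function keeps the handler testable and independent of any particular HTTP library.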

Event-Carried State Transfer Events

Event-carried state transfer events contain all available information at the time of the event. This avoids the need to contact an API for the complete resource representation, although additional APIs may be used to augment the data required by the subscriber to perform any processing.

There are a few reasons why event-carried state transfer may be preferred over thin events:

■ Subscribers want a snapshot of the resource associated with the event, rather than the few details and associated hypermedia links offered by thin events

■ Data state changes are using message streaming to support replaying message history, requiring a complete point-in-time snapshot of a resource

■ Messaging via async APIs is used for inter-service communication, requiring the publication of a full resource representation to avoid increased API traffic and tighter coupling between services

It is common for this style of message design to mimic API representation structures whenever possible. Deviation is common when the event must offer the old and new values of any modified properties on an update event.

Finally, use nested rather than flat structures to group related properties for medium-to-large payloads. This helps drive evolvability, as property names are scoped to the parent property, avoiding property name collisions or long property names needed to clarify relationships. The following demonstrates the flat structure approach to an event-carried state transfer message:

{
  "eventType": "customerAddress.updated",
  "eventId": "123e4567",
  "updatedAt": "2020-01-14T03:56:45Z",
  "customerId": "330001003",
  "previousBillingAddressLine1": "...",
  "previousBillingAddressLine2": "...",
  "previousBillingAddressCity": "...",
  "previousBillingAddressState": "...",
  "previousBillingAddressRegionProvince": "...",
  "previousBillingAddressPostalCode": "...",
  "newBillingAddressLine1": "...",
  "newBillingAddressLine2": "...",
  "newBillingAddressCity": "...",
  "newBillingAddressState": "...",
  "newBillingAddressRegionProvince": "...",
  "newBillingAddressPostalCode": "...",
  ...
}

A more structured approach is demonstrated in the following example:

{
  "eventType": "customerAddress.updated",
  "eventId": "123e4567",
  "updatedAt": "2020-01-14T03:56:45Z",
  "customerId": "330001003",
  "previousBillingAddress": {
      "addressLine1": "...",
      "addressLine2": "...",
      "addressCity": "...",
      "addressState": "...",
      "addressRegionProvince": "...",
      "addressPostalCode": "..."
  },
  "newBillingAddress": {
      "addressLine1": "...",
      "addressLine2": "...",
      "addressCity": "...",
      "addressState": "...",
      "addressRegionProvince": "...",
      "addressPostalCode": "..."
  },
  ...
}

When applying structured composition to the event-carried state transfer style, the consumer is able to reuse value objects to contain the details of each nested object and easily detect differences in fields or visualize the changes within a user interface at a later date. Without the pattern, a large value object plus additional coding effort is required to associate the flattened fields for performing things such as detecting a difference between the previous and new address.
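With the nested structure, detecting what changed between the previous and new address becomes a straightforward comparison of two value objects. The following is a sketch of such a comparison; the function name is illustrative:

```python
def changed_fields(previous, new):
    """Compare two nested value objects (e.g. previousBillingAddress vs.
    newBillingAddress) and return only the properties that differ,
    with their previous and new values."""
    keys = set(previous) | set(new)
    return {
        key: {"previous": previous.get(key), "new": new.get(key)}
        for key in keys
        if previous.get(key) != new.get(key)
    }
```

Performing the same diff against the flattened structure would require extra code to pair each previousBillingAddress* property with its newBillingAddress* counterpart.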

Event Batching

While most async APIs are designed to notify a subscriber when each message is available, some designs may benefit from grouping events into a batch. Event batching requires that subscribers handle one or more messages within each notification. A simple example is to wrap the notification with an array and enclose each message within the response, even if there is only one event message at the time of notification:

[
  {
    "eventType": "customerAddress.updated",
    "eventId": "123e4567",
    "updatedAt": "2020-01-14T03:56:45Z",
    "customerId": "330001003",
    "_links": [
      { "rel": "self", "href":"/events/123e4567" },
      { "rel": "customer", "href":"/customers/330001003" }
    ]
  },
...,
...
]

Another design option is to provide an envelope that wraps each batch of events along with additional metadata about the batch:

{
 "meta": {
   "appId": "app-id-1234",
   ...
 },
 "events": [
  {
    "eventType": "customerAddress.updated",
    "eventId": "123e4567",
    "updatedAt": "2020-01-14T03:56:45Z",
    "customerId": "330001003",
    "_links": [
      { "rel": "self", "href":"/events/123e4567" },
      { "rel": "customer", "href":"/customers/330001003" }
    ]
  },
  ...,
  ...
  ]
}

Keep in mind that batching allows messages or events to be grouped based on a specific timeframe, a maximum number of events per batch, or other grouping factors.
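A count-based grouping rule can be sketched as a small helper. This is a simplified illustration; a production implementation might also flush a partial batch when a time window elapses:

```python
def batch_events(events, max_batch_size):
    """Group events into batches of at most max_batch_size.

    Every batch is delivered as an array, even when it holds only a
    single event, matching the batched notification shape shown above.
    """
    return [
        events[i:i + max_batch_size]
        for i in range(0, len(events), max_batch_size)
    ]
```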

Event Ordering

Most event-based systems deliver messages in order when possible. However, this is not always guaranteed. Event receivers may go offline and must recover missed messages while also accepting new inbound messages as they arrive. Or the message broker may be unable to guarantee ordered message delivery. In complex distributed systems, multiple brokers and/or message styles may be used in combination, making it difficult to keep messages in order.

When event ordering is necessary, it must be factored into the message design. For a single message broker, the broker may offer message sequence numbering or timestamp-based ordering using the timestamp of when the message was received. In distributed architectures, the timestamp cannot be trusted, as each host may have slight variations in system time, called clock skew. This requires a centralized sequence generation technique, with the generated sequence number assigned to each message.

Be sure to factor order needs into the message design and across various architectural decisions. This may include the need to research and understand distributed synchronization using techniques such as a Lamport Clock to overcome clock skew across distributed nodes while ensuring proper ordering of messages across hosts.
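To illustrate the idea, the following is a minimal Lamport clock sketch. It is a logical counter, not the wall clock, so it provides a consistent happened-before ordering across hosts despite clock skew; a real system would also persist the counter and attach it to each published message:

```python
class LamportClock:
    """Minimal Lamport clock: a logical counter providing a consistent
    ordering of messages across hosts despite clock skew."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event or message send: advance the logical clock.
        self.time += 1
        return self.time

    def receive(self, message_time):
        # On receive, jump ahead of the sender's timestamp if needed,
        # so causally later events always carry a larger timestamp.
        self.time = max(self.time, message_time) + 1
        return self.time
```

Stamping each message with the sender's tick() value lets any receiver order causally related messages without relying on synchronized system clocks.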

Documenting Async APIs

The AsyncAPI specification is a standard for capturing definitions of async messaging channels. AsyncAPI supports traditional message brokers, server-sent events (SSE), Kafka and other message streams, and internet of things (IoT) messaging such as MQTT. This standard is becoming popular as a single solution to define message schemas and the protocol binding specifics of message-driven protocols. It is important to note that this specification isn’t related to the OpenAPI Specification (OAS), but has been inspired by it and strives to follow a similar format to make adoption easier.

Listing 9.1 demonstrates an AsyncAPI description file with message definitions for the Shopping API’s notification events, modeled in Chapter 6.

Listing 9.1 An example AsyncAPI definition of Shopping API events


#
# Shopping-API-events-v1.asyncapi.yaml
#
asyncapi: 2.0.0
info:
  title: Shopping API Events
  version: 1.0.0
  description: |
    An example of some of the events published during the bookstore's
    shopping cart experience...
channels:
  books.searched:
    subscribe:
      message:
        $ref: '#/components/messages/BooksSearched'
  carts.itemAdded:
    subscribe:
      message:
        $ref: '#/components/messages/CartItemAdded'
components:
  messages:
    BooksSearched:
      payload:
        type: object
        properties:
          queryStringFilter:
            type: string
            description: The query string used in the search filter
          categoryIdFilter:
            type: string
            description: The category ID used in the search filter
          releaseDateFilter:
            type: string
            description: The release date used in the search filter
    CartItemAdded:
      payload:
        type: object
        properties:
          cartId:
            type: string
            description: The cartId where the book was added
          bookId:
            type: string
            description: The book ID that was added to the cart
          quantity:
            type: integer
            description: The quantity of books added

Keep in mind that the AsyncAPI specification also supports the addition of protocol bindings for each channel’s publish and subscribe messages. This flexibility allows the same message definition to be used across multiple messaging protocols, including message brokers, server-sent events, and message streams. Visit https://asyncapi.com for more information on the specification and additional resources to help get started using this async API description format. For example asynchronous API descriptions, refer to the API workshop examples available on GitHub.
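As a sketch of what a protocol binding might look like, the fragment below adds a hypothetical Kafka operation binding to the carts.itemAdded channel from Listing 9.1. The consumer group value is an illustrative assumption; consult the AsyncAPI bindings documentation for the authoritative field names and versions:

```yaml
channels:
  carts.itemAdded:
    subscribe:
      bindings:
        kafka:
          # Hypothetical consumer group constraint for subscribers
          groupId:
            type: string
            enum: ['cart-consumers']
          bindingVersion: '0.1.0'
      message:
        $ref: '#/components/messages/CartItemAdded'
```

The message definition itself remains unchanged; only the binding details vary when the same channel is exposed over a different protocol.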

Summary

Teams can benefit from shifting the API design approach from strictly request-response APIs to thinking in terms of how APIs can offer both synchronous request/response operations and events. These events enable the API to push notifications to other teams that can build entirely new capabilities and perhaps product offerings on top of the original API. The result will be increased innovation and more transformative APIs as part of an API product or API platform initiative.
