Chapter 6

Design Request and Response Message Representations

Having defined API endpoints and their operations in the previous chapter, we now look into the request and response messages that the API clients and providers exchange. These messages are a key part of the API contract; they make or break interoperability. Large and rich messages might be very informative, but they also add runtime overhead; small and terse messages might be efficient to transport, but they might not be understood easily and may cause clients to emit follow-up requests to fully satisfy their information needs.

We start with a discussion of related design challenges and then introduce patterns responding to these challenges. The patterns are presented in two sections, “Element Stereotypes” and “Special-Purpose Representations.”

This chapter corresponds to the Design phase of the Align-Define-Design-Refine (ADDR) process that we reviewed at the beginning of Part 2.

Introduction to Message Representation Design

API clients and providers exchange messages, usually represented in textual formats such as JSON or XML. According to our domain model introduced in Chapter 1, “Application Programming Interface (API) Fundamentals,” these messages contain content representations that may be rather complex. The basic structure patterns introduced in Chapter 4, “Pattern Language Introduction”—ATOMIC PARAMETER, PARAMETER TREE, ATOMIC PARAMETER LIST, and PARAMETER FOREST—help define the names, types, and nesting of these request and response message elements. In addition to the payload (or body) of the messages, most communication protocols offer other ways to transport data. For example, HTTP allows transmitting key-value pairs as headers but also as path, query, and cookie parameters.
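To make these transport locations concrete, the following minimal Python sketch builds (but does not send) an HTTP request that carries key-value pairs in the path, the query string, a header, and a cookie. The resource path, the parameter names, and the cookie value are hypothetical and serve only as an illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

customer_id = "c-42"                                # hypothetical path parameter
query = urlencode({"fields": "name", "limit": 1})   # hypothetical query parameters

# The request object is only constructed here; nothing is sent over the network.
request = Request(
    url=f"http://localhost:8110/customers/{customer_id}?{query}",
    headers={
        "Accept": "application/json",               # header parameter
        "Cookie": "session=abc123",                 # cookie parameter (hypothetical)
    },
    method="GET",
)

parts = urlsplit(request.full_url)
print("path:  ", parts.path)
print("query: ", parse_qs(parts.query))
print("header:", request.get_header("Accept"))
print("cookie:", request.get_header("Cookie"))
```

None of these key-value pairs appears in the message body; a message representation design has to decide which elements travel in the payload and which in such protocol-specific places.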

One might think that knowing about these different ways to exchange information is sufficient to design request and response messages. But if we look closer, we can detect recurring usage patterns in the message representation elements, leading to the following questions:

What are the meanings of the message elements? Can these meanings be stereotyped?

Which responsibilities within conversations do certain message elements have? Which quality goals do they help satisfy?

The patterns in this chapter answer these questions by first inspecting individual elements and then looking at composite representations for specific usage scenarios.

Challenges When Designing Message Representations

Two overarching themes for the patterns in this chapter are message size and conversation verbosity because these factors determine the resource consumption in API endpoints, the network, and clients. Security, as a cross-cutting concern, is also influenced. The following architectural decision drivers also have to be taken into account:

  • Interoperability on protocol and message content (format) level, as influenced by the communication platforms and the programming languages used by consumer and provider implementation (for example, during parameter marshaling and unmarshaling).

  • Latency from the API consumer/client point of view, determined, for instance, by the network infrastructure (its bandwidth and the latency of the underlying hardware in particular) and by the endpoint processing effort for marshaling/unmarshaling the payload and delivering it to the API implementation.

  • Throughput and scalability are primarily API provider concerns; response times should not degrade even if provider-side load grows because more clients use the API (or existing clients cause more load).

  • Maintainability, especially extensibility of existing messages, and the ability to deploy and evolve API clients and providers independently of each other. Modifiability is an important sub-concern of maintainability (for example, backward compatibility to promote parallel development and deployment flexibility).

  • Developer convenience and experience on both the consumer and the provider sides, defined in terms of function, stability, ease of use, and clarity (including learning and programming effort). The wants and needs of these two sides often conflict. For instance, a data structure that is easy to create and populate might be difficult to read; a compact format that is light in transfer might be difficult to document, prepare, understand, and parse.

For some of these concerns, their impact on representations appearing in APIs is obvious; for others, the relationship will become clear as we proceed through this chapter. We cover detailed forces in the individual pattern texts that follow.

Patterns in this Chapter

DATA ELEMENTS are the fundamental building blocks of any client-provider communication, representing domain model concepts in request and response messages. By exposing the Published Language of the API [Evans 2003] through an explicit schema, provider-internal data definitions are not unveiled, and communication participants can be decoupled as much as possible.

Some of these DATA ELEMENTS have special missions because certain communicating participants appreciate or require additional information that is not part of the core domain model. This is the purpose of METADATA ELEMENTS. Frequently used types of METADATA ELEMENTS are control metadata, provenance metadata, and aggregated metadata.

Questions of identity arise in different parts of the API: endpoints, operations, and elements inside messages may require identification to prevent misunderstandings between decoupled clients and providers. ID ELEMENTS can be used to distinguish communication participants and API parts from each other. ID ELEMENTS can be globally unique or be valid within a certain constrained context. When they are network accessible, ID ELEMENTS turn into LINK ELEMENTS. LINK ELEMENTS often come in the form of Web-style hyperlinks, for instance, when working with HTTP resource APIs.

Many API providers want to identify the communication participants from which they receive messages. Such identity information helps determine whether a message originates from a registered, valid customer or some unknown client. A simple approach is to instruct clients to include an API KEY in each request message that the provider evaluates to identify and authenticate the client.

Combinations of basic DATA ELEMENTS result in more complex structures. One such example is an ERROR REPORT, a common message structure comprising DATA ELEMENTS, METADATA ELEMENTS, and ID ELEMENTS to report communication and processing faults. ERROR REPORTS state what happened when and where but also have to make sure not to disclose provider-side implementation details.

Context information is often transmitted in application- or transport-protocol-specific places. Sometimes it is useful to assemble a CONTEXT REPRESENTATION out of METADATA ELEMENTS that can be placed in the message payload. Such representations may contain ID ELEMENTS, for instance, to correlate requests and responses or subsequent requests.

Figure 6.1 shows the patterns in the chapter and their relations.

Figure 6.1 Pattern map for this chapter: Element stereotypes and their relations to other patterns

Element Stereotypes

The four patterns expressing data responsibilities are DATA ELEMENT, METADATA ELEMENT, ID ELEMENT, and LINK ELEMENT. These element stereotypes give meaning to the parts of request and response message representations.

Pattern: DATA ELEMENT

When and Why to Apply

API endpoints and their operations have been identified on a high level of abstraction and refinement. For instance, in forward engineering, the key domain concepts to be exposed and their relationships have been elicited. In the context of system evolution and modernization, it has been decided to open up a system or provide a view on the content of a database or backend system via API endpoints and their operations.

An API “goals canvas” [Lauret 2019], an API “action plan” [Sturgeon 2016b], or another type of candidate endpoint list [Zimmermann 2021b] has been created; the operation signatures have been defined at least tentatively. The request and response message design is not yet finalized, however.

How can domain- and application-level information be exchanged between API clients and API providers without exposing provider-internal data definitions in the API?

The exchanged data may or may not be involved in reading and writing the provider-side application state and data in the API implementation. Such relations should not be visible to the client.

How can API client and API provider implementation be decoupled from a data management point of view?

In addition to the desire to promote loose coupling, the following competing forces concern whether data elements should be hidden behind the interface or be exposed (partially or fully):

  • Rich functionality versus ease of processing and performance: The more data and behavior is modeled and exposed in an API and its underlying domain model, the more data processing options for the communication participants arise. However, it also becomes increasingly complex to read and write to instances of the domain model elements accurately and consistently. Interoperability is at risk, and the API documentation effort increases. Remote object references and procedure invocation stubs might be convenient to program against and be supported in tools, but they quickly make the communication stateful. Statefulness, in turn, violates SOA principles and microservices tenets.

  • Security and data privacy versus ease of configuration: Letting a communication partner know many details about an application and its data introduces security threats such as the risk that data is tampered with. Extra data protection, on the other hand, can cause configuration and processing effort. Security-related information might have to travel with request and response payload and therefore becomes part of the technical part of the API DESCRIPTION.

  • Maintainability versus flexibility: The data contract and its implementations should be flexible to accommodate continuously changing requirements; however, any new feature and change to an existing feature has to be analyzed with respect to compatibility issues and, if implemented, be maintained in the future (if clients still use it). To satisfy the information needs of different clients, API operations sometimes offer different data representations in a customizable way. The customization means must be designed, implemented, documented, and taught. All possible combinations must be tested and supported as the API evolves. Hence, the provided flexibility means may increase maintenance efforts.¹

1. Also see discussions about SEMANTIC VERSIONING, API DESCRIPTION (including technical service contracts), WISH LIST, and SERVICE LEVEL AGREEMENT in Chapters 7, 8, and 9.

One could send plain, unstructured strings to be interpreted by the consumer, but in many domains, such an ad hoc approach to API design is not adequate. For instance, when integrating enterprise applications, it couples API client and provider tightly and may harm performance and auditability.

One could use object-based remoting concepts such as Common Object Request Broker Architecture (CORBA) or Java RMI (Remote Method Invocation), but remoting paradigms based on distributed objects have been reported to lead to integration solutions that are difficult to test, operate, and maintain in the long term [Hohpe 2003].²

2. Distributed objects and other forms of remote references are core concepts in the integration style “Remote Procedure Invocation” [Hohpe 2003].

How It Works

Define a dedicated vocabulary of DATA ELEMENTS for request and response messages that wraps and/or maps the relevant parts of the data in the business logic of an API implementation.

In domain-driven design (DDD) terms, such a dedicated vocabulary is called Published Language [Evans 2003]. It shields the DDD Aggregates, Entities, and Value Objects in the domain layer. With respect to the concepts in our domain model, introduced in Chapter 1, DATA ELEMENTS describe a general role for the message representation elements (also known as parameters).

DATA ELEMENTS can be flat, unstructured ATOMIC PARAMETERS or ATOMIC PARAMETER LISTS. Basic DATA ELEMENTS may form the leaves of PARAMETER TREES; more complex ones often contain an ID ELEMENT and feature a number of domain-specific attributes as additional structured or unstructured values. Single or multiple instances of these data elements that jointly comprise the application state may be exposed; if multiple instances are managed and transferred jointly, they form an element collection [Allamaraju 2010; Serbout 2021], also known as an element set.

Explicit schema definitions for the message representation elements should be provided in the API DESCRIPTION and shared with the API clients.³ Open, tool-supported formats such as JSON or XML are commonly used in these data contracts. Exemplary data instances that pass the schema validation should be provided as well. The schema can promote strong typing and validation but can also be rather generic and only weakly typed. Key-value lists are often used in generic interfaces: <ID, key1, value1, key2, value2, ..., keyn, valuen>.

3. According to our API domain model from Chapter 1, these data transfer representations (DTRs) are wire-level equivalents of program-level data transfer objects (DTOs) described by [Daigneau 2011] and [Fowler 2002].
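The difference between a typed and a generic representation can be sketched in Python as follows. The element names and the helper from_generic are hypothetical; the sketch merely illustrates that a generic key-value list defers validation to the receiver, whereas a typed structure makes the expected elements explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedCustomer:
    """Typed DATA ELEMENT: element names and types are fixed by the schema."""
    customerId: str
    phoneNumber: str


def from_generic(element_id: str, pairs: list[tuple[str, str]]) -> TypedCustomer:
    """Convert a generic <ID, key1, value1, ...> list into the typed form.

    The receiver has to check for missing keys itself because the generic
    schema does not enforce them.
    """
    values = dict(pairs)
    missing = {"phoneNumber"} - values.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return TypedCustomer(customerId=element_id, phoneNumber=values["phoneNumber"])


generic = ("c-42", [("phoneNumber", "555-0100")])
typed = from_generic(*generic)
print(typed)
```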

Figure 6.2 sketches two types of DATA ELEMENTS with exemplary attributes, placed in message representations. One is typed, one is generic.

Figure 6.2 DATA ELEMENTS can either be generic or typed, providing supplemental information optionally

The attributes of DATA ELEMENTS can be role-stereotyped, for instance, into “descriptive attributes,” “time-dependent attributes,” “life cycle state attributes,” and “operational state attributes,” according to Rebecca Wirfs-Brock in her presentation “Cultivating Your Design Heuristics” [Wirfs-Brock 2019, p. 39].

To support nesting and structuring of entities in the API operations, for instance, following a relationship from an order to the purchased product and the buying customer in an online shop, an EMBEDDED ENTITY can be included. Alternatively, a LINKED INFORMATION HOLDER might reference a separate API endpoint. An EMBEDDED ENTITY contains one or more nested DATA ELEMENTS, while a LINKED INFORMATION HOLDER contains navigable LINK ELEMENTS that point at API endpoints that provide information about relationship target(s) such as INFORMATION HOLDER RESOURCES.
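The two options can be compared in a minimal Python sketch. The order and customer structures as well as the /customers/c-42 address are hypothetical; only the contrast between nesting the related data and referencing it matters here.

```python
import json

customer = {"customerId": "c-42", "name": {"first": "Max", "last": "Mustermann"}}

# EMBEDDED ENTITY: the related customer data travels inside the order.
order_embedded = {"orderId": "o-1", "customer": customer}

# LINKED INFORMATION HOLDER: the order only carries a navigable link;
# the client has to issue a follow-up request to obtain the customer data.
order_linked = {"orderId": "o-1", "customer": {"href": "/customers/c-42"}}

embedded_size = len(json.dumps(order_embedded))
linked_size = len(json.dumps(order_linked))
print(embedded_size, linked_size)
```

In this sketch, the linked variant yields the smaller message but requires an additional request; the embedded variant is self-contained but larger.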

Variants

Two variants of this pattern are worth calling out. An Entity Element is a DATA ELEMENT that contains an identifier hinting at an object life cycle in the implementation of the Published Language of the API (hence, our terminology here is in line with the “Entity” pattern in [Evans 2003]).

A Query Parameter is a DATA ELEMENT that does not represent one or more entities owned and managed by the API implementation. Instead, it represents an expression that can be used to select a subset of such entities when exposing a RETRIEVAL OPERATION in an endpoint, for instance, an INFORMATION HOLDER RESOURCE.
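A Query Parameter might be processed as in the following Python sketch. The in-memory customer list, the city attribute, and the get_customers function are assumptions made for illustration; they do not stem from a particular API.

```python
# Hypothetical in-memory data owned by the API implementation.
CUSTOMERS = [
    {"customerId": "c-1", "city": "Zurich"},
    {"customerId": "c-2", "city": "Bern"},
    {"customerId": "c-3", "city": "Zurich"},
]


def get_customers(city=None, limit=10, offset=0):
    """Retrieval operation: the parameters select entities; they are not entities."""
    matching = [c for c in CUSTOMERS if city is None or c["city"] == city]
    return matching[offset:offset + limit]


result = get_customers(city="Zurich", limit=1, offset=1)
print(result)
```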

Example

The following excerpt from the solution-internal API of a customer relationship management (CRM) system features strongly typed DATA ELEMENTS: a structured one, name, and a flat, textual one, phoneNumber (contract notation: Microservice Domain-Specific Language (MDSL), introduced in Appendix C):

data type Customer {
  "customerId": ID,
  "name": ("first":D<string>, "last":D<string>),
  "phoneNumber":D<string>
}

endpoint type CustomerRelationshipManagementService
  exposes
      operation getCustomer
        expecting payload "customerId": ID
        delivering payload Customer

Customer is a PARAMETER TREE that combines the two data elements. The example also features an ATOMIC PARAMETER and ID ELEMENT: customerId. Note that these data representations might have been specified in a domain model first; that said, the domain model elements used for the API implementation should not be exposed directly without wrapping and/or mapping them; loose coupling of the client, interface, and implementation is desired.
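The wrapping and mapping step could look as follows in Python. This is a sketch under assumptions: CustomerEntity, its internal attributes, and the function to_published_language are hypothetical; only the structure of the resulting representation mirrors the Customer data type above.

```python
from dataclasses import asdict, dataclass


@dataclass
class CustomerEntity:
    """Provider-internal domain entity (hypothetical); never sent as is."""
    customer_id: str
    first_name: str
    last_name: str
    phone_number: str
    credit_score: int      # internal attribute that must not leak
    db_row_version: int    # persistence detail


@dataclass(frozen=True)
class Name:
    first: str
    last: str


@dataclass(frozen=True)
class Customer:
    """Representation element of the Published Language."""
    customerId: str
    name: Name
    phoneNumber: str


def to_published_language(entity: CustomerEntity) -> Customer:
    # Explicit mapping: only the attributes named here cross the API boundary.
    return Customer(
        customerId=entity.customer_id,
        name=Name(first=entity.first_name, last=entity.last_name),
        phoneNumber=entity.phone_number,
    )


entity = CustomerEntity("c-42", "Max", "Mustermann", "555-0100", 720, 7)
payload = asdict(to_published_language(entity))
print(payload)
```

Because the mapping is explicit, renaming or adding an internal attribute does not change the message representation unless the mapping function is changed as well.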

Discussion

A rich, deeply structured Published Language is expressive but also hard to secure and maintain; a simple one can be taught and understood easily but may not be able to represent the domain specifics adequately. This set of trade-offs makes API design hard; answering the data contract granularity question is nontrivial.

Reasonable compromises regarding these conflicting forces require an iterative and incremental approach to pattern selection and adoption; best practices on DDD in service design have been published and should be considered [Vernon 2013]; Appendix A summarizes some of these and adds our own insights. The use of many domain-driven DATA ELEMENTS makes APIs expressive so that clients can find and use what they need easily.

Security and data privacy can be improved by exposing as few DATA ELEMENTS as possible. Lean interfaces also promote maintainability and ease of configuration (that is, flexibility on the provider side). “Less is more” and “if in doubt, leave it out” are rules of thumb when defining secure data contracts in APIs. The “less is more” philosophy may limit expressiveness, but it promotes understandability. The entity data must be included in any security analysis and design activities, such as threat modeling, security and compliance by design, penetration testing, and compliance audits [Julisch 2011]. This is an essential point because sensitive information may be leaked otherwise.

Using the same DATA ELEMENT structures across an entire API or a set of in-house services allows for easier composition of the services. Enterprise Integration Patterns calls such an approach a “Canonical Data Model” but advises handling it with care [Hohpe 2003]. One can consider microformats [Microformats 2022] in such a standardization effort.

If many related/nested DATA ELEMENTS are defined, some of which are optional, processing becomes complicated; performance and testability are impaired. While client-side flexibility is high initially, things get difficult when the rich API starts to change over time.

Organizational patterns (and antipatterns), such as the “not invented here” syndrome and “fiefdoms” or “power games,” often lead to overengineered, unnecessarily complex abstractions. Simply exposing such abstractions via a new API (without putting “Anti-Corruption Layers” [Evans 2003] in place that hide complexity) is bound to fail in the long run. Project schedules and budgets are at risk in such cases.

Related Patterns

A DATA ELEMENT can contain instances of the “Value Object” pattern in DDD [Evans 2003] in transit; DDD “Entity” is represented as a variant of our pattern. That said, one should be aware that instances of DDD patterns should not be translated into API designs one-to-one. While an Anti-Corruption Layer can protect the downstream participant in a relation (here, API client), the upstream (here, API provider) should design its Published Language in such a way that undesired coupling is minimized [Vernon 2013].

It might make sense to have different representations for the same entity depending on the context it is used in. For example, a customer is a widespread business concept modeled as an entity in many domain models; typically, many of its attributes are relevant only in certain use cases (for example, account information for the payment domain). In that case, a WISH LIST can let clients decide what information they want. In HTTP resource APIs, content negotiation and custom media types provide flexible realization options for multipurpose representations. The “Media Type Negotiation” pattern in Service Design Patterns is related [Daigneau 2011].

Core J2EE Patterns [Alur 2013] presents a “Data Transfer Object” pattern for use within an application boundary (for example, data transfer between tiers). Patterns of Enterprise Application Architecture [Fowler 2002] touches on many aspects of remote API design, such as Remote Facades and DTOs. Similarly, Eric Evans touches on functional API aspects in DDD patterns such as “Bounded Contexts” and “Aggregates” [Evans 2003]. Instances of these patterns contain multiple entities; hence, they can be used to assemble DATA ELEMENTS into coarser-grained units.

The general data modeling patterns in [Hay 1996] cover data representations but focus on data storage and presentation rather than data transport (therefore, the discussed forces differ from ours). Domain-specific modeling archetypes for enterprise information systems also can be found in the literature [Arlow 2004].

The “Cloud Adoption Patterns” Web site [Brown 2021] has a process pattern called “Identify Entities and Aggregates.”

More Information

Chapter 3 in the RESTful Web Services Cookbook gives representation design advice in the context of HTTP; for instance, recipe 3.4 discusses how to choose a representation format and a media type (with Atom being one of the options) [Allamaraju 2010].

Design Practice Reference features DDD and related agile practices eligible in API and data contract design [Zimmermann 2021b].

Context Mapper clarifies the relationships between strategic DDD patterns in its domain-specific language (DSL) and tools [Kapferer 2021].

Pattern: METADATA ELEMENT

When and Why to Apply

The request and response message representations of an API operation have been defined using one or more of the basic structure patterns ATOMIC PARAMETER, ATOMIC PARAMETER LIST, PARAMETER TREE, and PARAMETER FOREST. To process these representations accurately and efficiently, message receivers require their name and type, but also appreciate more information about their meaning and content.

How can messages be enriched with additional information so that receivers can interpret the message content correctly without having to hardcode assumptions about the data semantics?

In addition to the quality concerns discussed at the beginning of this chapter, the impact on interoperability, coupling, and ease of use versus runtime efficiency has to be considered.

  • Interoperability: If data travels with corresponding type, version, and author information, the receiver can use this extra information to resolve syntactic and semantic ambiguities. For example, one representation element might contain a monetary value, and an extra element might specify the currency of this value. The fact that an optional element is not present or that a mandatory element is not set to a meaningful value can also be indicated by extra information.

  • Coupling: If runtime data is accompanied by additional explanatory data, it becomes easier to interpret and process; the shared knowledge between consumer and provider is made explicit and shifted from the design-time API contract to the runtime message content; this may add to the coupling of the communication parties but may also decrease it. Low coupling eases long-term maintenance.

  • Ease of use versus runtime efficiency: Extra representation elements in the payload may help the message recipient to understand the message content and process it efficiently. However, such elements increase the message size; they require processing and transport capacity and have an inherent complexity. The API test cases have to cover their creation and usage. A client that hardcodes assumptions about data semantics (including their meaning and any restrictions that might apply) might be easier to write, but it will be harder to maintain over time as requirements change and the API evolves.
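The monetary example from the interoperability force can be sketched in Python. The element names amountInMinorUnits and currency as well as the lookup table are assumptions; the point is that the receiver interprets the value based on metadata that travels with it rather than on a hardcoded assumption.

```python
from decimal import Decimal

# Hypothetical lookup table; a real system would rely on ISO 4217 data.
MINOR_UNIT_DIGITS = {"CHF": 2, "JPY": 0}


def interpret_amount(message: dict) -> tuple[Decimal, str]:
    """Use the currency metadata to turn a raw integer into a monetary value."""
    currency = message.get("currency")
    if currency is None:
        # Without the metadata, the receiver would have to guess.
        raise ValueError("currency metadata missing; amount is ambiguous")
    digits = MINOR_UNIT_DIGITS[currency]
    value = Decimal(message["amountInMinorUnits"]) / (Decimal(10) ** digits)
    return value, currency


chf = interpret_amount({"amountInMinorUnits": 1999, "currency": "CHF"})
jpy = interpret_amount({"amountInMinorUnits": 1999, "currency": "JPY"})
print(chf, jpy)
```

The same raw value is interpreted differently depending on the accompanying metadata; a client that assumed two decimal places for every currency would misread the second message.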

The extra data that explains other data can be provided solely in the API DESCRIPTION. Such static and explicit metadata documentation often is sufficient; however, it limits the ability of the message receiver to make metadata-based decisions dynamically at runtime.

A second API endpoint could be introduced to inquire about metadata separately. However, such an approach bloats the API and introduces additional documentation/training, testing, and maintenance effort.

How It Works

Introduce one or more METADATA ELEMENTS to explain and enhance the other representation elements that appear in request and response messages. Populate the values of the METADATA ELEMENTS thoroughly and consistently; use them to steer interoperable, efficient message consumption and processing.

Metadata and metadata modeling are mature and well-established concepts in many fields in computer science, for example, databases and programming languages under terms such as runtime type information, reflection, and introspection. In the real world, libraries and document archives apply metadata extensively.

Many instances of this pattern are simple and scalar ATOMIC PARAMETERS with a name and a type (such as Boolean, integer, or string), but metadata can also be aggregated and composed into PARAMETER TREE hierarchies. A flexible but somewhat error-prone solution is to represent METADATA ELEMENTS as pairs of key-value strings that are then parsed and typecast at the message recipient.
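The following Python sketch illustrates why the key-value string approach is error-prone and how a recipient might guard against invalid values. The metadata keys and the converter table are hypothetical.

```python
def parse_bool(raw: str) -> bool:
    # Note: bool("false") would be True in Python, a typical typecasting pitfall.
    if raw.lower() not in ("true", "false"):
        raise ValueError(f"not a Boolean: {raw!r}")
    return raw.lower() == "true"


# Hypothetical metadata keys and their expected types.
CONVERTERS = {"limit": int, "offset": int, "complete": parse_bool}


def parse_metadata(pairs: dict[str, str]) -> dict[str, object]:
    """Typecast string-valued METADATA ELEMENTS at the message recipient."""
    parsed = {}
    for key, raw in pairs.items():
        convert = CONVERTERS.get(key, str)   # unknown keys remain strings
        try:
            parsed[key] = convert(raw)
        except ValueError as exc:
            raise ValueError(f"metadata element {key!r} has invalid value {raw!r}") from exc
    return parsed


metadata = parse_metadata({"limit": "1", "offset": "0", "complete": "false"})
print(metadata)
```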

Figure 6.3 shows the pattern in context. METADATA ELEMENTS become part of the API DESCRIPTION. They have to be kept current while an API evolves, both on the specification (schema) level and on the content (instance) level. The metadata currentness (or freshness) should be specified to balance usefulness for the client with effort to compute and keep up to date. Some metadata, such as the original creator of a document, might be immutable. For some metadata, for example, list counters, it might make sense to define an expiration date. Interoperability might suffer otherwise, and semantic mismatches might remain undetected.

Figure 6.3 Usage of METADATA ELEMENTS (data about data) in context

Variants

Three variants of this pattern exist, representing particular types and usage of metadata we observed in APIs:

  • Control Metadata Elements such as identifiers, flags, filters, hypermedia controls, links, and security information (including API KEYS, access control lists, role credentials, checksums, and message digests) steer the processing. Query parameters can be seen as a special case of control metadata when driving the behavior of the query engine on the provider side. Control metadata often comes in the form of Boolean, string, or numeric parameters.

  • Aggregated Metadata Elements provide semantic analyses or summaries of other representation elements. Calculations such as counters of PAGINATION units qualify as instances of this variant. Statistical information about entity elements in the Published Language, such as insurance claims by customer or product sales per quarter, also qualifies as aggregated metadata.

  • Provenance Metadata Elements unveil the origin of data. In our context of API design, examples include owner, message/request IDs, creation date and other timestamps, location information, version numbers, and other context information.

These variants are visualized in Figure 6.4. Other forms of METADATA ELEMENTS exist (covered later).

Figure 6.4 METADATA ELEMENT variants

Each METADATA ELEMENT can realize more than one of the variants. For instance, a region code might give provenance information but also be used to control data processing. Such data might act as a filter in a digital rights management scenario or be used in “Context-Based Routers” in enterprise application integration [Hohpe 2003].

In information management, three main types of metadata are used to describe any type of resource (such as a book or multimedia content) [Zeng 2015]: Descriptive metadata has purposes such as discovery and identification; it can include elements such as title, abstract, author, and keywords. Structural metadata indicates how compound information elements are put together, for example, how pages (or sections) are ordered to form chapters. Administrative metadata provides information to help manage a resource, such as when and how it was created, what its file type and other technical properties are, and who can access it. Two common subsets of administrative data are rights management metadata, including intellectual property rights, and preservation metadata, which contains information used to archive a resource.

Example

The following example from the Lakeside Mutual case study shows all three metadata types: provenance (Content-Type, Date), control (the API KEY b318ad736c6c844b in the header), and aggregated metadata (size).

curl -X GET --header 'Authorization: Bearer b318ad736c6c844b' \
  --verbose 'http://localhost:8110/customers?limit=1'
> GET /customers?limit=1 HTTP/1.1
> Host: localhost:8110
> User-Agent: curl/7.77.0
> Accept: */*
> Authorization: Bearer b318ad736c6c844b
>
< HTTP/1.1 200
< ETag: "0fcf9424c411d523774dc45cc974190ff"
< X-Content-Type-Options: nosniff
< X-XSS-Protection: 1; mode=block
< Content-Type: application/hal+json
< Content-Length: 877
< Date: Fri, 19 Nov 2021 15:10:41 GMT
<
{
  "filter": "",
  "limit": 1,
  "offset": 0,
  "size": 50,
  "customers": [ {
    
  } ],
  "_links": {
    "self": {
      "href": "/customers?filter=&limit=1&offset=0"
    },
    "next": {
      "href": "/customers?filter=&limit=1&offset=1"
    }
  }
}

Most METADATA ELEMENTS are ATOMIC PARAMETERS in this example. The JSON object _links forms a simple PARAMETER TREE that bundles two ATOMIC PARAMETERS serving as LINK ELEMENTS.
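A client could separate the metadata in this response by variant, as the following Python sketch suggests. The header and body values are copied from the example above (with the customer entry left empty as in the listing); treating filter, limit, and offset as control metadata is our reading of the variant definitions rather than something the response itself declares.

```python
import json

# Excerpt of the response headers and body shown above.
headers = {
    "Content-Type": "application/hal+json",
    "Date": "Fri, 19 Nov 2021 15:10:41 GMT",
}
body = json.loads("""
{
  "filter": "", "limit": 1, "offset": 0, "size": 50,
  "customers": [ { } ],
  "_links": {
    "self": { "href": "/customers?filter=&limit=1&offset=0" },
    "next": { "href": "/customers?filter=&limit=1&offset=1" }
  }
}
""")

provenance = {key: headers[key] for key in ("Content-Type", "Date")}
control = {key: body[key] for key in ("filter", "limit", "offset")}  # assumption
aggregated = {"size": body["size"]}
links = {relation: target["href"] for relation, target in body["_links"].items()}

# The aggregated and control metadata together tell the client whether
# more results are available without inspecting the customer data itself.
has_more = control["offset"] + control["limit"] < aggregated["size"]
print(provenance, control, aggregated, links, has_more)
```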

Discussion

Accuracy usually improves when the pattern is applied (assuming a correct and consistent implementation). Coupling decreases on the data level but is still there on the metadata level. Ease of use can be achieved.

Processing efficiency may suffer due to the increased message sizes. Maintainability, security, and interoperability may improve but also suffer depending on the amount, structure, and meaning of the metadata. Excessive use of such METADATA ELEMENTS risks bloating an interface and making it more challenging to maintain and evolve (for example, in terms of SEMANTIC VERSIONING).

Defined, populated, exchanged, and interpreted wisely, METADATA ELEMENTS can streamline receiver-side processing (by avoiding unnecessary work), improve computation results and their display (by steering/guiding the application frontend and the human user), and contribute to an end-to-end security model that protects the communication participants from external and internal threats. Security metadata may serve as input to encryption/decryption algorithms, support integrity checks, and so on.

Metadata can reside and be defined in several of the logical layers defined in Patterns of Enterprise Application Architecture [Fowler 2002]. PAGINATION, for instance, is a presentation layer or service layer concern; the business logic layer of the provider-side API implementation does not care about it. The same holds for caching of previous responses. Access control metadata is typically also created and used on the presentation or service layer. Data provenance and validity information such as video/audio owners and intellectual property rights in media streaming APIs and certain types of control metadata belong to the business logic layer. Query statistics and aggregations, on the other hand, can be seen as data access layer (or persistence layer) information. If lower-layer metadata is already present, API designers must decide whether to pass this metadata on or to convert and wrap it (tradeoff: effort versus coupling).

The client should depend on metadata only when it is necessary to satisfy mandatory functional and nonfunctional requirements. In all other cases, the available metadata should be treated as an optional convenience feature to make API usage more efficient; the API and its clients should still function if the metadata is not present. For example, control metadata such as PAGINATION links and related page counts will make the client depend on them once introduced. Some aggregation metadata, such as the size of embedded entity collections, can alternatively be calculated on the message receiver side rather than on the provider side.
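A client that treats aggregated metadata as an optional convenience might be written as in this minimal Python sketch. The order structure and the itemCount element are hypothetical.

```python
def item_count(order: dict) -> int:
    """Prefer the provider-supplied counter; fall back to counting locally."""
    return order.get("itemCount", len(order["items"]))


with_metadata = {"orderId": "o-1", "items": ["a", "b", "c"], "itemCount": 3}
without_metadata = {"orderId": "o-2", "items": ["a", "b"]}

print(item_count(with_metadata), item_count(without_metadata))
```

The client keeps working when the provider omits the counter, so the metadata remains a convenience rather than a dependency.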

An alternative to adding metadata to request or response messages is to foresee a dedicated operation returning metadata about a particular API element. In such a design, an ID ELEMENT or LINK ELEMENT identifies the data that is supplemented with metadata; the dedicated operation takes the form of a RETRIEVAL OPERATION. An even more advanced approach is to define dedicated Metadata Information Holders as special types of MASTER DATA HOLDER (or REFERENCE DATA HOLDER if immutable), possibly referenced indirectly via LINK LOOKUP RESOURCES.

ETags in HTTP messages, defined in RFC 7232 [Fielding 2014a], can be seen as control and provenance metadata; expiration dates of one-time-only passwords qualify as metadata too. The CONDITIONAL REQUEST pattern explains and elaborates on ETags in Chapter 7, “Refine Message Design for Quality.”

Related Patterns

A METADATA ELEMENT is a specialization of the more abstract concept of a DATA ELEMENT; not all metadata affects business logic and domain model in the API implementation (as explained earlier). ID ELEMENTS sometimes are accompanied by additional METADATA ELEMENTS (for instance, to classify the identifier/link or to define an expiration time). Metadata often comes in the syntactic form of ATOMIC PARAMETERS. Several related instances of the pattern can be transported as ATOMIC PARAMETER LISTS or be included in PARAMETER TREES.

The PAGINATION pattern relies on metadata to inform the client about the current, previous, and next result pages; the total number of pages/results; and so on. Hypermedia controls such as typed link relations contain metadata as well (as explained later in the LINK ELEMENT pattern).

A “Context Object” to which “Interceptors” can add their information is presented in several pattern languages, including Remoting Patterns [Voelter 2004]. Our CONTEXT REPRESENTATION pattern suggests defining an API-wide, technology-independent standard location and structure for metadata in general and control metadata in particular.

The “Format Indicator” and “Message Expiration” information, both introduced in Enterprise Integration Patterns [Hohpe 2003], rely on metadata. The same holds for control and provenance information such as “message id” and “message date” in messaging APIs such as Jakarta Messaging (formerly JMS). Other enterprise integration patterns, for instance, “Correlation Identifier” and “Routing Slip,” can be seen as special METADATA ELEMENTS. A Correlation Identifier holds control metadata primarily but also shares provenance metadata (because it identifies a previous request message). The same holds for “Return Address” (because it points at an endpoint or channel). “Message Filters,” “Message Selectors,” and “Aggregators” often operate on control and provenance metadata.

More Information

For a general introduction to types of metadata and eligible standards, refer to the following sources:

  • The Wikipedia page on metadata [Wikipedia 2022c]; Wikipedia also lists numerous metadata standards focusing on certain areas, for instance, document identification (DOIs) and security assertions (SAML) [Wikipedia 2022d].

  • Understanding Metadata: What Is Metadata, and What Is It For? [Riley 2017].

  • Dublin Core [DCMI 2020] is a widely adopted metadata standard for networked resources such as books or digital multimedia content.

The information management literature covers metadata in depth. Two examples are “A Gentle Introduction to Metadata” [Good 2002] and Introduction to Metadata [Baca 2016]. Murtha Baca distinguishes five types of metadata [Baca 2016]:

  • Administrative: metadata used in managing and administering collections and information resources

  • Descriptive: metadata used to identify, authenticate, and describe collections and related trusted information resources

  • Preservation: metadata related to the preservation management of collections and information resources

  • Technical: metadata related to how a system functions or metadata behaves

  • Use: metadata related to the level and type of use of collections and information resources

These metadata types are also summarized in the tutorial Metadata Basics [Zeng 2015].

Our Control Metadata Element variant corresponds to the technical type, and use information often comes as Aggregate Metadata Element. Provenance Metadata Elements often have an administrative, descriptive, or preservation nature.

The Zalando RESTful API and Event Scheme Guidelines [Zalando 2021] point out the importance of OpenAPI metadata. A blog post by Steve Klabnik covers metadata in resource representations [Klabnik 2011].

Pattern: ID ELEMENT

When and Why to Apply

A domain model representing the core concepts of an application, software-intensive system, or software ecosystem has been designed and implemented. Remote access to the domain model implementation is under construction (for instance, as HTTP resources, Web service operations, or gRPC service methods). Architectural principles such as loose coupling, independent deployability, and isolation (of system parts and data) might have been established.

The domain model consists of multiple related elements that have different life cycles and semantics. Its currently chosen decomposition into remotely accessible API endpoints (for instance, exposed by a set of microservices) suggests that these related entities should be split up into several API endpoints and operations (for instance, HTTP resources exposing uniform POST-GET-PUT-PATCH-DELETE interfaces, Web service port types with operations, or gRPC services and methods). API clients want to be able to follow the relationships within and across API boundaries to satisfy their information and integration needs. To do so, both design-time artifacts and runtime instances of such artifacts have to be pointed at without ambiguities or mistakes in names.

How can API elements be distinguished from each other at design time and at runtime?

API elements requiring identification include endpoints, operations, and representation elements in request and response messages. They may or may not have been designed with DDD:

When applying domain-driven design, how can elements of the Published Language be identified?

The following nonfunctional requirements have to be satisfied when addressing these identification problems:

  • Effort versus stability: In many APIs, plain character strings are used as logical names. Such local identifiers are easy to create but may become ambiguous when used outside their original context (for instance, when clients work with multiple APIs). They might have to be changed in that case. On the contrary, global identifiers are designed to last longer but require some address space coordination and maintenance. In both cases, the namespace should be designed with care and purpose. Changing requirements might make it necessary to rename elements, and API versions might become incompatible with prior versions. In such cases, certain names may no longer be unique and may therefore cause conflicts.

  • Readability for machines and humans: Humans who work with identifiers include developers, system administrators, and system and process assurance auditors. Long, logically structured, and/or self-explanatory names are more accessible for humans than are short, encrypted, and/or encoded ones. However, humans often do not want to read identifiers in their entirety; for instance, the primary audience for query parameters and session identifiers is the API implementation and supporting infrastructure, not an end user of a Web application.

  • Security (confidentiality): In many application contexts, it should be impossible, or at least extremely hard, to guess instance identifiers; however, the effort to create unique identifiers that cannot be spoofed must be justified. Testers, support staff, and other stakeholders of an API DESCRIPTION may want to be able to understand, and possibly even memorize, identifiers even if they qualify as sensitive information that has to be protected.

One could always embed all related payload data as EMBEDDED ENTITIES, thus avoiding the need to introduce identifiers referencing information that is not included. But this simple solution wastes processing and communication resources if information is transferred that receivers do not require. The construction of a complex, partially redundant payload can also be error-prone.

How It Works

Introduce a special type of Data Element, a unique ID ELEMENT, to identify API endpoints, operations, and message representation elements that have to be distinguished from each other. Use these ID ELEMENTS consistently throughout API description and implementation. Decide whether an ID ELEMENT is globally unique or valid only within the context of a particular API.

Decide on the naming scheme to be used in the API, and document it in the API DESCRIPTION. Following are popular approaches to unique identification:

  • Numeric universally unique identifiers (UUIDs) [Leach 2005] supply ID ELEMENTS in many distributed systems. Often, 128-bit integers serve as UUIDs. Many standard libraries of programming languages can generate them. In some sources, UUIDs are also called globally unique identifiers (GUIDs).

  • Some cloud providers generate human-readable strings to identify service instances uniquely (discussed shortly); such an approach is also feasible for ID ELEMENTS appearing in request and response messages.

  • The use of surrogate key identifiers assigned by lower layers in the overall architecture (for example, operating system, database, or messaging system) is another approach. Primary keys assigned by databases fall in this category.

Instances of the ID ELEMENT pattern often travel as ATOMIC PARAMETERS; they may also become entries in ATOMIC PARAMETER LISTS or leaves of PARAMETER TREES. The API DESCRIPTION specifies the scope of the ID ELEMENT (locally vs. globally unique?) and the lifetime of the uniqueness guarantee. Figure 6.5 shows that ID ELEMENTS are special types of DATA ELEMENTS. URIs and URNs as two types of human-readable strings appear in the figure.

Figure 6.5 ID ELEMENTS come in different forms: UUID, URI, URN, Surrogate Key

Note that identifiers can be made both human- and machine-readable. If identifiers have to be entered by users at times, choose a scheme that creates short names that are easy to pronounce. For instance, see the names for applications created by the cloud provider Heroku (an example is peaceful-reaches-47689). Otherwise, go with numeric UUIDs. The blog site Medium, for instance, uses hybrid URIs as page identifiers; an example of a story URI is https://medium.com/olzzio/seven-microservices-tenets-e97d6b0990a4.
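The two options can be sketched in Python as follows. The word lists and function names are hypothetical and chosen for illustration only; this is not how Heroku or Medium generate their names. Neither function checks for collisions, so a provider would still have to verify uniqueness of the readable names before handing them out.

```python
import random
import uuid

# Hypothetical word lists; a real scheme would use much larger vocabularies.
ADJECTIVES = ["peaceful", "quiet", "brave", "gentle"]
NOUNS = ["reaches", "river", "meadow", "harbor"]


def readable_id(rng: random.Random) -> str:
    """Build a short, pronounceable name of the form adjective-noun-number."""
    # The random module is not meant for security-sensitive identifiers.
    number = rng.randrange(10000, 100000)  # always five digits
    return f"{rng.choice(ADJECTIVES)}-{rng.choice(NOUNS)}-{number}"


def numeric_id() -> str:
    """Build a random (version 4) UUID in its canonical 36-character form."""
    return str(uuid.uuid4())
```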

If mandated by the security requirements, make sure that any exposed ID ELEMENT—no matter whether it is a UUID, a human-readable string, or a surrogate key coming from the API implementation—is random and unpredictable and that access to the identified elements is protected by an appropriate authorization mechanism, as recommended by OWASP to prevent broken object-level authorization [Yalon 2019].

URIs are globally unique, for instance, but can be reassigned over time (and then link to unexpected targets, for instance, when used by older clients or when working with restored backup data). Sometimes, Uniform Resource Names (URNs) are preferred over URIs, using a hierarchical prefix:firstname:lastname syntax according to RFC 2141 [Moats 1997]:

<URN> ::= "urn:" <NID> ":" <NSS>

<NID> is the namespace identifier, and <NSS> is the namespace-specific string. Examples of URNs can be found on the Wikipedia page about URNs [Wikipedia 2022e].
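To illustrate the grammar, here is a minimal Python sketch that splits a URN into its two parts. The names `Urn` and `parse_urn` are illustrative assumptions; the function deliberately does not validate the character sets that the RFC allows in each part.

```python
from typing import NamedTuple


class Urn(NamedTuple):
    nid: str  # namespace identifier
    nss: str  # namespace-specific string


def parse_urn(text: str) -> Urn:
    """Split 'urn:<NID>:<NSS>' into NID and NSS (no character-set checks)."""
    scheme, sep1, rest = text.partition(":")
    nid, sep2, nss = rest.partition(":")
    # The leading "urn" is case-insensitive according to RFC 2141.
    if scheme.lower() != "urn" or not sep1 or not sep2 or not nid or not nss:
        raise ValueError(f"not a URN: {text!r}")
    return Urn(nid, nss)
```

Only the first two colons act as separators, so a namespace-specific string may itself contain colons (as in urn:ietf:rfc:2141).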

Example

PAGINATION cursors in the Twitter REST API [Twitter 2022] use ID ELEMENTS, for instance, next_cursor:

{
    "data": [...],
    "next_cursor": "c-3yvu1pzhd3i7",
    "request": {...}
}

The API implementation added an autogenerated identifier for the next_cursor in this HTTP response snippet. This identifier must be guaranteed to be unique at least until the user session expires. Also, the association between this identifier and the next cursor position for this user session must be stored so that the correct content is returned when the user requests the next_cursor with this identifier via HTTP GET. This example also shows that the scope of the identifier can be bound not only by space but also by time.
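A provider-side sketch of such time-bound identifiers might look as follows in Python. It is not based on the actual Twitter implementation; the class name `CursorStore`, the in-memory dictionary, and the `c-` prefix are assumptions made for illustration.

```python
import secrets
import time
from typing import Optional


class CursorStore:
    """Maps opaque cursor identifiers to result offsets for a limited time."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[int, float]] = {}

    def issue(self, offset: int, now: Optional[float] = None) -> str:
        """Create an unpredictable identifier for the given cursor position."""
        now = time.monotonic() if now is None else now
        cursor_id = "c-" + secrets.token_urlsafe(9)
        self._entries[cursor_id] = (offset, now + self._ttl)
        return cursor_id

    def resolve(self, cursor_id: str, now: Optional[float] = None) -> Optional[int]:
        """Return the stored offset, or None if unknown or expired."""
        now = time.monotonic() if now is None else now
        entry = self._entries.get(cursor_id)
        if entry is None or now >= entry[1]:
            self._entries.pop(cursor_id, None)  # forget expired identifiers
            return None
        return entry[0]
```

The `now` parameter exists only to make the expiration behavior easy to test; production code would rely on the clock.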

Discussion

ID ELEMENTS such as UUIDs and URNs strike a good balance: they are short and easy to process yet expressive enough to identify members of a large entity population and to guarantee secure and reliable uniqueness in distributed systems (if constructed and managed correctly). The implementation of the ID generation algorithm determines how accurate they are.

Local identifiers are straightforward to create. Plain string identifiers are easy for humans to process and compare, for example, when debugging. UUIDs are hard to remember and process manually but still are easier to handle than hashed or generated content such as access tokens that may be hundreds of characters long. Using plain and primitive string literals as identifiers is usually not future-proof; systems and system integrations come, change, and go over time. The less expressive names are, the more likely it is that similar or identical names are used elsewhere.

A simplistic approach would be to use auto-incrementing numbers such as sid001, sid002, and so on. But there are several problems with this. Besides leaking information (which introduces security threats, discussed later), it is unnecessarily hard to keep these numbers unique in distributed settings.

Ideally, all identifiers of a certain kind spread across a distributed system should share the same structure or naming scheme; end-to-end monitoring and event correlation during root cause analyses in incident management are simplified this way. Still, sometimes it is preferable (or unavoidable) to switch the scheme for different entities (for instance, when legacy system constraints come into play). This is an instance of a common conflict: flexibility versus simplicity.

UUIDs alone may not be suitable in all cases. UUID generation is implementation-dependent and varies between libraries and programming languages. Although UUIDs usually are 128 bits long (according to RFC 4122), some implementations follow a somewhat predictable pattern, making it possible for brute-force attackers to guess them. It depends on the project context and requirements whether such “guessability” is a problem. ID ELEMENTS must be included in any security analysis and design activities such as threat modeling, security and compliance by design, penetration testing, and compliance audits [Julisch 2011].
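As one illustration of how generation strategies differ, the Python standard library offers both time-based and random UUIDs. The sketch below merely shows how to tell them apart and names one alternative for identifiers that must be hard to guess; whether any of these is adequate depends on the security analysis of the project at hand.

```python
import secrets
import uuid

# Version 1 UUIDs are derived from a timestamp and a node identifier,
# so consecutive values share recognizable parts.
time_based = uuid.uuid1()

# Version 4 UUIDs are built from random bits.
random_based = uuid.uuid4()

# The version field records which algorithm produced the identifier.
versions = (time_based.version, random_based.version)

# If an identifier doubles as a secret, a token drawn explicitly from a
# cryptographically strong source is an option (16 random bytes here).
token = secrets.token_urlsafe(16)
```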

When multiple systems and components are integrated to realize the API, it is hard to guarantee the uniqueness of surrogate keys from lower logical layers (such as the database implementation) that become API-level ID ELEMENTS. Security concerns also arise. Furthermore, the database keys of the corresponding entities are not allowed to change in this case, even when recovering the database from a backup. Implementation-level surrogate keys couple every consumer tightly to the database.

Related Patterns

An ID ELEMENT can travel as an ATOMIC PARAMETER and be contained in PARAMETER TREES. API KEYS and VERSION IDENTIFIERS can be seen as particular kinds of identifiers. MASTER DATA HOLDERS often require robust identification schemes because of their longevity; OPERATIONAL DATA HOLDERS typically are identified uniquely as well. The data elements returned by REFERENCE DATA HOLDERS may serve as ID ELEMENTS, for instance, zip codes identifying cities (or parts of them). LINK LOOKUP RESOURCES may expect ID ELEMENTS in requests and deliver LINK ELEMENTS in responses; DATA TRANSFER RESOURCES use locally or globally unique ID ELEMENTS to define the transfer units or storage locations; examples of such design can be found in cloud storage offerings such as AWS Simple Storage Service (S3) with its URI-identified buckets.

Local identifiers are not sufficient to implement REST fully (up to maturity level 3). If plain or structured global identifiers turn out to be insufficient, one can switch to using absolute URIs, as described in the LINK ELEMENT pattern. LINK ELEMENTS make remote reference to API elements not only globally unique but also network accessible; they often are used to realize LINKED INFORMATION HOLDERS.

“Correlation Identifier,” “Return Address,” and the keys used in the “Claim Check” and “Format Indicator” patterns [Hohpe 2003] are related. Applying these patterns, which have a different usage context, also requires creating unique identifiers.

More Information

“Quick Guide to GUIDs” [GUID 2022] provides a deeper discussion of GUIDs, including their pros and cons.

The distributed systems literature discusses general naming, identification, and addressing approaches (for instance, [Tanenbaum 2007]). RFC 4122 [Leach 2005] describes the basic algorithms for UUID generation, including variants based on random numbers. XML namespaces and Java package names are hierarchical, globally unique identification concepts [Zimmermann 2003].

Pattern: LINK ELEMENT

When and Why to Apply

A domain model consists of multiple related elements with varying life cycles and semantics. The currently chosen wrapping and mapping of this model in the API suggests that these related entities should be exposed separately.

API clients want to follow element relationships and call additional API operations to satisfy their overall information and integration needs. Following a relationship, for instance, can define the next processing step offered by a PROCESSING RESOURCE or provide more details about the content of an INFORMATION HOLDER RESOURCE that appears in a collection or an overview report. The address where this next processing step can be invoked must be specified somewhere; a mere ID ELEMENT is not sufficient.4

4. Such pointers are required to implement the REST principle of Hypertext as the Engine of Application State (HATEOAS) [Allamaraju 2010] with “Hypermedia Controls” [Webber 2010; Amundsen 2011]. PAGINATION of query response results delivered by RETRIEVAL OPERATIONS requires such control links too.

How can API endpoints and operations be referenced in request and response message payloads so that they can be called remotely?

More specifically:

How can globally unique, network-accessible pointers to API endpoints and their operations be included in request and response messages? How can these pointers be used to allow clients to drive provider-side state transitions and operation invocation sequencing?

The requirements here are similar to those for the sibling pattern ID ELEMENT; endpoint and operation identification should be unique, easy to create and read, stable, and secure. The remoting context of this pattern makes it necessary to deal with broken links and network failures.

One could use simple ID ELEMENTS to identify related remote resources/entities, but additional processing is required to turn these identifiers into network addresses on the Web. ID ELEMENTS are managed in the context of the API endpoint implementation that assigns them. For local ID ELEMENTS to be used as pointers to other API endpoints, they would have to be combined with the unique network address of the endpoint.
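The additional processing step can be sketched in a few lines of Python. The function name `to_link` is hypothetical, and the sketch assumes that the client already knows the base address of the endpoint that assigned the identifier.

```python
from urllib.parse import quote, urljoin


def to_link(endpoint_base: str, local_id: str) -> str:
    """Combine an endpoint address and a local ID into an absolute URI."""
    # A trailing slash makes urljoin append to the path instead of
    # replacing its last segment.
    base = endpoint_base if endpoint_base.endswith("/") else endpoint_base + "/"
    # Percent-encode the identifier so that it stays a single path segment.
    return urljoin(base, quote(local_id, safe=""))
```

Every client has to repeat this step, and has to learn the base address from somewhere, which is the coupling that a network-accessible pointer avoids.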

How It Works

Include a special type of ID ELEMENT, a LINK ELEMENT, in request or response messages. Let these LINK ELEMENTS act as human- and machine-readable, network-accessible pointers to other endpoints and operations. Optionally, let additional METADATA ELEMENTS annotate and explain the nature of the relationship.

When realizing HTTP resource APIs on REST maturity level 3, add metadata as needed to support hypermedia controls, for instance, the HTTP verb and MIME type that is supported (and expected) by the link target resource.

The instances of the LINK ELEMENT pattern may travel as ATOMIC PARAMETERS; they may also become entries in ATOMIC PARAMETER LISTS or leaves of PARAMETER TREES. Figure 6.6 illustrates the solution on a conceptual level, featuring HTTP URI as a prominent technology-level installment.

Figure 6.6 LINK ELEMENT solution

The links should contain not only an address (such as a URL in RESTful HTTP) but also information about the semantics and consequences of following the link in a subsequent API call:

  • Does the LINK ELEMENT indicate a next possible or required processing step, for instance, in a long-running business process?

  • Does it allow undoing and/or compensating a previous action?

  • Does the link point at the next slice of a result set (such as a page in PAGINATION)?

  • Does the link provide access to detailed information about a particular item?

  • Or does it allow switching to “something completely different”?5

5. https://en.wikipedia.org/wiki/And_Now_for_Something_Completely_Different.

Answering the preceding questions, semantic link types typically include the following:

  1. Next: Next processing step when an incremental service type (for example, a PROCESSING RESOURCE) is used.

  2. Undo: Undo or compensation operation in the current context.

  3. More: The address to retrieve more results. This can also be seen as making a horizontal move in result data.

  4. Details: Further information about the link source. Following this link performs a vertical move in the data.
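A link that carries both an address and its semantics can be modeled as a small data structure. The following Python sketch is one possible shape, not a standardized format; the class name `LinkElement` and the field names `method` and `media_type` are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class LinkElement:
    rel: str                          # semantic link type, e.g. "next" or "details"
    href: str                         # address of the link target
    method: str = "GET"               # HTTP verb expected by the target
    media_type: Optional[str] = None  # MIME type supported by the target

    def to_json_dict(self) -> dict:
        """Return a JSON-ready dictionary that omits unset metadata."""
        return {key: value for key, value in asdict(self).items() if value is not None}
```

For instance, `LinkElement(rel="details", href="/customers/bunlo9vk5f")` describes a vertical move from an overview entry to the full customer record.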

Some link types have been registered and therefore standardized somewhat. See, for instance, the Internet Assigned Numbers Authority’s collection of link relation types [IANA 2020] and Design and Build Great Web APIs: Robust, Reliable, and Resilient by Mike Amundsen [Amundsen 2020].

Application-Level Profile Semantics (ALPS) [Amundsen 2021] can be used to define Web links. Siren [Swiber 2017], another hypermedia specification for representing entities, implements the pattern in JSON. Here is the example given in the Siren repository:

{
    "links":[
        {
            "rel":[
                "self"
            ],
            "href":"http://api.x.io/orders/42"
        }
    ]
}

When using WSDL/SOAP, WS-Addressing [W3C 2004] can be used to define links; when using XML and not JSON, XLink [W3C 2010] is a solution alternative on the platform-specific level.

Example

A paginated response from the Lakeside Mutual Customer Core API that contains many LINK ELEMENTS is shown in the following listing:

curl -X GET --header 'Authorization: Bearer b318ad736c6c844b' \
  'http://localhost:8110/customers?limit=1'
{
  "filter": "",
  "limit": 1,
  "offset": 0,
  "size": 50,
  "customers": [{
    
    "_links": {
      "self": {
        "href": "/customers/bunlo9vk5f"
      },
      "address.change": {
        "href": "/customers/bunlo9vk5f/address"
      }
    }
  }],
  "_links": {
    "self": {
      "href": "/customers?filter=&limit=1&offset=0"
    },
    "next": {
      "href": "/customers?filter=&limit=1&offset=1"
    }
  }
}

The self link in customers can be used to get more information about the customer with the ID bunlo9vk5f, the address.change link affords a way to change the customer address, and the self and next links at the end point at the current and next pagination chunks with offsets 0 and 1, respectively.
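On the client side, following the next link in such a response could look like the Python sketch below. The function name `next_page_url` is hypothetical; because the links in the listing are relative, the sketch resolves them against the URL of the original request.

```python
import json
from typing import Optional
from urllib.parse import urljoin


def next_page_url(request_url: str, body: str) -> Optional[str]:
    """Return the absolute URL of the next page, or None on the last page."""
    links = json.loads(body).get("_links", {})
    next_link = links.get("next")
    if next_link is None:
        return None  # no further result pages are advertised
    # Resolve the relative href against the URL that was requested.
    return urljoin(request_url, next_link["href"])
```

The client does not construct offsets itself; it only follows the link that the provider returned.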

Discussion

LINK ELEMENTS such as URIs are accurate. When structured nicely, URIs are human- and machine-readable; complex URI schemas are hard to maintain. A solution- or organization-wide URI scheme can promote consistency and ease of use. Using standardized link types such as those defined by IANA improves maintainability, as does structuring LINK ELEMENTS according to the “Web Linking” RFC 8288 [Nottingham 2017]. Using URIs exclusively for resource identification is a REST principle. Global addressability is achieved with decentralized naming.

The pattern solves the “global, timeless, and absolute” identification problem at the cost of a more complicated client-side programming model (which in turn is very flexible). Designing stable, secure URIs is nontrivial from a risk and effort point of view. LINK ELEMENTS such as URIs introduce security threats; therefore, the URIs must be included in the security design and testing efforts to ensure that invalid URIs do not crash the server or become entry doors for attackers.

The REST style as such does not distinguish between an ID ELEMENT and a LINK ELEMENT. This has advantages (supposed ease of use and guaranteed addressability), but also drawbacks (it is hard to change URLs). Once URIs have been used in LINK ELEMENTS, it becomes very risky and costly to change the URI scheme (the LINK LOOKUP RESOURCE pattern and HTTP redirects may come to the rescue). Humans browsing the Web can derive link information from the currently displayed HTML page and their intuition about the provided service (or consult the service documentation); API client programs and their developers cannot do this as easily.

Knowing the LINK ELEMENT is not enough to interact with a remote endpoint (such as a resource in RESTful HTTP or a SOAP operation); in addition, details about the endpoint are required to communicate successfully (for example, in RESTful HTTP, the HTTP verb, the request parameters, and the structure of the response body). To ease the communication of the additional details, these details should be defined in the API DESCRIPTION of the service linked to by the LINK ELEMENT and/or included in METADATA ELEMENTS at runtime.

Related Patterns

ID ELEMENT is a related pattern, providing uniqueness of local references to API elements. ID ELEMENTS do not contain network-accessible and therefore globally unique addresses. ID ELEMENTS typically also do not contain semantic type information, as we suggest including in LINK ELEMENTS. Both LINK ELEMENTS and ID ELEMENTS can be accompanied by METADATA ELEMENTS.

LINK ELEMENTS are often used to realize PAGINATION. They can also organize hypermedia-driven state transfers. Either locally valid ID ELEMENTS or full, globally valid LINK ELEMENTS might be returned by STATE CREATION OPERATIONS and STATE TRANSITION OPERATIONS. Using LINK ELEMENTS can be beneficial (or imperative) when realizing distributed business processes as an orchestrated set of STATE TRANSITION OPERATIONS exposed by one or more PROCESSING RESOURCES (such advanced use was discussed as frontend BPM and BPM services in Chapter 5, “Define Endpoint Types and Operations”).

“Linked Service” [Daigneau 2011] captures a related concept, the target of the LINK ELEMENT. A Pattern Language for RESTful Conversations [Pautasso 2016] features related patterns for RESTful integration such as “Client-side Navigation following Hyperlinks,” “Long Running Request,” and “Resource Collection Traversal.”

More Information

“Designing & Implementing Hypermedia APIs” [Amundsen 2013], a QCon presentation, is a good starting point for investigation. Many examples can be found in the GitHub repositories of the API Academy [API Academy 2022].

Chapter 5 in the RESTful Web Services Cookbook presents eight recipes for “Web Linking” [Allamaraju 2010]. For instance, Section 5.4 discusses how to assign link relation types. Chapter 4 in the same book advises on how to design URIs. Also see Chapter 12 of Build APIs You Won’t Hate [Sturgeon 2016b] for LINK ELEMENTS in HTTP resource APIs on maturity level 3.

The ALPS specification also deals with link representations. It is described, for instance, in Design and Build Great Web APIs [Amundsen 2020]. RFC 6906 is about the “profile” link relation type [Wilde 2013]. An Internet draft called JSON Hypertext Application Language suggests a media type for link relations. The REST Level 3 Web site [Bishop 2021] suggests profiles and patterns to realize HTTP LINK ELEMENTS.

Libraries and notations that implement the concept include HAL, Hydra [Lanthaler 2021], JSON-LD, Collection+JSON, and Siren; see Kai Tödter’s presentation, “RESTful Hypermedia APIs” [Tödter 2018], and Kevin Sookocheff’s blog post for an overview [Sookocheff 2014].

Special-Purpose Representations

Some element stereotypes are so prevalent in APIs and/or so multifaceted that they warrant their own pattern. One example is the API KEY, which is a mere atomic METADATA ELEMENT from a message representation perspective; however, its application in the security context adds unique forces that must be addressed. Both ERROR REPORT and CONTEXT REPRESENTATION comprise one or more representation elements. Another common trait of the three patterns in this section is a focus on API quality (continued and intensified in the next chapter).

You might be wondering why we touch on security considerations in a chapter on message representation design. We do not aim to provide a complete picture, but we feature API KEYS because they are widely known and used in various APIs. Security is a broad and important topic, and usually, more sophisticated security designs than mere API KEYS are required. We provide pointers to related information in the summary section that ends this chapter.

Pattern: API KEY

When and Why to Apply

An API provider offers services to subscribed, registered participants only. One or more clients have signed up that want to use the services. These clients have to be identified, for instance, to enforce a RATE LIMIT or to implement a PRICING PLAN.

How can an API provider identify and authenticate clients and their requests?

When identifying and authenticating clients on the API provider side, many questions arise:

  • How can client programs identify themselves at an API endpoint without having to store and transmit user account credentials?

  • How can the identity of an API client program be made independent of the client’s organization and program users?

  • How can varying levels of API authentication, depending on security criticality, be implemented?

Conflicts between security requirements and other qualities exist:

  • How can clients be identified and authenticated at an API endpoint while still keeping the API easy to use for clients?

  • How can endpoints be secured while minimizing performance impacts?

For example, the Twitter API offers an API endpoint to update the user status—which means sending a tweet. Only identified and authenticated users should be able to do that, and only for their own accounts.

  • Establishing basic security: An API serving subscribed clients has to associate incoming requests with the corresponding client. Not all API endpoints and operations have the same security requirements, though. For instance, an API provider might just want to enforce a RATE LIMIT, which requires some kind of identification but does not justify the introduction of high-fidelity security features.

  • Access control: Let customers control which API clients can access the service. Not all API clients might need the same permissions, so it should be possible to manage these in a fine-grained way.

  • Avoiding the need to store or transmit user account credentials: An API client could simply send the credentials (for example, a user identifier and password) for its user account with each request (for example, via basic HTTP authentication).6 However, these credentials are used not only for the API but also for account management, for example, to change the payment details. Sending these sensitive credentials through a nonencrypted channel or storing the credentials on a server as part of the API configuration introduces a significant security risk. A successful attack is much more severe if the attacker also gains access to the client’s account and, as a consequence, to billing records or other user-related information.

    6. Basic HTTP Authentication, described in RFC 7617 [Reschke 2015], is an “authentication scheme, which transmits credentials as user-id/password pairs, encoded using Base64.”

  • Decoupling clients from their organization: External attacks can be a major threat. Using the customer’s account credentials as an API security means would also give internal staff (such as system administrators and API developers) full account access, which is not needed. A solution should allow distinguishing the personnel who administrate and pay for an account from the development and operations teams that configure the client programs.

  • Security versus ease of use: An API provider wants to make it easy for its customers to access its service and get up to speed quickly. Forcing a complex and possibly onerous authentication scheme (for example, SAML,7 which provides powerful authentication functionality) on its clients might discourage them from using the API. Finding the right balance highly depends on the security requirements of the API.

    7. SAML, the Security Assertion Markup Language [OASIS 2005], is an OASIS standard for parties to exchange authentication and authorization information. One application of SAML is implementing single sign-on.

  • Performance: Securing an API can have an impact on the performance of the infrastructure: encrypting requests requires computing, and the data volumes increase with any additional payload transmitted for authentication and authorization purposes.

A rich portfolio of application-level security solutions addressing confidentiality, integrity, and availability (CIA) requirements is available. However, for a free and public API their management overhead and performance impact might not be economically feasible. For a SOLUTION-INTERNAL API or a COMMUNITY API, security could be implemented at the network level with a virtual private network (VPN) or a two-way Secure Sockets Layer (SSL). This approach complicates application-level usage scenarios such as enforcing RATE LIMITS.

How It Works

As an API provider, assign each client a unique token—the API KEY—that the client can present to the API endpoint for identification purposes.

Encode the API KEY as an ATOMIC PARAMETER, that is, a single plain string. This interoperable representation makes it easy to send the key in the request header, in the request body, or as part of a URL query string.8 Because of its small size, including it in every request causes only minimal overhead. Figure 6.7 shows an example of a request to a protected API that includes the API KEY b318ad736c6c844b in the Authorization header of HTTP.

8. For security reasons, sending a key in a URL query string is not recommended and should be used only as a last resort. Query strings often show up in log files or analytics tools, compromising the security of the API KEY.


Figure 6.7 API KEY example: HTTP GET with Bearer authentication

Before implementing a custom solution, check whether your framework, or a third-party extension, already offers support for working with API KEYS. Make sure to put automated integration or end-to-end tests into place to ensure that endpoints are accessible only with a valid API KEY.
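The provider-side check that such tests exercise can be sketched as follows. This is a minimal illustration, not a framework API: the function names `extract_api_key` and `status_for` are hypothetical, the headers are assumed to arrive as a plain dictionary, and header-name lookup is kept case-sensitive for brevity (real HTTP header names are case-insensitive).

```python
from typing import Optional


def extract_api_key(headers: dict) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, if present."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def status_for(headers: dict, valid_keys: set) -> int:
    """Map a request to 200 if it carries a known key and to 401 otherwise."""
    return 200 if extract_api_key(headers) in valid_keys else 401


# Usage: the key from Figure 6.7 is accepted, a request without a key is not.
known_keys = {"b318ad736c6c844b"}
print(status_for({"Authorization": "Bearer b318ad736c6c844b"}, known_keys))
print(status_for({}, known_keys))
```

An automated test would call each protected endpoint once with a valid key, once with an unknown key, and once without any key, and assert on the returned status codes.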

As the API provider, make sure that the generated API KEYS are unique and hard to guess. This can be achieved by using a serial number (to guarantee uniqueness) padded by random data and signed and/or encrypted with a private key (to prevent guessing). Alternatively, base the key on a UUID [Leach 2005]. UUIDs are easier to use in a distributed setting because there is no serial number to be synchronized across systems. However, UUIDs are not necessarily randomized;9 hence, they also require further obfuscation just like in the serial number scheme.

9. Version 1 UUIDs are a combination of timestamp and hardware addresses. The “Security Considerations” section in RFC 4122 [Leach 2005] warns: “Do not assume that UUIDs are hard to guess; they should not be used as security capabilities (identifiers whose mere possession grants access), for example.”
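One possible reading of the serial number scheme is sketched below. It is an assumption-laden illustration: instead of signing with a private key, it appends an HMAC-SHA256 tag computed with a provider-side secret, and the names `SIGNING_SECRET`, `issue_api_key`, and `is_well_formed` are hypothetical. The serial number guarantees uniqueness, the random padding makes the key hard to guess, and the tag lets the provider reject forged keys before any database lookup.

```python
import base64
import hashlib
import hmac
import secrets

# Hypothetical provider-side secret; in practice it would be loaded from a secret store.
SIGNING_SECRET = secrets.token_bytes(32)


def issue_api_key(serial: int) -> str:
    """Build a key from a unique serial number, random padding, and an HMAC tag."""
    body = serial.to_bytes(8, "big") + secrets.token_bytes(16)
    tag = hmac.new(SIGNING_SECRET, body, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(body + tag).decode("ascii")


def is_well_formed(api_key: str) -> bool:
    """Recompute the tag so that forged or mistyped keys are rejected early."""
    try:
        raw = base64.urlsafe_b64decode(api_key.encode("ascii"))
    except ValueError:  # covers malformed Base64 and non-ASCII input
        return False
    if len(raw) != 8 + 16 + 32:
        return False
    body, tag = raw[:24], raw[24:]
    expected = hmac.new(SIGNING_SECRET, body, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)
```

A well-formed key still has to be looked up to find the client and to check that it has not been revoked; the tag only filters out guesses cheaply.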

An API KEY can also be combined with an additional secret key to ensure the integrity of requests. The secret key is shared between the client and the server but never transmitted in API requests. The client uses this key to create a signature hash of the request and sends the hash along with the API KEY. The provider can identify the client with the provided API KEY, calculate the same signature hash using the shared secret key, and compare the two. This ensures that the request was not tampered with. For instance, Amazon uses such symmetric cryptography to secure access to its Elastic Compute Cloud.
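The signing and verification steps can be sketched with a keyed hash. The choice of HMAC-SHA256, the fields covered by the signature (method, path, body), and the function names are assumptions made for illustration; real providers define their own canonical request format.

```python
import hashlib
import hmac


def sign_request(secret_key: bytes, method: str, path: str, body: bytes) -> str:
    """Client side: compute a hex HMAC-SHA256 signature over the protected request parts."""
    message = b"\n".join([method.upper().encode(), path.encode(), body])
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify_request(secrets_by_api_key: dict, api_key: str, signature: str,
                   method: str, path: str, body: bytes) -> bool:
    """Provider side: find the shared secret via the API key and recompute the signature."""
    secret_key = secrets_by_api_key.get(api_key)
    if secret_key is None:
        return False
    expected = sign_request(secret_key, method, path, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

Only the API KEY and the signature travel with the request; the secret key itself stays on both sides. Adding a timestamp or nonce to the signed message would additionally limit replay of captured requests.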

Example

The following call to a PROCESSING RESOURCE in the Cloud Convert API initiates the conversion of a .docx file from Microsoft Word into PDF. The client creates a new conversion process by informing the provider of the desired input and output format in a STATE CREATION OPERATION. These formats are passed as two ATOMIC PARAMETERS in the body of the request; the input file then has to be provided by a second call to a STATE TRANSITION OPERATION in the same API:

curl -X POST https://api.cloudconvert.com/process \
--header 'Authorization: Bearer gqmbwwB74tToo4YOPEsev5' \
--header 'Content-Type: application/json' \
--data '
{
    "inputformat": "docx",
    "outputformat": "pdf"
}'

For billing purposes, the client identifies itself by passing the API KEY gqmbwwB74tToo4YOPEsev5 in the Authorization header of the request, according to the HTTP/1.1 Authentication RFC 7235 specification [Fielding 2014b]. HTTP supports various types of authentication; here the RFC 6750 [Jones 2012] Bearer type is used. The API provider can thus identify the client and charge their account. The response contains an ID ELEMENT to represent the specific process, which can then be used to retrieve the converted file.

Discussion

An API KEY is a lightweight alternative to a full-fledged authentication protocol and balances basic security requirements with the desire to minimize management and communication overhead.

Having the API KEY as a shared secret between the API endpoint and the client, the endpoint can identify the client making the call and use this information to further authenticate and authorize the client. Using a separate API KEY instead of the customer’s account credentials decouples different customer roles, such as administration, business management, and API usage, from each other. This makes it possible to let the customer create and manage multiple API KEYS, for example, to be used in different client implementations or locations, with varying permissions associated with them. In the case of a security breach or leak, an API KEY can also be revoked and a new one generated independently of the client account. A provider might also give clients the option to use multiple API KEYS with different permissions or provide analytics (for example, the number of API calls performed) and RATE LIMITS per API KEY. Because the API KEY is small, it can be included in each request without impacting performance much.
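The decoupling of keys from the account can be illustrated with a small in-memory registry. The class name `ApiKeyRegistry`, its methods, and the permission strings are hypothetical; a real provider would persist this data and would typically store only a hash of each key.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class ApiKeyRegistry:
    """In-memory stand-in for a provider's key store (illustrative only)."""
    _keys: dict = field(default_factory=dict)  # api_key -> (account_id, permissions)

    def create_key(self, account_id: str, permissions: set) -> str:
        """Issue a new random key for the account with its own permission set."""
        api_key = secrets.token_urlsafe(24)
        self._keys[api_key] = (account_id, frozenset(permissions))
        return api_key

    def revoke_key(self, api_key: str) -> None:
        """Invalidate one key; the account and its other keys are unaffected."""
        self._keys.pop(api_key, None)

    def is_allowed(self, api_key: str, permission: str) -> bool:
        """Check whether the key exists and carries the requested permission."""
        entry = self._keys.get(api_key)
        return entry is not None and permission in entry[1]


# Usage: one account, two keys with different permissions, one of them revoked.
registry = ApiKeyRegistry()
build_server_key = registry.create_key("account-1", {"convert"})
dashboard_key = registry.create_key("account-1", {"read-stats"})
registry.revoke_key(build_server_key)
print(registry.is_allowed(dashboard_key, "read-stats"))
```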

The API KEY is a shared secret, and because it is transported with each request, it should be used only over a secure connection such as HTTPS. If this is not possible, additional security measures (VPN, public-key cryptography) have to be used to protect it and to satisfy the overall security requirements (such as confidentiality and nonrepudiation). Configuring and using secure protocols and other security measures has a certain configuration management and performance overhead.

An API KEY is just a simple identifier that cannot be used to transport additional data or metadata elements such as an expiration time or authorization tokens.

Even when combined with a secret key, API KEYS might be insufficient or impractical as the sole means of authentication and authorization. API KEYS are also not meant to authenticate and authorize users of the application. Consider the case where three parties are involved in a conversation: the user, the service provider, and a third party that wants to interact with the service provider on behalf of the user. For example, a user might want to allow a mobile app to store its data on the user’s Dropbox account. In this case, API KEYS cannot be used if the user does not want to share them with the third party. One should consider using OAuth 2.0 [Hardt 2012] and OpenID Connect [OpenID 2021] instead in this (and many other) scenarios.

More secure alternatives to API KEYS are full-fledged authentication or authorization protocols, where authorization protocols include authentication functionality. Kerberos [Neuman 2005] is an authentication protocol that is often used inside a network to provide single sign-on. Combined with Lightweight Directory Access Protocol (LDAP) [Sermersheim 2006], it can also provide authorization. LDAP itself offers authorization as well as authentication capabilities. Other examples of point-to-point authentication protocols are Challenge-Handshake Authentication Protocol (CHAP) [Simpson 1996] and Extensible Authentication Protocol (EAP) [Vollbrecht 2004]. We come back to this discussion in the chapter summary.

Related Patterns

Many Web servers use session identifiers [Fowler 2002] to maintain and track user sessions across multiple requests; the concept is similar to that of API KEYS. In contrast to API KEYS, session identifiers are used for only a single session and then discarded.

Security Patterns [Schumacher 2006] provides solutions satisfying security requirements such as CIA and discusses their strengths and weaknesses in detail. Access control mechanisms such as role-based access control (RBAC) and attribute-based access control (ABAC) can complement API KEYS and other approaches to authentication. These access control practices require one of the described authentication mechanisms to be in place.

More Information

The OWASP API Security Project [Yalon 2019] and “REST Security Cheat Sheet” [OWASP 2021] should be consulted when securing HTTP resource APIs. The cheat sheet has a section on API KEYS and offers other valuable security advice as well.

Chapter 15 in Principles of Web API Design addresses ways to protect APIs [Higginbotham 2021]. Chapter 12 of the RESTful Web Services Cookbook [Allamaraju 2010] is dedicated to security and presents six related recipes. “A Pattern Language for RESTful Conversations” [Pautasso 2016] covers two related patterns of alternative authentication mechanisms in a RESTful context, “Basic Resource Authentication” and “Form-Based Resource Authentication.”

Pattern: ERROR REPORT

When and Why to Apply

Communication participants have to reliably manage unexpected situations at runtime. For instance, a client has called an API, but the API provider is not able to process this request successfully. The failure could be caused by incorrect request data, invalid application state, missing access rights, or numerous other problems that could be the fault of the client, the provider and its backend implementation, or the underlying communications infrastructure (including the network and intermediaries).

How can an API provider inform its clients about communication and processing faults? How can this information be made independent of the underlying communication technologies and platforms (for example, protocol-level headers representing status codes)?

  • Expressiveness and target audience expectations: The target audience of fault information includes developers and operators as well as help desk and other supporting personnel (in addition to middleware, tools, and application programs). Elaborate error messages suggest better maintainability and evolvability; the more they explain, the more helpful they can be when fixing defects because they reduce the effort for finding root causes of failures. However, error messages should not assume any consumer-side context or usage scenario or technology skills due to the diversity of the target audience. They have to find a balance between expressiveness and compactness (brevity); chatty explanations that contain unfamiliar jargon might confuse some recipients and cause a “too long; didn’t read” reaction.

  • Robustness and reliability: The desire to increase robustness and reliability is the main decision driver when introducing any kind of error reporting and handling. Error reports must cover many different cases, including errors that occur during error handling and reporting. They should help manage the system and help fix defects.

  • Security and performance: Error codes or messages should be expressive and meaningful to their consumers, but they must not unveil any provider-side implementation details for security and data privacy reasons.10 Provoking errors can be used for denial-of-service attacks. API providers have to keep track of their performance budgets when reporting errors, security being one reason. Provider-side logging and monitoring also have performance (and storage) costs attached.

    10. When did you last see a SQL exception with full server-side stack trace in a Web page?

  • Interoperability and portability: When reporting errors, the means of the underlying technology should be taken into account. For example, when using HTTP, a suitable response status code allows others (for example, monitoring tools) to make sense of the error. However, to avoid unnecessarily tight couplings, it should not be the sole means of communicating errors. Protocol, format, and platform/technology autonomy as facets of loose coupling [Fehling 2014] should be preserved.

  • Internationalization: Most developers are used to English error messages; if such messages reach end users and administrators, they have to be translated to achieve natural language support (NLS) and to support internationalization.

How It Works

Reply with error codes in response messages that indicate and classify the faults in a simple, machine-readable way. In addition, add textual descriptions of the errors for the API client stakeholders, including developers and/or human users such as administrators.

The ERROR REPORT information takes the structure of an ATOMIC PARAMETER LIST, a two-tuple comprising an error code (which may take the form of an ID ELEMENT) and a textual description. The error codes can be the same as those of the protocol or transport layer, such as HTTP 4xx status codes.

The ERROR REPORT can also contain a correlating ID ELEMENT that allows the provider to analyze a failed request internally; the CONTEXT REPRESENTATION pattern realizes such design in a platform-neutral way. Timestamps are another common information element in ERROR REPORTS.

Figure 6.8 illustrates the solution building blocks.


Figure 6.8 ERROR REPORT pattern, providing machine- and human-readable information, including provenance metadata
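The building blocks named above (error code, textual description, optional correlating identifier and timestamp) can be sketched as a small data structure. The class and field names are assumptions chosen for illustration; the pattern itself does not prescribe them.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ErrorReport:
    code: str                             # machine-readable classification
    message: str                          # human-readable explanation
    correlation_id: Optional[str] = None  # lets the provider trace the failed request
    timestamp: Optional[str] = None       # when the fault was detected

    def to_json(self) -> str:
        """Serialize for the response payload, leaving out unset optional elements."""
        fields = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(fields)


# Usage with hypothetical values.
report = ErrorReport(
    code="AUTH_FAILED",
    message="Access Denied",
    correlation_id="req-42",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
print(report.to_json())
```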

Example

Customers logging in to their Lakeside Mutual accounts have to provide their username and password:

curl -i -X POST \
  --header 'Content-Type: application/json' \
  --data '{"username":"xyz","password":"wrong"}' \
  http://localhost:8080/auth

If the credentials are not correct, an HTTP 401 error is returned along with a more detailed response rendered as a JSON object, both assembled by the Spring framework in this example (the status code is repeated and explained with two texts):

HTTP/1.1 401
Content-Type: application/json;charset=UTF-8
Date: Wed, 20 Jun 2018 08:25:10 GMT

{
  "timestamp": "2018-06-20T08:25:10.212+0000",
  "status": 401,
  "error": "Unauthorized",
  "message": "Access Denied",
  "path": "/auth"
}

Similarly, consider the case in which the client does not specify the content type of the request body:

curl -i -X POST --data '{"username":"xyz","password":"wrong"}' \
http://localhost:8080/auth

Then the provider will answer with an appropriate error message (again using the Spring defaults):

HTTP/1.1 415
Date: Wed, 20 Jun 2018 08:29:09 GMT

{
  "timestamp": "2018-06-20T08:29:09.452+0000",
  "status": 415,
  "error": "Unsupported Media Type",
  "message": "Content type 'application/x-www-form-urlencoded;charset=UTF-8' not supported",
  "path": "/auth"
}

The message tells the developer that the (default) content type of application/x-www-form-urlencoded is not supported by this endpoint. The Spring framework allows customizing the default error reporting.

Discussion

An ERROR REPORT that contains a code allows the API consumer to handle the error programmatically and to present a human-readable message to the end user. By including a textual error message, the error can be explained in more detail than with a protocol- or transport-level code. An elaborate ERROR REPORT response can also contain hints to solve the problem that led to the error, following the conventions from first aid/emergency (911) calls: what happened to whom, where, and when.

Compared to a simple numeric error code, a detailed textual message is at higher risk of accidentally exposing provider-side implementation details or other sensitive data. For example, when informing about a failed login attempt, such a message should not reveal whether the used user ID (for example, an email) actually maps to an account or not, in order to make brute force attacks harder. The textual error message might also have to be internationalized if it can reach a human user.

Explicit error reporting leads to better maintainability and evolvability, and the more it explains errors and thus reduces the effort in the task of finding the cause of a defect, the more effective it is; thus, the ERROR REPORT pattern is more effective in this regard than simple protocol-level error codes. ERROR REPORT also has better interoperability and portability properties, as it promotes protocol, format, and platform autonomy. However, the more elaborate error messages can reveal information that is sensitive with regard to security; such revealing of detailed information about system internals opens up attack vectors.

Transport-level codes can still be used in addition to the payload ERROR REPORTS that aim at becoming independent of the transport protocol. Payload ERROR REPORTS can describe a finer-grained set of errors than possible with a predefined set of transport-level error categories; reporting communication problems with transport-level codes and application/endpoint processing problems in the payload is in line with the general separation of concerns principle.

If the API is capable of responding with an internationalized message, it might be tempting to leave out the error code. But this forces any nonhuman consumer to parse the error message to find out what went wrong; therefore, an error report should always include error codes that are easily machine-readable. Furthermore, this ensures that the client developer can change messages presented to human users.
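A client-side sketch shows why the code matters: the client selects its own text by error code and uses the provider's message only as a fallback. The error codes, translations, and the field names `errorCode` and `errorMessage` are assumptions made for this illustration.

```python
# Hypothetical error codes and client-side translations.
MESSAGES = {
    "AUTH_FAILED": {"en": "Login failed.", "de": "Anmeldung fehlgeschlagen."},
    "UNSUPPORTED_MEDIA_TYPE": {
        "en": "Unsupported content type.",
        "de": "Inhaltstyp wird nicht unterstützt.",
    },
}


def user_message(error_report: dict, locale: str = "en") -> str:
    """Pick a text by error code and locale; fall back to the provider's message."""
    translations = MESSAGES.get(error_report.get("errorCode"), {})
    return (
        translations.get(locale)
        or translations.get("en")
        or error_report.get("errorMessage", "Unknown error")
    )


# Usage: the same report yields different texts for different audiences.
report = {"errorCode": "AUTH_FAILED", "errorMessage": "Access Denied"}
print(user_message(report, "de"))
```

Without the code, the client would have to match on the provider's message text, which breaks as soon as the provider rewords or translates it.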

When reporting errors that occurred when processing a REQUEST BUNDLE, it is desirable to report the error status or success both per entry in the bundle and for the entire bundle. Different options exist; for instance, the error report for an entire request batch can be combined with an associative array of individual error reports that are accessible via a request ID.
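One of these options, an overall status combined with an associative array of individual reports keyed by request ID, might look like the following sketch. The status values and field names (`bundleStatus`, `errorsByRequestId`) are hypothetical.

```python
def bundle_error_report(results: dict) -> dict:
    """Summarize a bundle: results maps a request ID to None (success) or an error report."""
    errors = {rid: err for rid, err in results.items() if err is not None}
    if not errors:
        status = "SUCCESS"
    elif len(errors) == len(results):
        status = "FAILURE"
    else:
        status = "PARTIAL_FAILURE"
    return {"bundleStatus": status, "errorsByRequestId": errors}


# Usage: two of three bundled requests succeeded.
summary = bundle_error_report({
    "r1": None,
    "r2": {"errorCode": "NOT_FOUND", "errorMessage": "Unknown customer"},
    "r3": None,
})
print(summary["bundleStatus"])
```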

Related Patterns

An ERROR REPORT can be part of the CONTEXT REPRESENTATION in response messages. It may contain METADATA ELEMENTS, for instance, those that inform about next possible steps (to work around the reported problem or to correct it).

The “Remoting Error” pattern [Voelter 2004] contains a generalized and more low-level notion of this pattern, focused on the viewpoint of distributed system middleware.

Error reporting is an important building block in making API implementations robust and resilient. Many more patterns are required for a full solution, for instance, “Circuit Breakers,” first described in [Nygard 2018a]. The systems management category [Hohpe 2003] contains related patterns such as “Dead Letter Channel.”

More Information

See Chapter 4 of Build APIs You Won’t Hate [Sturgeon 2016b] for detailed coverage of error reporting in the context of RESTful HTTP.

Production readiness in general is covered in Production-Ready Microservices: Building Standardized Systems across an Engineering Organization [Fowler 2016].

RFC 7807 proposes a standard format for machine-readable error details for HTTP APIs. In addition to using the HTTP status code, the response body contains an ATOMIC PARAMETER LIST providing information about the type of error that occurred (in the form of a URI), a title describing the problem category, and a detail element with a more elaborate problem description that includes data of the request.
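A response in this format can be assembled as sketched below. The member names `type`, `title`, `status`, and `detail` and the media type `application/problem+json` come from RFC 7807; the helper name `problem_details`, the URI, and the `balance` extension member are illustrative assumptions.

```python
import json


def problem_details(type_uri: str, title: str, status: int, detail: str, **extensions):
    """Return the headers and JSON body of an RFC 7807 problem details response."""
    body = {"type": type_uri, "title": title, "status": status, "detail": detail}
    body.update(extensions)  # problem types may define additional members
    return {"Content-Type": "application/problem+json"}, json.dumps(body)


# Usage with a hypothetical problem type.
headers, body = problem_details(
    "https://example.com/probs/out-of-credit",
    "You do not have enough credit.",
    403,
    "Your current balance is 30, but that costs 50.",
    balance=30,
)
print(body)
```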

Pattern: CONTEXT REPRESENTATION

When and Why to Apply

An API endpoint and its operations have been defined. Context information has to be exchanged between API client and provider. Examples of such context information are client location and other API user profile data, the preferences forming a WISH LIST, or quality-of-service (QoS) controls such as credentials used to authenticate, authorize, and bill clients. Such credentials may be API KEYS or JSON Web Token (JWT) claims.

How can API consumers and providers exchange context information without relying on any particular remoting protocols?

Important examples of remoting protocols are application protocols such as HTTP or transport protocols such as TCP. In the context of this pattern, we assume that a concrete protocol has not been selected yet, but it is already clear that some QoS guarantees have to be delivered.

Interactions between API client and API provider might be part of conversations and consist of multiple related operation calls. API providers can also act as API clients that consume services provided by other APIs (in their implementations) to create operation invocation sequences. Some parts of the context information might be local to single operations; others might be shared and handed over from operation invocation to operation invocation in such conversations.

How can identity information and quality properties in a request be made visible to related subsequent requests in conversations?

  • Interoperability and modifiability: Requests may cross multiple compute nodes and travel over different communication protocols on the way from client to provider; the same is true for responses on the way back. It is difficult to ensure that control information exchanged between consumer and provider is able to pass each kind of intermediary (including gateways and service buses) in a distributed system successfully but remains unmodified when the underlying protocol switches. The existence and semantics of predefined protocol headers may change as protocols evolve. Modifiability as a maintainability concern has a business domain and a platform technology facet; here, we are interested in upgradability in particular. A decision about centralization or decentralization of context information may have an impact on this quality.

  • Dependency on evolving protocols: The history of distributed systems and software engineering suggests that protocols and formats keep changing (with a few notable exceptions such as TCP). For example, lightweight messaging protocols such as MQTT can be found in addition to HTTP in Internet-of-Things scenarios. Using protocol-specific headers gives the API client and provider developers maximum control over what happens during transport and saves them from having to implement the QoS property transport and usage themselves. However, this choice also introduces an extra dependency with associated learning effort. In case a protocol is replaced by another as the API evolves, extra maintenance effort is required to port the API implementation.

    To promote protocol independence and a platform-independent design, the default headers and header extension capabilities available in the underlying communication protocol sometimes should not be used.

  • Developer productivity (control versus convenience): Not all API clients and providers have the same integration requirements, and not all of their programmers can be expected to be protocol, networking, or remote communication experts.11 Hence, a control versus convenience trade-off exists when it comes to defining and transporting QoS information and other forms of control metadata: using protocol headers is convenient and makes it possible to leverage protocol-specific frameworks, middleware, and infrastructure (such as load balancers and caches), but it delegates control to the protocol designers and implementers. A custom approach maximizes control but causes development and test effort.

    11. Despite the notion of “full stack developers,” which is often mentioned today.

  • Diversity of clients and their requirements: When different clients use the services of an API for varying use cases, possibly under other circumstances and at different times, some generalization takes place, and points of variability are introduced. In such settings, application- and infrastructure-level context information about the client may be required to route and process requests in client-specific ways, log activities systematically for offline analysis, or propagate security credentials. For example, banking regulations may allow storing and accessing customer data only in the customer’s country. Multinational banks then have to make sure to protect the data accordingly. This can be achieved by putting the client’s country in the context and routing all requests accordingly to the correct national customer management system instance.

  • End-to-end security (across services and protocols): To achieve end-to-end security, tokens and digital signatures must be transported across multiple nodes. Such security credentials are a typical type of metadata that the consumer and provider have to exchange directly; intermediaries and protocol endpoints would break the desired end-to-end security.

  • Logging and auditing on business domain level (across invocations): A business transaction identifier is typically generated when a user request arrives at the first point of contact in a larger distributed system such as a multi-tier enterprise application. This ID ELEMENT is then included in all requests to backend systems, which yields a full audit trace of user requests. For instance, an API Design Guide from Cisco introduces a custom HTTP header called TrackingID for this purpose [Cisco Systems 2015]. This works well if HTTP is used for all message exchanges, but what happens to the TrackingID if protocols are switched as we move down an invocation hierarchy?

How It Works

Combine and group all METADATA ELEMENTS that carry the desired information into a custom representation element in request and/or response messages. Do not transport this single CONTEXT REPRESENTATION in protocol headers, but place it in the message payload.

Separate global from local context in a conversation by structuring the CONTEXT REPRESENTATION accordingly. Position and mark the consolidated CONTEXT REPRESENTATION element so that it is easy to find and distinguish from other DATA ELEMENTS.

The pattern can be realized by defining a PARAMETER TREE to encapsulate METADATA ELEMENTS that comprise the custom CONTEXT REPRESENTATION. Figure 6.9 shows a solution sketch in UML. The resulting PARAMETER TREE structure typically is of low to medium complexity (in terms of nesting level and element cardinalities). While PARAMETER TREES are a common choice, a simple ATOMIC PARAMETER LIST can be used alternatively if the requirements only ask for numbers or enumerations (for instance, keyword classifiers or product codes in the context of a shop API).


Figure 6.9 CONTEXT REPRESENTATION

Examples of the included METADATA ELEMENTS are priority classifiers, session identifiers, correlation identifiers, as well as logical clock values and timers used, for instance, for coordination and correlation purposes (in both request and response messages). Location data, locale, client version, operating system requirements (and so on) also qualify as context information about requests.

One should use the same structure and location across all operations in an API to make the CONTEXT REPRESENTATIONS easy to locate, understand, and process. If the context information differs substantially across operations in an endpoint, an abstraction-refinement hierarchy may model commonalities and variabilities; optional fields and default values may be used as well (which adds development and test effort).
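The advice to keep the context in one well-known place in the payload can be sketched in a few lines. The key names `requestContext` and `data` and both function names are assumptions for illustration; any protocol able to carry the JSON text can transport the message.

```python
import json


def build_request(context: dict, data: dict) -> str:
    """Client side: place the context next to the business data, always under the same key."""
    return json.dumps({"requestContext": context, "data": data})


def extract_context(raw_message: str) -> dict:
    """Provider side: the context is found at the same location for every operation."""
    return json.loads(raw_message).get("requestContext", {})


# Usage: the context travels in the payload, not in protocol headers.
message = build_request(
    {"apiKey": "b318ad736c6c844b", "locale": "de-CH"},
    {"searchParameters": ["zurich"]},
)
print(extract_context(message)["locale"])
```

Because nothing in the message depends on HTTP headers, the same text could be sent over a message queue or another protocol without losing the context.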

Variants

In some settings, context information is processed only locally by the API provider implementation; other context information is passed on to backend systems (with the API provider taking a client role). Some context information may be relevant only for the current call, while other context information is used to coordinate subsequent calls to the same API endpoint.

Hence, two variants of this pattern exist: Global Context Representations (in conversations) and Local Context Representations. API designers are usually concerned about reducing the chattiness of their APIs. However, multiple operations still have to be called in certain scenarios. This can happen in the form of nesting calls. For instance, a microservice might be invoked that calls another service, which possibly calls yet another one. A deep hierarchy makes it difficult to achieve end-to-end reliability, understandability, and performance—especially when the calls are synchronous. In other scenarios, services may have to be called in a particular order, for instance, to realize complex business processes or login procedures (such as fetching an authorization token before a business operation can be called). In both cases, it is necessary to carry context information over to the following API calls. For example, a user credential (or token) might have to be passed on after its creation. The business process identifier (ID) or the original transaction might have to be delegated to services deeper in the call hierarchy to guarantee correct request authorization. Tracing and logging throughout a conversation benefits from such context handover.

Figure 6.10 visualizes operation call nesting.


Figure 6.10 An API provider also acting as API client, requiring context information

When sharing context information as desired, a context can comprise different scopes. Information in the context can be classified as local or global. The local context contains information valid only to this request. This can be message IDs, usernames, time-to-live of this message, and so on. The global context contains information that is valid for longer than a single request, for example, in the context of nested operation calls or within a long-running business process. As mentioned earlier, authentication tokens that are delegated across multiple calls, global transactions, or business process identifiers are examples of context information typically found in a global context. Figure 6.11 illustrates.


Figure 6.11 Scopes for context: Global (conversation) and local (operation, request/response)

This division into local (operation/message-level, that is) and global contexts shared among distributed communication participants is beneficial for reasoning about the stakeholders and lifetime of the context information. The global context is often handled via application-level intermediaries (for instance, API gateways validating, transforming, and/or routing requests) because it is standardized, and the handling of the information is repetitive. Libraries and framework components (such as annotation processors in application servers) can process it alternatively. By contrast, information in the local context is processed by libraries or frameworks at the API implementation level (for instance, server-side support for HTTP and container frameworks such as Spring). The message payload is then analyzed and processed in the API provider implementation.
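The handover of context in nested calls can be sketched as follows: an API provider that acts as a client copies the global part unchanged and creates a fresh local part for the outgoing request. The structure with `global` and `local` keys, the element names, and the function name are hypothetical.

```python
import uuid


def derive_downstream_context(incoming: dict) -> dict:
    """Carry over the global context unchanged; create a new local context per request."""
    return {
        "global": dict(incoming.get("global", {})),   # valid for the whole conversation
        "local": {"messageId": str(uuid.uuid4())},    # valid for this request only
    }


# Usage: the business process identifier survives, the message ID does not.
incoming = {
    "global": {"businessProcessId": "bp-2021-0815", "authToken": "token-placeholder"},
    "local": {"messageId": "m-1", "timeToLive": 30},
}
outgoing = derive_downstream_context(incoming)
print(outgoing["global"]["businessProcessId"])
```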

Example

The following service contract sketch introduces a custom CONTEXT REPRESENTATION called RequestContext in the payload of the request message of the getCustomerAttributes operation. It is decorated with the stereotype <<Context_Representation>> and therefore easily recognizable in the request payload. The API contract notation used in the example is Microservice Domain-Specific Language (MDSL). An MDSL primer and reference is presented in Appendix C:

API description ContextRepresentationExample

data type KeyValuePair P // not specified further
data type CustomerDTO P // not specified further

data type RequestContext {
    "apiKey":ID<string>,
    "sessionId":D<int>?,
    "qosPropertiesThatShouldNotGoToProtocolHeader":KeyValuePair*}

endpoint type CustomerInformationHolderService
  exposes

  operation getCustomerAttributes
    expecting payload {
      <<Context_Representation>> {
        "requestContextSharedByAllOperations": RequestContext,
        <<Wish_List>>"desiredCustomerAttributes":ID<string>+
      },
      <<Data_Element>> "searchParameters":D<string>*
    }
    delivering payload {
      <<Context_Representation>> {
        <<Metadata_Element>> {
          "billingInfo": D<int>,
          "moreAnalytics":D},
        <<Error_Report>> {
          "errorCode":D<int>,
          "errorMessage":D<string>}
      }, {
        <<Pagination>> {
          "thisPageContent":CustomerDTO*,
          "previousPage":ID?,
          "nextPage":ID?}
      }
    }

The RequestContext contains an API KEY as well as a sessionId ID ELEMENT (to be created by the provider upon successful authentication). Additional freeform headers can be added in the key-value part of it. The response payload of getCustomerAttributes contains a second use of the pattern. Note that the example also features three additional patterns: WISH LIST, ERROR REPORT, and PAGINATION.

When the MDSL contract is transformed into OpenAPI, the preceding example can be rendered as YAML, like this:

openapi: 3.0.1
info:
  title: ContextRepresentationExample
  version: "1.0"
servers: []
tags:
- name: CustomerInformationHolderService
  externalDocs:
    description: The role of this endpoint is not specified.
    url: ""

paths:
  /CustomerInformationHolderService:
    post:
      tags:
      - CustomerInformationHolderService
      summary: POST
      description: POST
      operationId: getCustomerAttributes
      requestBody:
        content:
          application/json:
            schema:
              type: object
              properties:
                anonymous1:
                  type: object
                  properties:
                    requestContextSharedByAllOperations:
                      $ref: '#/components/schemas/RequestContext'
                    desiredCustomerAttributes:
                      minItems: 1
                      type: array
                      items:
                        type: string
                searchParameters:
                  type: array
                  items:
                    type: string
      responses:
        "200":
          description: getCustomerAttributes successful execution
          content:
            application/json:
              schema:
                type: object
                properties:
                  anonymous2:
                    type: object
                    properties:
                      anonymous3:
                        type: object
                        properties:
                          billingInfo:
                            type: integer
                            format: int32
                          moreAnalytics:
                            type: string
                      anonymous4:
                        type: object
                        properties:
                          errorCode:
                            type: integer
                            format: int32
                          errorMessage:
                            type: string
                  anonymous5:
                    type: object
                    properties:
                      anonymous6:
                        type: object
                        properties:
                          thisPageContent:
                            type: array
                            items:
                              $ref: '#/components/schemas/CustomerDTO'
                          previousPage:
                            type: string
                            format: uuid
                            nullable: true
                          nextPage:
                            type: string
                            format: uuid
                            nullable: true
components:
  schemas:
    KeyValuePair:
      type: object
    CustomerDTO:
      type: object
    RequestContext:
      type: object
      properties:
        apiKey:
          type: string
        sessionId:
          type: integer
          format: int32
          nullable: true
        qosPropertiesThatShouldNotGoToProtocolHeader:
          type: array
          items:
            $ref: '#/components/schemas/KeyValuePair'

The MDSL specification is much shorter than the OpenAPI one generated from it.
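
To make the contract more tangible, the following Python sketch assembles a request payload that follows the OpenAPI rendering of getCustomerAttributes, including the generated anonymous1 wrapper. Several details are assumptions made for illustration only: the helper function name, the sample values, and the key/value shape of KeyValuePair, which the contract leaves unspecified.

```python
import json


def build_get_customer_attributes_request(api_key, desired_attributes,
                                          search_parameters=(),
                                          session_id=None,
                                          qos_properties=()):
    """Assemble a request body with the context kept in one place."""
    if not desired_attributes:
        # The contract demands at least one entry (ID<string>+, minItems: 1).
        raise ValueError("desiredCustomerAttributes requires at least one entry")
    request_context = {
        "apiKey": api_key,
        "sessionId": session_id,  # optional in the contract, hence nullable
        # Assumed shape: KeyValuePair is not specified further in the contract.
        "qosPropertiesThatShouldNotGoToProtocolHeader": [
            {"key": key, "value": value} for key, value in qos_properties
        ],
    }
    return {
        "anonymous1": {
            "requestContextSharedByAllOperations": request_context,
            "desiredCustomerAttributes": list(desired_attributes),
        },
        "searchParameters": list(search_parameters),
    }


payload = build_get_customer_attributes_request(
    api_key="example-api-key",
    desired_attributes=["name", "city"],
    search_parameters=["zip=8640"],
    qos_properties=[("priority", "high")],
)
print(json.dumps(payload, indent=2))
```

All context information travels in a single, easily recognizable part of the payload, separate from the searchParameters data.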

Discussion

Use of this pattern not only promotes context METADATA ELEMENTS from protocol headers into payloads but also does so in a non-scattered way. The information in the CONTEXT REPRESENTATION may deal with runtime QoS such as priority classifiers; control metadata and provenance metadata are often included in CONTEXT REPRESENTATIONS appearing in request messages. Exchanging aggregated metadata such as result counts in responses is also possible but is less common.

By representing control information and other metadata in a common form as part of the payload, API client and provider can be isolated from changes in the underlying protocol or technology used (for instance, if different protocols such as plain HTTP, AMQP, WebSockets, or gRPC are used). A dependency on a single protocol header format (and protocol support for it) is avoided. A single request traveling through a gateway or proxy could be switched from one protocol to another and therefore lose or modify its original protocol header information along the way. For example, the gRPC-Gateway project [gRPC-Gateway 2022] generates a reverse-proxy server that translates a RESTful JSON API into gRPC; HTTP headers are mapped to gRPC request headers by the proxy. Regardless of such a protocol switch, header information in the payload stays the same and reaches its recipient.
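
The effect of such a protocol switch can be simulated with a small sketch. The gateway function below is purely hypothetical and does not model gRPC-Gateway or any real product; it is assumed to forward the message body untouched and to carry over only the headers it knows. The header and field names are invented for the illustration.

```python
import json


def http_to_messaging(http_request: dict) -> dict:
    """Hypothetical gateway: forwards the body, keeps only headers it knows."""
    known_headers = {"Content-Type": "content_type"}
    properties = {
        known_headers[name]: value
        for name, value in http_request["headers"].items()
        if name in known_headers
    }
    return {"properties": properties, "body": http_request["body"]}


body = json.dumps({
    "requestContext": {"priority": "high", "businessProcessId": "process-42"},
    "searchParameters": ["zip=8640"],
})
http_request = {
    "headers": {"Content-Type": "application/json", "X-Priority": "high"},
    "body": body,
}

message = http_to_messaging(http_request)
received_context = json.loads(message["body"])["requestContext"]
```

In this simulation, the custom X-Priority header does not survive the translation, whereas the priority entry in the payload context arrives unchanged.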

The introduction of a shared/standardized CONTEXT REPRESENTATION pays off if the information needs of clients and consumers are similar or identical across the entire endpoint or API. If an API is served by only a single transport protocol, an explicit, custom CONTEXT REPRESENTATION leads to one-time-only design and also processing effort; it might be easier to stay with the native, protocol-level way of transmitting the context (such as HTTP headers). Protocol purists may perceive the introduction of custom headers in the payload as an antipattern that indicates a lack of understanding of the protocol and its capabilities. This discussion comes down to the relative priorities of conformance with technical recommendations versus control of one’s API destiny.

A potential downside of explicit CONTEXT REPRESENTATIONS is redundancy, for instance of status codes, in the protocol and the payload. One might have to deal with accidental or deliberate differences. For example, what should a Web client do if it receives a message with HTTP status “200 OK,” but a failure is indicated as part of the payload? What about the opposite case, HTTP indicating a failure but the payload stating that the request was processed correctly? Merely including header information such as an HTTP status code verbatim in the payload does not provide any abstraction of the underlying protocol. Additional effort is required to map this information to a platform-independent form that is meaningful on the application level. For instance, a “404” code will be understandable for all Web developers but does not mean anything to Jakarta Messaging (formerly called JMS) experts. A textual message “service endpoint unavailable,” however, makes sense both for HTTP resources and for message queue usage. Also note that the underlying transport protocol might rely on the presence of some headers. Including such header information in the payload and thus transporting it twice again leads to redundancy and increased message size. This may harm performance and can lead to inconsistencies. If possible, such duplication should be avoided.
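
A minimal sketch of such a mapping, and of a check for the inconsistencies mentioned above, might look as follows. The numeric application-level codes, the fallback message, and the function names are assumptions made for this illustration; only the "service endpoint unavailable" text is taken from the discussion.

```python
from typing import Optional

# Hypothetical application-level codes and messages, chosen for illustration.
ERROR_MAPPING = {
    401: (1, "caller could not be authenticated"),
    404: (2, "service endpoint unavailable"),
}
FALLBACK = (99, "request could not be processed")


def to_error_report(http_status: int) -> Optional[dict]:
    """Translate an HTTP status into a protocol-independent error report."""
    if 200 <= http_status < 300:
        return None  # success: no error report needed
    code, message = ERROR_MAPPING.get(http_status, FALLBACK)
    return {"errorCode": code, "errorMessage": message}


def is_consistent(http_status: int, error_report: Optional[dict]) -> bool:
    """True if protocol level and payload agree on success or failure."""
    protocol_ok = 200 <= http_status < 300
    payload_ok = error_report is None
    return protocol_ok == payload_ok


report = to_error_report(404)
contradiction = is_consistent(200, report)  # "200 OK" with a failure in the payload
```

A receiver could log or reject messages for which is_consistent returns False instead of silently trusting one of the two sources.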

Regarding programmer productivity, it is not clear whether programmers are more productive (in the short and the long term) when delegating context information to the protocol or when implementing a CONTEXT REPRESENTATION themselves. Most of the effort lies in gathering the required information and putting it somewhere (on the sender side), then locating and processing it (on the receiver side). Assuming that the protocol libraries provide a proper local API, the development effort can be expected not to differ much. Some protocols may not support all required QoS headers; in that case, developers have to implement these features in the API if they cannot select protocols that do.

Separation of concerns and cohesiveness (assembling all context information in one place) can be conflicting forces; the related design decisions should be driven by answers to the following questions: Who produces and who consumes the context information, and when does this happen? How often will the data definitions change over time? How large is the data? What are its protection needs?

Related Patterns

This pattern is often combined with other patterns; for instance, the data requests expressed in a WISH LIST can be part of CONTEXT REPRESENTATIONS (but do not necessarily have to be). Similarly, an ERROR REPORT can find its place in response message contexts. REQUEST BUNDLES might require two types of CONTEXT REPRESENTATION, one on the container level and one for each individual request or response element. For instance, both individual ERROR REPORTS and an aggregated bundle-level report might make sense when one or more individual responses in a REQUEST BUNDLE fail. A VERSION IDENTIFIER can be transported in the CONTEXT REPRESENTATION as well.
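
The two levels of context in a bundled response can be sketched as follows. The structure and all field names (bundleContext, failedRequestIds, requestId, and so on) are hypothetical; the sketch merely shows individual ERROR REPORTS being summarized in an aggregated, bundle-level report.

```python
def summarize_bundle(responses: list) -> dict:
    """Wrap individual responses and add a bundle-level error report."""
    failed = [r for r in responses if r.get("errorReport") is not None]
    bundle_report = None
    if failed:
        bundle_report = {
            "errorMessage": f"{len(failed)} of {len(responses)} bundled requests failed",
            "failedRequestIds": [r["requestId"] for r in failed],
        }
    return {
        "bundleContext": {"errorReport": bundle_report},  # container level
        "responses": responses,                           # element level
    }


bundle = summarize_bundle([
    {"requestId": "r1", "errorReport": None, "result": {"name": "Alice"}},
    {"requestId": "r2",
     "errorReport": {"errorCode": 2, "errorMessage": "customer not found"}},
])
```

Clients that only need to know whether everything succeeded can inspect the bundle-level context; others can drill down into the individual reports.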

While the “Front Door” pattern [Schumacher 2006] is applied frequently to introduce reverse proxies, API providers and clients might not want all headers to go through the security procedures provided by such proxies; CONTEXT REPRESENTATION can be applied in such cases. An “API Gateway” [Richardson 2016] or proxy could act as an intermediary and modify the original request(s) and response(s), but this makes the overall architecture more complex and more challenging to manage and evolve. While this approach might be convenient, it also means giving up some control and taking on an extra dependency.

A similar pattern appears in several other pattern languages. For instance, the “Context Object” [Alur 2013] solves the problem of protocol-independent storage of state and system information in a Java programming context (rather than in a remoting context). The “Invocation Context” pattern [Voelter 2004] describes a solution for bundling contextual information in an extensible invocation context of a distributed invocation. An Invocation Context is transferred between a client and remote object with every remote invocation. The “Envelope Wrapper” pattern [Hohpe 2003] solves a similar problem, making certain parts of a message visible to the messaging infrastructure responsible for a particular leg. Systems management patterns such as “Wire Tap” [Hohpe 2003] can be used to implement the required auditing and logging.

More Information

Chapter 3 in the RESTful Web Services Cookbook discusses an alternative approach based on entity headers (in the context of HTTP) in two of its recipes [Allamaraju 2010].

“On the Representation of Context” [Stalnaker 1996] gives an overview of context representation in linguistics.

The METADATA ELEMENT pattern provides more pointers to related patterns and other background information.

Summary

This chapter investigated the structure and meaning of representation elements in request and response messages. Element stereotypes distinguish data from metadata, identifiers, and links; some representation elements have special and common purposes.

We concentrated on the data contract as represented by DATA ELEMENTS. Most data that an API contract exposes comes from the API implementation (for example, instances of domain model entities). As data about data, METADATA ELEMENTS provide supplemental information such as origin traces, statistics, or usage hints. Another specialization of DATA ELEMENT is ID ELEMENT. ID ELEMENTS provide the glue code necessary to address, distinguish, and interconnect API parts (such as endpoints, operations, or representation elements). ID ELEMENTS do not contain network-accessible addresses and typically do not contain semantic type information; if this information is required, the LINK ELEMENT pattern is eligible. All types of DATA ELEMENTS might come as ATOMIC PARAMETERS but can also be grouped as ATOMIC PARAMETER LISTS or assembled within PARAMETER TREES. Read and write access to INFORMATION HOLDER RESOURCE endpoints naturally requires DATA ELEMENTS; the in and out parameters of PROCESSING RESOURCES do so as well. METADATA ELEMENTS might explain the semantics of these resources or ease their usage on the client side. All these structural considerations and DATA ELEMENT properties should be defined in the API contract and explained in the API DESCRIPTION.

We also covered three special-purpose representation elements. API KEYS can be used whenever clients must be identified, for example, to enforce a RATE LIMIT or PRICING PLAN (see Chapter 8, “Evolve APIs”). A CONTEXT REPRESENTATION contains and bundles multiple METADATA ELEMENTS and/or ID ELEMENTS for the particular purpose of sharing context information via the payload. An ERROR REPORT can find its place in a CONTEXT REPRESENTATION, for instance, when reporting errors caused by a REQUEST BUNDLE (because the required summary-details structure is difficult to model in protocol-level headers or status codes). The REQUEST BUNDLE pattern is covered in Chapter 7.

Many complements and alternatives to API KEYS exist, as security is a challenging, multifaceted topic. For instance, OAuth 2.0 [Hardt 2012] is an industry-standard protocol for authorization that is also the foundation for secure authentication through OpenID Connect [OpenID 2021]. For FRONTEND INTEGRATION, a common choice is the JSON Web Token (JWT), as defined by RFC 7519 [Jones 2015]. JWT defines a simple message format for access tokens. The access tokens are created and cryptographically signed by the API provider. Providers can verify the authenticity of such a token and use it to identify clients. Unlike API KEYS, JWTs may contain a payload, according to the specification. The provider can store additional information in this payload; the client can read it, but an attacker cannot change it without breaking the signature.

Another example of a full-fledged authentication or authorization protocol is Kerberos [Neuman 2005], which is often used inside a network to provide single sign-on (authentication). In combination with LDAP [Sermersheim 2006], it can also provide authorization. LDAP itself also offers authentication features, so LDAP can be used as an authentication and/or authorization protocol. Examples of point-to-point authentication protocols are CHAP [Simpson 1996] and EAP [Vollbrecht 2004]. SAML [OASIS 2005] is another alternative, which can, for instance, be used in BACKEND INTEGRATION to secure the communication between the APIs of backend systems. These alternatives offer better security but also come with a much higher implementation and runtime complexity.

Advanced API Security [Siriwardena 2014] provides a comprehensive discussion on securing APIs with OAuth 2.0, OpenID Connect, JWS, and JWE. Chapter 9 of Build APIs You Won’t Hate [Sturgeon 2016b] discusses conceptual and technology alternatives and provides instructions on how to implement an OAuth 2.0 server. The OpenID Connect [OpenID 2021] specification deals with user identification on top of the OAuth 2.0 protocol. Chapter 15 in Principles of Web API Design [Higginbotham 2021] discusses ways to protect APIs.

All patterns in this chapter work with any textual message exchange format and exchange pattern. Our examples use the request-response message exchange pattern due to its widespread usage; the patterns are written in such a way that they are also eligible when choosing another message exchange pattern. Although the patterns are particularly relevant when designing services-based systems, none of them assumes any particular integration style or technology.

Next up, in Chapter 7, is advanced message structure design, targeting ways to improve certain qualities.
