The problem is essentially the one discussed by science fiction writers: “how do you get communications started among totally uncorrelated ‘sapient’ beings?”
J.C.R. Licklider, 1966
A foundational element of any system-level design is a set of shared principles or understandings about how parts of the system interact. And that is the overarching problem we’ll address in this chapter — how to design systems where machines built by different people who have never met can successfully interact with each other.
For communications within the Web, HTTP is at the heart of this shared understanding. It is HTTP that sets the rules — and the expectations — for sending data between services. And, despite the fact that HTTP’s history dates back to the late 1980s, it is still ubiquitous and relevant after more than 30 years.
It is important to remember that HTTP is the “higher-level” protocol agreement between machines on the network. Lower-level protocols — like TCP/IP 1, 2 and UDP 3 — provide the backbone for moving bits around. One of the things I find most interesting about this “low-level” communications system is that it works because the communication components do not understand the meaning of the data they ship around. Meaning is separated from the message.
The notion that information can be dealt with independent of the messages used to share that information is a key understanding that makes machine-to-machine communication possible “at scale.” And dealing with information within its own level is also important. HTTP offers a great example of this through the use of media types. I can send you company sales data in an HTML message. I can send you the same information in a CSV message, or as a plain text message, and so forth. The data is independent of the media type.
You’ll find this notion of separation throughout the recipes in this chapter. While protocol and media type are well-established forms of separation, the recipes in this chapter also rely on an additional form — that of separating the vocabulary from the message format. Vocabulary is the set of rules we use to communicate something important. And communication works best when we both share the same vocabulary. For example, you can imagine two animated machines talking to each other:
“Hey, Machine-1, let’s talk about health care systems using the HL7 FHIR Release 4 vocabulary 4.”
“OK, Machine-2. But can we please use RDF to send messages instead of XML?”
“Sure, Machine-1. As long as we can use HTTP and not MQTT.”
“Agreed!”
Notably, the FHIR platform recognizes that FHIR information can be shared using XML, JSON, and RDF message formats — a clear nod to the importance of separating domain-specific information from the formats used to share that information.
Another important shared understanding for scalable systems is agreement on ‘how we get things done’. In this case, the way things ‘get done’ is through hypermedia. Hypermedia controls, like links and forms, are used to express the run-time metadata needed to complete a task. The controls themselves are structured objects that both services and clients understand ahead of time. However, the contents of that control — the values of the metadata properties — are supplied at run-time: things like the data properties needed for a particular action, the protocol method to use to execute the action, and the target URL for that action. These are all decided at run-time. The client doesn’t need to memorize them; it receives them in the message. Basically, the hypermedia controls are another level of separation. In fact, the JSON-LD 5 message format relies on a separate vocabulary (Hydra 6) to express the hypermedia information within a JSON-LD message. You’ll find a few recipes here that acknowledge the importance of hypermedia for RWMs. You’ll also find several specific hypermedia recipes in the following chapters covering specific implementation patterns.
Finally, you’ll find some recipes that are specific to successful communication over HTTP. These recipes cover things like network promises (safety and idempotence) and the realities of supporting machine-to-machine communications over unreliable connections at long distances. Lucky for us, HTTP is well-suited for resolving broken connections — it was designed and built during a time when most HTTP conversations were conducted over relatively slow voice telephone lines. However, for HTTP to really work well, both clients and servers need to agree on the proper use of message metadata (e.g. HTTP headers) along with some added support for how to react when network-level errors occur.
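The "network promise" idea above can be made concrete. HTTP's method definitions tell a client which failed requests it may safely re-send after a dropped connection. The method sets below come from the HTTP specification; the retry helper itself is an illustrative sketch, not a library API.

```python
# Sketch: deciding whether a failed HTTP request can be blindly
# re-sent, based on the method's protocol-level promise.

SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}               # no server state change
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}   # repeating has same effect

def can_auto_retry(method):
    """Safe and idempotent requests may be re-sent after a broken
    connection; unsafe ones (e.g. POST) need extra care."""
    return method.upper() in IDEMPOTENT_METHODS

assert can_auto_retry("GET")
assert can_auto_retry("DELETE")
assert not can_auto_retry("POST")
```

This is why agreeing on message metadata matters: a client that knows the method's promise can recover from a network failure without asking the server what happened.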
So, this design chapter focuses on recipes that acknowledge the power of separate layers of communication, the role of hypermedia in communicating how things get done, and some shared understanding on how to ensure reliable communications when errors occur.
But before we dig into the recipes, let’s explore the role of hypermedia, information architecture, and network communication as foundational elements in our design recipes.
There are three general ideas behind the design recipes in this chapter:
An agreed communication format to handle connections between networked machines
A model for interpreting data as information
A technique for telling machines, at run-time, just what actions are valid.
All the recipes in this book are devoted to the idea of making useful connections between application services running on networked machines. The most common way to do that today is through TCP/IP at the packet level and HTTP at the message level. There’s an interesting bit of history behind the way the United States Defense Department initially designed and funded the first machine-to-machine networks (ARPANET), which eventually became the Internet we use today. And it involves outer-space aliens. In the 1960s, as the U.S. was developing computer communications, it was the possibility of encountering aliens from outer space that drove some of the design choices for communicating between machines.
Along with agreed communication formats for inter-machine communications, the work of organizing and sharing data between machines is another theme in this chapter. To do this, we’ll dig a bit into information architecture (IA) and learn the value of ontologies, taxonomies, and choreography. The history of IA starts about the same time Roy Fielding was developing his REST software architecture style and was heavily influenced by the rise of Tim Berners-Lee’s World-Wide Web of HTTP and HTML. In this chapter, we’ll use IA as an organizing factor to guide how we describe service capabilities using a shared vocabulary pattern.
Finally, we’ll go directly to the heart of how machines built by different people who have never met each other can successfully interact in real-time on an open network — using “hypermedia as the engine of application state.” Using hypermedia to describe and enable executing actions on other machines in the network is what makes RESTful Web Microservices work. Reliable connections via HTTP and consistent modeling using vocabularies are the prerequisites for interaction and hypermedia is the technique that enables that interaction. The recipes in this chapter will identify ways to design hypermedia interactions while the following chapters will contain specifics on how to make those designs function consistently.
So, let’s see how the possibility of aliens from outer space, information architecture, and hypermedia converge to shape the design of RESTful Web Microservices.
In 1963, J.C.R. “Lick” Licklider, a little-known civilian working in the United States Defense Department, penned an inter-office memo to his colleagues working in what was then called the Advanced Research Projects Agency (ARPA). Within a few years, this group would be responsible for creating the ARPANET 7 — the forerunner of today’s Internet. However, at this early stage, Licklider’s audience was addressed as the “Members and Affiliates of the Intergalactic Network” 8. His memo was focused on how computing machines could be connected together — how they could communicate successfully with one another.
In the memo, Licklider calls out two general ways to ensure computers can work together. They could a) make sure all computers on the planet used the same languages and programming tools. Or, b) they could establish a shared network-level control language that allowed machines to use their own “preferred” local tooling and languages and then use another “shared” language when speaking on the network. The first option would make it easy for machines to connect, but difficult for them to specialize. The second option would allow computer designers to focus on optimizing local functionality but it would add complexity to the work of programming machines to connect with each other.
In the end (lucky for us!), Licklider and his team decided on the approach that favored both preferred local machine languages and a separate, shared network-level language. This may seem obvious to us today but it was not clear at the time. And it was not Licklider’s decision but his unique reasoning for it that stands out today — the possibility of encountering aliens from outer space. You see, while ARPA was working to bring the age of computing to life, another United States agency, NASA, was in a race with the Soviet Union to conquer outer space.
Here’s the quote from Licklider’s memo that brings the 1960s space race and the computing revolution together:
The problem is essentially the one discussed by science fiction writers: “how do you get communications started among totally uncorrelated ‘sapient’ beings?”
J.C.R. Licklider, 1966
Licklider was speculating on how our satellites (or our ground-based transmitters) might approach the problem of communicating with other intelligent beings from outer space. And, he reasoned, we’d accomplish that through a process of negotiated communications — passing control messages or “meta-messages” (messages about how we send messages) back and forth until we both understood the rules of the game. Ten years later, the TCP 9 and IP 10 protocols of the 1970s would mirror Licklider’s speculation and form the backbone of the Internet we enjoy today.
Today, here on earth, Licklider’s thought experiment on how to communicate with aliens is at the heart of making RESTful Web Microservices (RWM) a reality. As we work to design and implement services that communicate with each other on the web, we, too, need to adopt a “meta-message” approach. This is especially important when we consider that one of the aims of our work is to “get communications started among totally uncorrelated” services. In the spirit of our guiding principle (see Recipe 1.3), people should be able to confidently design and build services that will be able to talk to other services built by other people they have never met, whether the services were built yesterday, today, or in the future.
The recipes in this section are all aimed at making it possible to implement “Licklider-level” services. You’ll see this especially in the sections on design negotiation and design bending.
The 1990s was a heady time for proponents of the Internet. Tim Berners-Lee’s World Wide Web and HTTP/HTML (see “The Web of Tim Berners-Lee”) was up and running, Roy Fielding was defining his REST architecture style (see “Fielding’s REST”) and Richard Saul Wurman 14 was coining a new term: “Information Architect”. In his 1996 book “Information Architects” 15 Wurman offers this definition:
Information Architect: 1) the individual who organizes the patterns inherent in data, making the complex clear; 2) a person who creates the structure or map of information which allows others to find their personal paths to knowledge; 3) the emerging 21st century professional occupation addressing the needs of the age focused upon clarity, human understanding and the science of the organization of information.
Richard Saul Wurman, 1996
A physical architect by training, Wurman founded the Technology, Entertainment, and Design (TED) conferences in 1984. A prolific writer, he has penned almost 100 books on all sorts of topics including art, travel, and (important for our focus) information design. One of the people who picked up on Wurman’s notion of architecting information was library scientist Peter Morville. Considered one of the “founding fathers” of the information architecture movement, Morville has authored several books on the subject. His best known, first released in 1998, is titled simply “Information Architecture” 16 and is currently in its fourth edition.
Morville’s book focuses on how humans interact with information and how to design and build large-scale information systems to best support continued growth, management, and ease of use. He points out that a system with a good information architecture (IA) helps users of that system to understand 1) where they are, 2) what they’ve found, 3) what else is around them, and 4) what to expect. These are all properties we need for our RWM systems, too. What we’ll be doing here is using recipes that accomplish these same goals for machine-to-machine interactions.
One of the ways in which we’ll be organizing the information architecture of RWM implementations is through the use of a three-part modeling approach: 1) ontology, 2) taxonomy, and 3) choreography (see “The Power of Vocabularies”). Several recipes in this chapter are devoted to information architecture including Recipe 2.3, Recipe 2.4, and Recipe 2.5.
One of the reasons I started this collection of recipes with the topic of “design” is that the act of designing your information system establishes some rules from the very start. Just as the guiding principles (see Recipe 1.3) we discussed in the previous chapter establish a foundation for making decisions about information systems, design recipes make that foundation a reality. It is this first set of recipes that affects, and in many ways governs, all the recipes in the rest of the book.
In this way, setting out these first recipes is a kind of “a priori design” approach. One of the definitions of a priori from the Merriam-Webster dictionary is “formed or conceived beforehand” and that is what we are doing here. We are setting out elements of our systems beforehand. And there is an advantage to adopting this a priori design approach. It allows us to define stable elements of the system upon which we can build the services and implement their interaction.
Creating a design approach means we need a model that works for more than a single solution. For example, an approach that only works for content management systems (CMSs) but not for customer relationship management systems (CRMs) is not a very useful design approach. We intuitively know that these two very different solutions share quite a bit in common (both at the design and the technical solution level) but it often takes some work to tease out those similarities into a coherent set — a set of design principles.
This can be especially challenging when we want to create solutions that can change over time. Ones that remain stable while new features are added, new technology solutions are implemented, and additional resources like servers and client apps are created to interact with the system over time. What we need is a foundational design element that provides stability while supporting change.
In this set of recipes, that foundational element is the use of hypermedia, or links and forms, (see Recipe 1.2) as the device for enabling communications between services. Fielding called hypermedia “the engine of application state” 17. Hypermedia provides that meta-messaging Licklider identified (see “Licklider’s Aliens”). And it is the use of hypermedia that enables Kay’s “extreme late binding” (see “Alan Kay’s Extreme Late Binding”).
You’ll find most of the recipes in this book lean heavily on hypermedia in order to enable both connection and general communication between machines. In this chapter, Recipe 2.5, Recipe 2.1, and Recipe 2.6 are specifically aimed at establishing hypermedia as a key element in this a priori design. Another key element in supporting this design approach is the use of the three pillars of information architecture (see “Morville’s Information Architecture”).
So, how do we use design recipes to create hypermedia-powered shared “a priori” understanding based on three-part information architecture that lives up to Licklider’s vision? Let’s look at the recipes below for examples.
A key to establishing a stable, reliable system is to ensure long-term interoperability between services. That means services created years in the past are able to successfully exchange messages with services created years in the future.
How do you design services that have a high degree of interoperability well into the future?
The best way to ensure a high level of long term interoperability between services is to establish stable rules for exchanging information. On the web, the best way to do that is to select and document support for one or more open source media type formats for data exchange. A good source for long-term, stable media types is the Internet Assigned Numbers Authority a.k.a. IANA. Viable candidate media types for long-term support of RWMs are unstructured media types like XML and JSON as well as structured media types such as HTML, Collection+JSON, UBER, HAL, and Siren. See Appendix C for a list of viable media types at the time of this book’s release.
See Recipe 2.2 for more on the difference between structured and unstructured media types.
When you create services, you should document which registered media types (RMTs) your service can support. It is recommended that your service support more than one RMT and that you allow service consumers both to discover which RMTs your service supports and to indicate their preference when exchanging messages (see design negotiation).
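The "indicate their preference" step is usually done with HTTP's Accept header. The routine below is a minimal sketch of server-side media-type negotiation; the supported-type list and the tie-breaking rule (first match wins) are illustrative assumptions, and real services typically rely on a framework's negotiator, which also handles wildcards like `*/*`.

```python
# Sketch: pick the supported media type with the highest q-value
# from a client's Accept header.

def best_media_type(accept_header, supported):
    """Parse 'type;q=0.8, type2' pairs and return the best match,
    or None when nothing acceptable is supported."""
    prefs = []
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        mtype = fields[0].strip()
        q = 1.0  # default quality per the HTTP spec
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((mtype, q))
    prefs.sort(key=lambda p: p[1], reverse=True)
    for mtype, q in prefs:
        if q > 0 and mtype in supported:
            return mtype
    return None

# A client that prefers Collection+JSON but will take HTML:
accept = "application/vnd.collection+json, text/html;q=0.8"
assert best_media_type(accept, ["text/html"]) == "text/html"
```

A service that publishes its RMT list and honors this kind of negotiation lets old and new consumers each get a format they understand.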
When I release a service, I almost always add HTML as one of the designated message formats. It has been around for 30+ years, you can use any common browser as a client for the service (great for testing), and there are enough parsers and other HTML tooling to make it relatively easy for consumers of your service to be able to exchange data with you.
Additional recommendations for candidate media types are 1) they should support hypermedia within the message (see Recipe 2.5) and, 2) they should support custom extensions (e.g. your ability to safely add new features to the format without breaking any existing applications). This last point will come in handy when you are stuck supporting a format that has fallen out of favor and you need to make some modifications to extend the life of your service.
It is also possible to create your own media type format for your services. That can work if a) your universe of service API consumers is relatively limited (e.g. within your own company), b) your universe of service API consumers is crazy large (e.g. Google, Facebook, Amazon, etc.), or c) your services are the leader in a vertical (e.g. document management, financial services, health care, etc.). If your efforts do not fall into one of these categories, I recommend you DO NOT author your own media type.
An important foundational element in the design of RWM is supporting future compatibility. This means making it possible for services written in the past to continue to interact with services well into the future. It also means designing interactions so that it is unlikely that future changes to already-deployed services will break other existing services or clients.
How do you design machine-to-machine interactions that can support modifications to in-production services that do not break existing service consumers?
To support non-breaking changes to existing services, you should use structured media types (SMTs) to pass information back and forth. SMTs make it possible to emit a stable, non-breaking message even when the contents of that message (e.g. the properties of a record, a list of actions, etc.) have changed. The key is to design interactions where the message shared between machines maintains the same structure even when the data conveyed by that message changes.
A structured media type (SMT) provides a strongly-typed format that does not change based on the data being expressed. This is in opposition to unstructured message formats like XML and JSON. See the example for details.
It is a good idea to use well-known media types in your interactions. Media types registered through the Internet Assigned Numbers Authority (IANA) are good candidates. For example, HTML is a good example of an SMT. Other viable general use SMT formats are Collection+JSON and UBER. See Appendix C for a longer list of media types to consider.
Here’s an example of a simple message expressed as HTML:
<ul name="Person">
  <li name="givenName">Marti</li>
  <li name="familyName">Contardi</li>
</ul>

...

<ul name="Person">
  <li name="givenName">Marti</li>
  <li name="familyName">Contardi</li>
  <li name="emailAddress">[email protected]</li>
</ul>
The structure of the above message can be expressed (in an overly simplified way) as “one or more li elements enclosed by a single ul element.” Note that adding more content (for example, an email address) does not change the structure of the message (you could use the same message validator), just the contents.
Contrast the above example with the following JSON objects:
{
  "Person": {
    "givenName": "Marti",
    "familyName": "Contardi"
  }
}
...

{
  "Person": {
    "givenName": "Marti",
    "familyName": "Contardi",
    "emailAddress": "[email protected]"
  }
}
The JSON structure can be expressed as “a JSON object with two keys (givenName and familyName).” Adding an email address to the message would result in a change in the structure of the JSON message (e.g. it requires a new JSON Schema document) as well as a change in content.
The important element in this recipe is to keep the content of the message loosely coupled from the structure. That allows message consumer applications to more easily and consistently validate incoming messages — even when the contents of those messages change over time.
Another way to think about this solution is to assume that the first task of message consumer applications is to make sure the message is well-formed — that it complies with the basic rules of how a message must be constructed. It is, however, a different task to make sure the message is valid — that the message content follows established rules for what information needs to appear within the messages (e.g. “all Person messages MUST include a givenName, familyName, and emailAddress property”).
To ensure future compatibility, the first step is to make sure messages can remain well-formed even when the rules for what constitutes a valid message change over time.
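The well-formed vs. valid distinction can be sketched as two separate checks. The required-property list here is an illustrative assumption for a "Person" message; the function names are hypothetical.

```python
# Sketch: well-formedness (structure) vs validity (required content)
# as two independent checks on a JSON message.
import json

def is_well_formed(raw):
    """Structural check only: is this parseable JSON at all?"""
    try:
        json.loads(raw)
        return True
    except ValueError:
        return False

def is_valid_person(raw, required=("givenName", "familyName")):
    """Content check: does the Person object carry the required fields?"""
    if not is_well_formed(raw):
        return False
    person = json.loads(raw).get("Person", {})
    return all(field in person for field in required)

msg = '{"Person": {"givenName": "Marti", "familyName": "Contardi"}}'
assert is_well_formed(msg) and is_valid_person(msg)
# Still well-formed, but no longer valid under a stricter rule set:
assert not is_valid_person(msg, required=("givenName", "emailAddress"))
```

Notice that tightening the validity rules never touched the well-formedness check; old consumers can still parse every message even when they cannot fully validate it.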
Just like you need to support registered, structured media types in order to ensure the long-term compatibility and reliability of your service, you also need to support stable, consistent domain specifics in order to ensure long-term viability for both service producers and consumers.
How can you make sure your service’s data property names will be understood by other services — even services that you did not create?
To ensure the data properties your service uses to exchange information are well understood by a wide range of other services (even ones you did not create), you should employ well-documented, widely-known data property names as part of your external interface (API). These names should be ones already defined in published data vocabularies or, if they are not already published, you should publish your data vocabulary so that others creating similar services can find and use the same terms.
A good source of general-use published vocabularies for the web is the Schema.org web site. It has hundreds of well-defined terms that apply to a wide set of use cases and it continues to grow in a well-governed way. There are other well-governed vocabulary sources like Microformats.org and Dublin Core.
A word of caution. Some vocabularies, especially industry-specific ones, are not fully open source (e.g. you must pay for access and participation). I will also point out that some vocabulary initiatives, even some open source ones, aim for more than a simple shared vocabulary. They include architectural, platform, and even SDK recommendations and/or constraints.
Your best bet for creating long-term support for interoperability is to make sure any vocabulary terms you use are disconnected from any other software or hardware dependencies.
As of this writing there are some industry-specific public vocabularies to consider, too. Some examples are: PSD2 for payments, FHIR for health care, and ACORD for insurance.
When you release your service, you should also release a document that lists all the vocabulary terms consumed or emitted by your service along with links to proper definitions. See Recipe 2.4 for more on how to properly document and publish your vocabulary documents.
Services with a well-managed vocabulary are more likely to be understood by others and are, therefore, more likely to gain wider adoption. When you have the option, you should use well-known terms in your service’s external API even when your own internal data storage system uses local or company-specific terms.
For example, here’s a Person record that reflects the terms used within a typical U.S. company:
{
  "collection": {
    "links": [
      {
        "rel": "self",
        "href": "http://api.example.org/persons"
      }
    ],
    "items": [
      {
        "href": "http://api.example.org/persons/q1w2e3r4",
        "data": [
          {"name": "fname", "value": "Dana"},
          {"name": "lname", "value": "Doe"},
          {"name": "ph", "value": "123-456-7890"}
        ]
      }
    ]
  }
}
And here’s the same record that has been updated to reflect terms available in the schema.org vocabulary:
{
  "collection": {
    "links": [
      {
        "rel": "self",
        "href": "http://api.example.org/persons"
      },
      {
        "rel": "profile",
        "href": "http://profiles.example.org/persons"
      }
    ],
    "items": [
      {
        "href": "http://api.example.org/persons/q1w2e3r4",
        "data": [
          {"name": "givenName", "value": "Dana"},
          {"name": "familyName", "value": "Doe"},
          {"name": "telephone", "value": "123-456-7890"}
        ]
      }
    ]
  }
}
Note the use of the profile link relation in the second example. See Recipe 2.4 for details.
This recipe is based on the idea of making sure the data property terms are not tightly coupled to the message format. It is, essentially, the flip side of Recipe 2.2. Committing to well-known, loosely-coupled vocabularies is also an excellent way to protect your service from changes to any internal data models over time.
See Chapter 5 for more recipes on handling data for RWMs.
Constraining your external interfaces to use only well-known, well-documented property names is one of the best things you can do to ensure the interoperability of your service. This, along with publishing semantic profiles (see Recipe 2.4) makes up the backbone of large-scale information system management. However, this work is not at all easy. It requires attention to detail, careful documentation, and persistent support. The team that manages and enforces a community’s vocabulary is doing important and valuable work.
One of the challenges of this recipe is that it is quite likely that the public vocabulary for your services is not the same as the private vocabulary. That private set of names is usually tied to ages-old internal practices, possibly a single team’s own design aesthetics, or even decisions dictated by commercial off the shelf (COTS) software purchased a long time ago. To solve this problem, services need to implement what is called an “anti-corruption layer” 18. There is no need to modify the existing data storage model or services.
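At its simplest, an anti-corruption layer is a translation step at the API boundary. The sketch below renames the internal properties from the earlier example (fname, lname, ph) to their Schema.org equivalents; the mapping table and function name are illustrative, and a real layer would also translate in the other direction for incoming messages.

```python
# Sketch: translate internal, company-specific property names to
# shared vocabulary terms at the service boundary.

INTERNAL_TO_PUBLIC = {
    "fname": "givenName",   # Schema.org term
    "lname": "familyName",  # Schema.org term
    "ph": "telephone",      # Schema.org term
}

def to_public(record):
    """Rename internal keys to vocabulary terms; keys without a
    mapping pass through unchanged."""
    return {INTERNAL_TO_PUBLIC.get(k, k): v for k, v in record.items()}

internal = {"fname": "Dana", "lname": "Doe", "ph": "123-456-7890"}
assert to_public(internal) == {
    "givenName": "Dana", "familyName": "Doe", "telephone": "123-456-7890"}
```

Because the translation lives at the boundary, the internal data model and the COTS-era names behind it never need to change.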
It is important to document and share your service vocabulary. A good practice is to publish a list of all the “magic strings” (see “Richardson’s Magic Strings”) in a single place along with a short description and, whenever possible, a reference (in the form of a URL) to the source of the description. I use the Application-Level Profile Semantics (ALPS) format for this (see Recipe 2.4).
Below is an example ALPS vocabulary document:
{
  "alps": {
    "descriptor": [
      {
        "id": "givenName",
        "def": "https://schema.org/givenName",
        "title": "Given name. In the U.S., the first name of a Person.",
        "tag": "ontology"
      },
      {
        "id": "familyName",
        "def": "https://schema.org/familyName",
        "title": "Family name. In the U.S., the last name of a Person.",
        "tag": "ontology"
      },
      {
        "id": "telephone",
        "def": "https://schema.org/telephone",
        "title": "The telephone number.",
        "tag": "ontology"
      },
      {
        "id": "country",
        "def": "http://microformats.org/wiki/hcard#country-name",
        "title": "Name of the country associated with this person.",
        "tag": "ontology"
      }
    ]
  }
}
When creating RWM vocabularies, it is a good practice to identify a single preferred source for definitions of terms along with one or more “backup” sources. For example, your vocabulary governance document guidance could read like the following:
“Whenever possible, use terms from Schema.org first. If no acceptable terms can be found, look next to Microformats.org and then to Dublin Core for possible terms. Finally, if you cannot find acceptable terms in any of those locations, create a new term in the company-wide vocabulary repository at [some-url].”
Sample Vocabulary Author Guidance
Some additional notes:
It is perfectly acceptable to mix vocabulary references in a single set of terms. You can see in the example above that I included three terms from Schema.org and one term from Microformats.org.
Just as in real life, there can be multiple terms that mean the same thing. For example, tel (from Microformats) and telephone (from Schema.org). Whenever possible, limit the use of synonyms by adopting a single term for all external uses.
See Recipe 2.4 on how to publish this document and retrieve it on the web.
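The lookup order in the sample guidance above can be sketched as a simple term resolver. The source tables here are tiny illustrative stand-ins for the real registries, and the company-repository URL is a hypothetical placeholder.

```python
# Sketch: resolve a vocabulary term by trying each preferred source
# in priority order, falling back to a company-wide repository.

SOURCES = [
    ("schema.org", {"givenName": "https://schema.org/givenName",
                    "telephone": "https://schema.org/telephone"}),
    ("microformats", {"country-name":
                      "http://microformats.org/wiki/hcard#country-name"}),
    ("dublin-core", {"creator": "http://purl.org/dc/terms/creator"}),
]

def resolve_term(term):
    """Return (source name, definition URL), preferring earlier
    sources; unknown terms go to the company repository."""
    for source, terms in SOURCES:
        if term in terms:
            return source, terms[term]
    return "company-repo", "https://vocab.example.org/" + term

assert resolve_term("givenName")[0] == "schema.org"
assert resolve_term("country-name")[0] == "microformats"
assert resolve_term("internalCode")[0] == "company-repo"
```

Encoding the governance rule this way makes the preference order auditable and keeps vocabulary authors from accidentally minting a new term when a well-known one already exists.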
Along with decoupling your well-defined data property vocabulary from your message structures, it is important to also put effort into defining how the data properties get passed back and forth. This is the work of “describing the problem space.” Problem spaces are used in games and interactive artwork as a way to place “guide-rails” on a set of related activities. Problem spaces are the “rules of the game,” so to speak. RWMs rely on “rules of the game” as well.
How can you provide a detailed description of all the possible data properties, objects, and actions supported by your services in a way that is usable both at design time and at run-time?
To make sure it is possible for developers to quickly and accurately understand the data exchanges and actions supported by your service, you can publish a semantic profile document (SPD). SPDs contain a complete list of all the data properties, objects, and actions a service supports. Semantic profiles, however, are NOT API definition documents like OpenAPI, WSDL, AsyncAPI, and others. See Appendix C for a longer list of API definition formats.
SPDs typically include all three elements of Information Architecture: 1) ontology, 2) taxonomy, and 3) choreography.
See “The Power of Vocabularies” for a discussion of the three pillars of Information Architecture.
Two common SPD formats are Dublin Core Application Profiles (DCAP) and Application-Level Profile Semantics (ALPS). Since I am a co-author of the ALPS specification, all the examples I’ll show here use the ALPS format. See Appendix C for a longer list.
Below is an example of a valid ALPS semantic profile document. Notice that all the elements of Morville’s information architecture (ontology, taxonomy, & choreography) are represented in this example.
{
  "$schema": "https://alps-io.github.io/schemas/alps.json",
  "alps": {
    "title": "Person Semantic Profile Document",
    "doc": {
      "value": "Simple SPD example for http://b.mamund.com/rwmbook[RWMBook]."
    },
    "descriptor": [
      {"id": "href", "def": "https://schema.org/url", "tag": "ontology"},
      {"id": "identifier", "def": "https://schema.org/identifier", "tag": "ontology"},
      {"id": "givenName", "def": "https://schema.org/givenName", "tag": "ontology"},
      {"id": "familyName", "def": "https://schema.org/familyName", "tag": "ontology"},
      {"id": "telephone", "def": "https://schema.org/telephone", "tag": "ontology"},

      {"id": "Person", "tag": "taxonomy",
        "descriptor": [
          {"href": "#href"},
          {"href": "#identifier"},
          {"href": "#givenName"},
          {"href": "#familyName"},
          {"href": "#telephone"}
        ]
      },
      {"id": "Home", "tag": "taxonomy",
        "descriptor": [
          {"href": "#goList"},
          {"href": "#goHome"}
        ]
      },
      {"id": "List", "tag": "taxonomy",
        "descriptor": [
          {"href": "#Person"},
          {"href": "#goFilter"},
          {"href": "#goItem"},
          {"href": "#doCreate"},
          {"href": "#goList"},
          {"href": "#goHome"}
        ]
      },
      {"id": "Item", "tag": "taxonomy",
        "descriptor": [
          {"href": "#Person"},
          {"href": "#goFilter"},
          {"href": "#goItem"},
          {"href": "#doUpdate"},
          {"href": "#doRemove"},
          {"href": "#goList"},
          {"href": "#goHome"}
        ]
      },

      {"id": "goHome", "type": "safe", "tag": "choreography", "rt": "#Home"},
      {"id": "goList", "type": "safe", "tag": "choreography", "rt": "#List"},
      {"id": "goFilter", "type": "safe", "tag": "choreography", "rt": "#List"},
      {"id": "goItem", "type": "safe", "tag": "choreography", "rt": "#Item"},
      {"id": "doCreate", "type": "unsafe", "tag": "choreography", "rt": "#Item"},
      {"id": "doUpdate", "type": "idempotent", "tag": "choreography", "rt": "#Item"},
      {"id": "doRemove", "type": "idempotent", "tag": "choreography", "rt": "#Item"}
    ]
  }
}
Here’s a workflow diagram (the choreography) for the person-alps.json file shown above.
You can find the complete ALPS rendering (ALPS file, diagram, and documentation) of this semantic profile in the book’s associated GitHub repository (TK).
Adopting semantic profile documents (SPDs) is an important part of creating usable RWMs. However, this approach is relatively new and not yet widely used. Even though ALPS and DCAP have been around for about ten years, there are not a lot of tools and only limited written guidance on semantic profiles. Still, my own experience with ALPS leads me to think you’ll be seeing more of semantic profiles (even if in some future form other than ALPS and DCAP).
An SPD is a kind of machine-readable interface documentation. However, semantic profiles are not the same thing as API definition files like WSDL, OpenAPI, AsyncAPI and others. SPDs are designed to communicate general elements of the interface (base-level properties, aggregate objects, and actions). SPDs do not contain implementation-level details such as MQTT topics, HTTP resources, protocol methods, return codes, etc. These details are left to those tasked with writing the actual code that runs behind the interface described by semantic profiles.
Check out the Appendix D for example tools and services you can use to support semantic profiles.
Some other considerations when using semantic profiles:
The value of a single semantic profile increases with the number of services using that profile. That means you should create profiles that are easily used by others (see Appendix A). It’s a good practice to keep semantic profiles more general than specific. The more specific you make your profile, the smaller the audience for that profile.
Semantic profiles are descriptions, not definitions. Many of the details you need to include in an interface definition do not belong in a semantic profile. For example, don’t include URLs or URL patterns in semantic profiles. Instead, put them in the API definition file (e.g. OpenAPI).
It is a good practice to tag your semantic profile elements to indicate which ones are describing the ontology, which are taxonomy, and which are choreography (see “Example”).
It is a good idea to return the URI of your semantic profile with each protocol response. See Chapter 3 and Chapter 4 for details on how to do this.
Since clients and/or services may create a logical dependency on your published semantic profile, it is a good practice to NOT make any breaking changes to the SPD once it is released. If you need to make changes, it is better to create a new profile document at a new URI (e.g. api.example.org/profiles/personV2) and to leave the existing profile (the one without the changes) online, too.
To make it easy for people (and machines) to find your semantic profiles, create a central online location where they can be discovered. This might be a set of actual profile documents, or it might be a page with pointers (URLs) to other places where each profile document is kept. The second option is good for cases where you are not the profile author but still want some control over which profiles are commonly used.
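One way a client can make use of a published profile is to discover its URI in an HTTP Link header. Below is a minimal sketch of such a lookup; the helper name and sample header value are my own, not part of any standard library:

```javascript
// Extract the URI tagged with rel="profile" from an HTTP Link header value.
// A simplified sketch: real Link headers can carry additional parameters.
function profileUri(linkHeader) {
  for (const part of linkHeader.split(",")) {
    const match = part.trim().match(/^<([^>]+)>\s*;\s*rel="?profile"?/);
    if (match) {
      return match[1];
    }
  }
  return null; // no profile link advertised
}

const header =
  '<http://api.example.org/profiles/person>; rel="profile", ' +
  '<http://api.example.org/person/>; rel="collection"';
console.log(profileUri(header)); // → http://api.example.org/profiles/person
```

A client could fetch this URI once at startup and cache the profile for the life of the session.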
See Chapter 2, Media Types
See Chapter 2, Structured Types
See Chapter 2, Hypermedia
See Chapter 3, Profiles
See Chapter 3, Hypermedia
See Chapter 3, Representors
TK Chapter 4 recipes
TK Chapter 7 recipes
Using embedded hypermedia within responses to inform API consumer applications which actions are currently possible is a foundational element of RWMs. As mentioned in “Alan Kay’s Extreme Late Binding”, hypermedia controls make it possible to implement “extreme late binding” — a kind of super-loose coupling that means you can more easily modify services in the future, too.
How can you extend the lifetime of services by building systems that support safe, non-breaking workflow changes as well as support short-term modifiability of existing services by customizing data exchanges at run-time?
The best way to support both short- and long-term modifiability for product services is to rely on inline hypermedia affordances to express the details of context-dependent data exchanges at run-time. That means you need to adopt message exchange formats that support embedded hypermedia (see Recipe 2.1 and Recipe 2.2).
HTML offers a great example of embedded hypermedia controls to express data exchanges at run-time. Here’s an example:
<html>
  <head>
    <title>Create Person</title>
    <link rel="profile" href="http://api.example.org/profiles/person" />
    <style>
      input {display:block;}
    </style>
  </head>
  <body>
    <h1>Create Person</h1>
    <form name="doCreate" action="http://api.example.org/person/"
      method="post" enctype="application/x-www-form-urlencoded">
      <fieldset>
        <input type="hidden" name="identifier" value="q1w2e3r4" />
        <input name="givenName" placeholder="givenName" required />
        <input name="familyName" placeholder="familyName" required />
        <input name="telephone" placeholder="telephone" pattern="[0-9]{10}" />
        <input type="submit" />
        <input type="reset" />
        <input type="button" value="Cancel" />
      </fieldset>
    </form>
  </body>
</html>
The form above indicates one default input (identifier) and three additional inputs, two of which are required (givenName, familyName) and one of which must pass the pattern validator (telephone). The form also indicates three possible actions (submit, reset, and cancel) along with the URL, HTTP method, and body-encoding metadata for the submit action. Even better, any HTML-compliant client (e.g. a web browser) can support all of these actions without the need for any custom programming code.
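To make the submit metadata concrete, here is a rough sketch of how a generic client could turn the form’s attributes into an HTTP request. The form object below is hand-copied from the markup above (a real client would parse the HTML), and the helper name is mine:

```javascript
// Build an HTTP request purely from the form's hypermedia metadata.
// The form object mirrors the <form> attributes in the HTML example.
const form = {
  name: "doCreate",
  action: "http://api.example.org/person/",
  method: "post",
  enctype: "application/x-www-form-urlencoded"
};

function buildRequest(form, values) {
  return {
    url: form.action,
    method: form.method.toUpperCase(),
    headers: { "content-type": form.enctype },
    body: new URLSearchParams(values).toString() // urlencoded body
  };
}

const req = buildRequest(form, { identifier: "q1w2e3r4", givenName: "Mace" });
console.log(req.method); // → POST
console.log(req.body);   // → identifier=q1w2e3r4&givenName=Mace
```

Because the URL, method, and encoding all come from the message, the service can change any of them without breaking this client.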
Here’s a simple example in Collection+JSON:

{
  "collection": {
    "version": "1.0",
    "href": "http://api.example.org/person/",
    "links": [
      {"rel": "self", "href": "http://api.example.org/person/doCreate"},
      {"rel": "reset", "href": "http://api.example.org/person/doCreate?reset"},
      {"rel": "cancel", "href": "http://api.example.org/person"}
    ],
    "template": {
      "data": [
        {"name": "identifier", "value": "q1w2e3r4"},
        {"name": "givenName", "value": "", "required": true},
        {"name": "familyName", "value": "", "required": true},
        {"name": "telephone", "value": "", "regex": "[0-9]{10}"}
      ]
    }
  }
}
Again, a Cj-compliant client application would be able to enforce all the input rules described in the above hypermedia response without the need for any added programming.
And, in both cases, the meaning of the values of the input elements and metadata properties does not need to be understood by the API consumer — they just need to be enforced. That means the number of inputs can change over time, the destination URL of the HTTP POST can change, and even some of the rules can change, and the API consumer application can still reliably enforce the constraints. For example, the validation rule for the telephone value can change (e.g. it might be context-dependent, based on where the application is running).
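Here, as a sketch, is the kind of generic rule enforcement a Cj-style client might apply to the template above. The function name is mine; assume the template object has been parsed from the response body:

```javascript
// Enforce Collection+JSON template rules (required, regex) generically.
// The template object mirrors the Cj response shown earlier.
const template = {
  data: [
    { name: "identifier", value: "q1w2e3r4" },
    { name: "givenName", value: "", required: true },
    { name: "familyName", value: "", required: true },
    { name: "telephone", value: "", regex: "[0-9]{10}" }
  ]
};

function checkTemplate(template, inputs) {
  const errors = [];
  for (const item of template.data) {
    const value = inputs[item.name] ?? item.value ?? "";
    if (item.required && value === "") {
      errors.push(item.name + " is required");
    }
    if (item.regex && value !== "" && !new RegExp("^" + item.regex + "$").test(value)) {
      errors.push(item.name + " fails regex " + item.regex);
    }
  }
  return errors;
}

console.log(checkTemplate(template, { givenName: "Mace", telephone: "abc" }));
// → [ 'familyName is required', 'telephone fails regex [0-9]{10}' ]
```

Note that the client never needs to know what “telephone” means; it only applies whatever rules arrive in the message.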
Adopting embedded hypermedia messages is probably the most important step toward creating RESTful Web Microservices. There are a number of acceptable structured media types to choose from (see Appendix C), and supporting them with a parsing library is a one-time expense that pays off every time you use the library.
For an in-depth look at hypermedia client applications, see my book “RESTful Web Clients” (2017).
While embedded hypermedia is valuable, using it does come at some cost. First, it is a design-time constraint: both API consumers and producers need to agree to use it. This is “the price of entry” when creating RWMs. While this is no different from “you must use HTML, CSS, and JavaScript in responses to web browsers,” I still find many architects and developers who chafe at the idea of using hypermedia-rich responses. When you decide to build RWMs, you may need to help some people past this hurdle.
Selecting “the right” structured hypermedia media type can easily turn into a battle. There are multiple formats to pick from, and some people fall into the trap of “you can only pick one of them.” In truth, you can use features designed into HTTP (see design negotiation) to support multiple response formats and select the best option at run-time. This is thirty-year-old tech, so there are lots of examples and supporting code bits to help you handle it. You can even decide on a single format to start (usually HTML) and then add more types over time without breaking any existing applications.
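As a rough sketch, that run-time selection can be as simple as matching the request’s Accept header against the formats a service supports. The helper below is illustrative only; it ignores quality values and wildcard subtypes, which a production implementation would honor:

```javascript
// Pick the first media type from an Accept header that the service supports.
// Simplified sketch: no q-value ranking, no type/* wildcards.
const supported = ["text/html", "application/vnd.collection+json"];

function selectFormat(acceptHeader, supported) {
  const requested = acceptHeader.split(",").map(s => s.split(";")[0].trim());
  for (const type of requested) {
    if (type === "*/*") return supported[0]; // client takes anything
    if (supported.includes(type)) return type;
  }
  return supported[0]; // fall back to the service's default format
}

console.log(selectFormat("application/vnd.collection+json, text/html", supported));
// → application/vnd.collection+json
```

Starting with a single default format and adding entries to the supported list later is exactly the non-breaking growth path described above.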
All hypermedia formats are not equal. For example, the Hypertext Application Language (HAL) format standardizes link elements but does not standardize forms or data bodies. The Structured Interface for Representing Entities (SIREN) format standardizes links and forms, but not data bodies. HTML, Collection+JSON, and Universal Basis for Exchanging Representations (UBER) standardize all three message elements. I’ve created some “hypermedia polyfill” formats to supplement SIREN (Profile Object Description) and HAL (HAL-FORMS) to help bridge the gaps, but these additions can complicate implementations. See Appendix C for more on these and other media types.
Expressing domain state using hypermedia types can also be a challenge at times. Developers need to be able to convert internal object and model rules into valid hypermedia controls (forms, inputs, links, etc.). It is a kind of translation skill that must be acquired. There are some libraries that make this easier (see Appendix D), but you may also need to roll your own solutions.
The work of emitting and consuming hypermedia formats at run-time is a common task, too. See design negotiation for details on how to use HTTP to help with that. Also, writing API consumers that can navigate hypermedia responses takes some skill, too. Many of the recipes in Chapter 3 are devoted to this work.
Other things to consider when using embedded hypermedia formats:
You can modify the hypermedia details in a response based on the user context and/or server state. For example, a form might have five fields when an authenticated administrator logs in but only have three input fields for anonymous users.
When you use hypermedia controls to describe potential actions, you can engage in Alan Kay’s “extreme late binding.” Hypermedia clients are designed to look for identifiers in the message (“where is the update-customer control?”) and to understand and act upon the metadata found in that hypermedia control. All of this can happen at run-time, not design- or build-time. That means the metadata details of an action can be changed while the service is in production. For example, you can change the target URL of an action from a local endpoint within the current service to an external endpoint on another machine — all without breaking the client application.
Changes in multi-step processes or workflows can also be enabled with hypermedia. Service responses can return one or more steps to complete and expect the client application to act on each one. There might be three steps in the initial release (create account, add associated company, add a new contact). Later, these three steps might be consolidated into two (create account, add associated company and contact). As long as the client service consumer is following along with the supplied hypermedia instructions (instead of hard-wiring the steps into code), changes like this will not break the consumer application. See Chapter 7 for recipes on implementing hypermedia workflows.
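The idea can be sketched as a client that simply walks whatever list of steps the service returns, rather than hard-wiring a fixed sequence. The step identifiers and handler map below are illustrative only, not part of any real workflow format:

```javascript
// A client that acts on server-supplied workflow steps at run-time.
// Handlers are keyed by the step identifiers found in the response.
const handlers = {
  createAccount: () => "account created",
  addCompany: () => "company added",
  addContact: () => "contact added"
};

function runWorkflow(steps) {
  // Execute each step in the order the service supplied it.
  return steps.map(step => handlers[step]());
}

// Initial release: the service returns three steps.
console.log(runWorkflow(["createAccount", "addCompany", "addContact"]));
// A later release can return fewer (or reordered) steps without a client change.
console.log(runWorkflow(["createAccount", "addCompany"]));
```

The client code never changes when the service reorders or trims the step list; only brand-new step identifiers would require a client update.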
See Chapter 2, Media Types
See Chapter 2, Structured Types
See Chapter 3, Media Types
See Chapter 3, Hypermedia
See Chapter 3, Clients Qualities
TK Chapter 4 recipes?
TK Chapter 7 recipes?
A particularly gnarly problem when working with external services in a machine-to-machine situation over HTTP is the “failed POST” challenge. Consider the case where a client application issues an HTTP POST to an account service to deduct 50 credits from an account and never gets a reply to that request. Did the request never arrive at the server? Did it arrive and get processed properly, with the 200 OK response never reaching the client? The real question: should the client issue the request again?
There is a simple solution to the problem and it requires using HTTP PUT, not POST.
How do you design write actions over HTTP that remove the possibility of “double-posting” the data? How can you know whether it is safe to re-send a data-write HTTP request if the client app never gets an HTTP response the first time?
All data-write actions should be sent using HTTP PUT, not HTTP POST. HTTP PUT actions can be easily engineered to prevent “double-posting” and ensure it is safe to retry the action when the server response never reaches the client application.
Influenced by the database pattern of CRUD (create, read, update, delete), for many years the “create” action has been mapped to HTTP POST and the “update” action has been mapped to HTTP PUT. However, writing data to a server with HTTP POST is a challenge since the action is not consistently repeatable. To say it another way, the POST option is not idempotent. By design, however, the HTTP PUT operation is idempotent — it assures the same results even when you repeat the action multiple times.
Below is a simple example showing the additive nature of non-idempotent writes and the replacement nature of idempotent writes:
/**********************************
  illustrating idempotency
  RWMBook 2021
***********************************/

// non-idempotent
console.log("non-idempotent");
var t = 10;
for (var x = 0; x < 2; x++) {
  t = t + 1; // add
  console.log(`try ${x} : value ${t}`);
}

// idempotent
console.log("idempotent");
var t = 10;
for (var x = 0; x < 2; x++) {
  t = 11; // replace
  console.log(`try ${x} : value ${t}`);
}
The output should look like this:
mca@mamund-ws: node idempotent-example.js
non-idempotent
try 0 : value 11
try 1 : value 12
idempotent
try 0 : value 11
try 1 : value 11
Imagine the variable t in the above example is any message body representing an HTTP resource. HTTP PUT uses the message body to replace any existing server resource; HTTP POST uses the message body to add a new server resource.
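Modeled as a simple in-memory resource store, the difference looks like this. This is a sketch, not real HTTP; the key scheme and counter are my own:

```javascript
// PUT replaces the resource at a known key: repeating it is harmless.
// POST adds a new resource each time: repeating it duplicates data.
const store = new Map();
let nextId = 1;

function put(id, body) { store.set(id, body); }          // idempotent
function post(body) { store.set("p" + nextId++, body); } // not idempotent

put("q1w2e3", { givenName: "Mace" });
put("q1w2e3", { givenName: "Mace" }); // retry: still one resource
post({ givenName: "Mace" });
post({ givenName: "Mace" });          // retry: now two resources
console.log(store.size); // → 3
```

A retried PUT leaves the store unchanged; a retried POST silently doubles the write, which is exactly the “failed POST” hazard.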
All this is fine until you issue a write operation (HTTP POST or PUT) and never get a server response. You can’t be sure of what happened. Did the request never reach the server? Did the server do the update and fail to send a response? Did something else happen (e.g. an error that resulted in a partial write)?
To avoid this conundrum, the best solution is to always use HTTP PUT to write data to another machine on the network.
**** REQUEST
PUT /person/q1w2e3 HTTP/2.0
Host: api.example.org
Content-Type: application/x-www-form-urlencoded
If-None-Match: *

givenName=Mace&familyName=Morris

**** RESPONSE
HTTP/2.0 201 CREATED
Content-Type: application/vnd.collection+json
ETag: "p0o9i8u7y6t5r4e3w2q1"
...
Note that, while not common, it is proper HTTP for a PUT request to result in a 201 CREATED response — if there is no resource at that address. But how do you tell the service that you expect to create a new resource instead of updating an existing one? For that, you need to use the If-None-Match header. In the example above, the If-None-Match: * header says “create a new resource at the supplied URL only if there is no existing resource at this URL.”
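Server-side, honoring If-None-Match: * is a small check. Here is a sketch of the logic; the in-memory store and function name are illustrative, not a real HTTP server:

```javascript
// Sketch of a server honoring "If-None-Match: *" on a PUT-Create request.
const resources = new Map();

function handlePut(url, headers, body) {
  if (headers["if-none-match"] === "*" && resources.has(url)) {
    return { status: 412 }; // resource already exists: refuse to overwrite
  }
  const created = !resources.has(url);
  resources.set(url, body);
  return { status: created ? 201 : 200 };
}

console.log(handlePut("/person/q1w2e3", { "if-none-match": "*" }, "a").status); // → 201
console.log(handlePut("/person/q1w2e3", { "if-none-match": "*" }, "b").status); // → 412
```

The 412 on the second call is what makes the retry safe: a duplicate create attempt is rejected rather than silently overwriting data.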
If I wanted the service to treat the HTTP PUT as a replacement of an existing resource, I would use the following HTTP interaction:
**** REQUEST
PUT /person/q1w2e3 HTTP/2.0
Host: api.example.org
Content-Type: application/x-www-form-urlencoded
If-Match: "p0o9i8u7y6t5r4e3w2q1"

givenName=Mace&familyName=Morris

**** RESPONSE
HTTP/2.0 200 OK
Content-Type: application/vnd.collection+json
ETag: "o9i8u7y6t5r4e3w2q1p0"
...
Here, the HTTP request is marked with the If-Match header that contains the entity tag (or ETag) identifying the exact version of the resource at /person/q1w2e3 you wish to update. If the ETag at that URL doesn’t match the one in the request (e.g. if the resource doesn’t exist or has recently been updated by someone else), then the HTTP PUT will be rejected with an HTTP 412 Precondition Failed response instead.
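The matching server-side check for If-Match can be sketched along these lines. ETag generation here is a simple counter for illustration; real services often hash the representation instead:

```javascript
// Sketch of conditional update: PUT succeeds only when the ETag matches.
const resources = new Map(); // url -> { etag, body }
let version = 0;

function handleConditionalPut(url, headers, body) {
  const current = resources.get(url);
  if (!current || current.etag !== headers["if-match"]) {
    return { status: 412 }; // missing or stale version: precondition failed
  }
  const etag = '"v' + (++version) + '"';
  resources.set(url, { etag, body });
  return { status: 200, etag }; // return the new version tag
}

resources.set("/person/q1w2e3", { etag: '"v0"', body: "a" });
console.log(handleConditionalPut("/person/q1w2e3", { "if-match": '"v0"' }, "b").status); // → 200
console.log(handleConditionalPut("/person/q1w2e3", { "if-match": '"v0"' }, "c").status); // → 412
```

The second call fails because the first update already advanced the ETag, which is how lost-update conflicts between two writers get caught.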
This use of the “PUT-Create” pattern simplifies the challenge of knowing when it is OK to retry an unsuccessful write operation over HTTP. It works because, by design, PUT can be retried without creating unwanted side effects. POST, by design, doesn’t make that promise. There have been a handful of attempts to make POST retry-able. Two examples are Bill de hÓra’s HTTPLR 19 and Mark Nottingham’s POE 20. Neither gained wide adoption.
Using HTTP PUT to create new resources takes a bit more work up front, both for clients and servers, primarily because it depends on proper use of the HTTP headers If-None-Match, If-Match, and ETag. In my experience, it is best to “bake” this recipe into code for both the server and the client applications. I typically have “cover methods” in my local code called createResource(…) and updateResource(…) (or something similar) that know how to craft a proper HTTP PUT request (including the right headers) and how to respond when things don’t go as planned. The convenience of these cover methods lowers the perceived added cost of using “PUT-Create” and ensures it is implemented consistently across client and server.
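As a sketch, such cover methods might look like the following. These only build the request options, so the header logic lives in one place; the names and shapes are my own, not from any library:

```javascript
// Cover methods that bake the PUT-Create header rules into one place.
function createResource(url, body) {
  return {
    url,
    method: "PUT",
    headers: { "If-None-Match": "*" }, // only create, never overwrite
    body: JSON.stringify(body)
  };
}

function updateResource(url, etag, body) {
  return {
    url,
    method: "PUT",
    headers: { "If-Match": etag }, // only update this exact version
    body: JSON.stringify(body)
  };
}

// e.g. pass the result to fetch(req.url, req) or any HTTP client
const req = createResource("/person/q1w2e3", { givenName: "Mace" });
console.log(req.headers["If-None-Match"]); // → *
```

Calling code never touches the conditional headers directly, which is what keeps the pattern consistent across an entire codebase.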
See related recipes on creating and updating resources in Chapter 3 and Chapter 4.
Another recipe wrapped up in this one is the ability for client applications to supply resource identifiers. When using POST, the service is usually expected to create the new resource identifier — typically a monotonic ID like /person/1, /person/2, and so forth. When using PUT, clients are expected to supply the resource identifier (think “please upload the file my-todo-list.doc to my document storage”). See Chapter 3 and Chapter 4 for more on that.
Designing-in this support for idempotent writes with the “PUT-Create” pattern means both your client and server applications will be more reliable and consistent — even if the network connections you are using are not.
TK Chapter 3 ??
TK Chapter 4 ??
1 https://datatracker.ietf.org/doc/html/rfc793
2 https://datatracker.ietf.org/doc/html/rfc791
3 https://datatracker.ietf.org/doc/html/rfc768
4 http://hl7.org/fhir/R4/overview-dev.html
6 http://www.hydra-cg.com/spec/latest/core/
7 https://en.wikipedia.org/wiki/ARPANET
8 http://www.chick.net/wizards/memo.html
9 https://www.rfc-editor.org/rfc/rfc793.html
10 https://datatracker.ietf.org/doc/html/rfc791
11 https://www.rfc-editor.org/rfc/rfc5325.html
12 https://www.rfc-editor.org/rfc/rfc5326.html
13 https://www.rfc-editor.org/rfc/rfc5327.html
14 https://en.wikipedia.org/wiki/Richard_Saul_Wurman
15 https://www.worldcat.org/title/information-architects/oclc/963488172
16 https://www.worldcat.org/title/information-architecture-for-the-world-wide-web/oclc/1024895717
17 https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_1_5
18 https://malotor.medium.com/anticorruption-layer-a-effective-shield-caa4d5ba548c
19 http://xml.coverpages.org/draft-httplr-01.html
20 https://datatracker.ietf.org/doc/html/draft-nottingham-http-poe-00