Chapter 3. The Representor Pattern

“But it is a pipe.”

“No, it’s not,” I said. “It’s a drawing of a pipe. Get it? All representations of a thing are inherently abstract. It’s very clever.”

John Green, The Fault in Our Stars

Almost every team that starts out on the journey of implementing APIs for the Web runs up against the decision of which output format(s) to support. Most often, this decision is a matter of accepting the current norms rather than engaging in a series of experiments and research efforts. Usually, teams don’t have the time or energy to go through decades of material on software implementation and systems engineering in order to decide which output format the API will use. Instead, the current custom or fad is the one that wins the day.

And selecting a format is only part of the challenge. A more important consideration is just how to write services that implement output format support. Sometimes services are implemented in a way that tightly binds the internal object model to the external output format. That means changes in the internal model leak out into the output format and are likely to break client applications which consume the service API.

That leads to another important challenge you’ll face when dealing with messages passed between API consumer and API provider: protecting against breakage. Long ago, writers on software modularity offered clear advice on how to isolate parts of a system that are likely to change often and implement them in ways that made changes to that part of the system relatively cheap, safe, and easy. Keeping this in mind is essential for building healthy and robust API-based solutions.

So, there are a number of things to cover here. Let’s put aside the history of modularity for a bit and first address the challenge most API developers face: “Which message format should we use for our API?”

XML or JSON: Pick a Side!

So you want to implement an API, eh? Well, one of the first decisions you will face is which message format to use. Today, almost everyone decides on JSON—often with little to no discussion. That’s the power of current popularity—that the decision is made without much contemplation. But it turns out selecting JSON may not be the best choice or the only choice when it comes to your API output format.

All through the late 1990s and early 2000s, the common custom was to rely on message formats based on the XML standard. At that time, XML had a strong history—HTML and XML both had the same progenitor in SGML (ISO 8879:1986)—and there were lots of tools and libraries geared to parsing and manipulating XML documents. Both the SOAP specification and much of what would later be known as the SOA (service-oriented architecture) style started as XML-based efforts for business computing.

XMLHttpRequest

One of the most important API-centric additions to the common web browser—the ability to make direct calls to services within a single web page—was called the XMLHttpRequest object because it was assumed that these browser-initiated inline requests would be returning XML documents. And they did in the beginning. But by the mid-2000s, the JavaScript Object Notation (JSON) format would overtake XML as the common way to transfer data between services and web browsers. The format has changed, but the JavaScript object name never has.

But, as we all know, selecting XML for passing data between services and clients did not end the format debate. Even while the XML-based SOAP document model was being published as a W3C Note in May 2000, there was another effort underway to standardize data-passing documents—the JavaScript Object Notation format, or JSON.

Douglas Crockford is credited with specifying JSON in early 2001. Even though the JSON RFC document (RFC4627) was not published until 2006, the format had experienced wide use by that time and was gaining in popularity. As of this writing, JSON is considered the default format for any new API. Recent informal polls and surveys indicate few APIs today are being published using the XML output format and—at least for now—there is no new format likely to undermine JSON's current popularity.

“I did not invent JSON”

In 2011, Douglas Crockford gave a talk he dubbed “The True Story of JSON” in which he said “I do not claim to have invented JSON. …What I did was I found it, I named it, I described how it was useful. …So, the idea’s been around there for a while. What I did was I gave it a specification, and a little website.” He even states that he saw an early example of JSON-like data-passing as early as 1996 from the team that was working on the Netscape web browser.

Of course, XML and JSON are not the only formats to consider. For example, another valuable format for passing data between parties is the comma-separated values (CSV) format. It was first standardized by the IETF in 2005 (RFC4180) but dates back to the late 1960s as a common interchange format for computers. There are likely going to be cases where an API will need to output CSV, too. For example, almost all spreadsheet software can easily consume CSV documents and place them in columns and rows with a high degree of fidelity.

And there have also been several binary formats created over the years, such as the XML-based Fast Infoset from 2007, Google's Protobuf released publicly in 2008, Apache Avro (first released in 2009), and Thrift from Facebook, which also defines extensive RPC protocol details.

Clearly the problem is not just deciding between XML and JSON.

The New Crop of Hypermedia Formats

Starting in the early 2010s, a new crop of text-based formats emerged that offered more than just structure for describing data; they included instructions on how to manipulate the data as well. These are formats I refer to as hypermedia formats. These formats represent another trend in APIs and, as we will see later in the book, can be a valuable tool in creating API-based services that support a wide range of service changes without breaking existing client applications. In some cases, they even allow client applications to “auto-magically” acquire new features and behaviors without the need for rewriting and redeploying client-side code.

Atom Syndication and Publishing

Although most of the new hypermedia formats appeared on the scene around 2010 and later, one format (the Atom Syndication Format) was standardized in 2005 as RFC4287. It shares roots with the SOAP initiative and is an entirely XML-based specification. The Atom Format, along with the Atom Publishing Protocol (RFC5023) in 2007, describes a system of publishing and editing web resources that is based on the common Create-Read-Update-Delete (CRUD) model of simple object manipulation.

Atom documents are mostly used in read-only mode for news feeds and other simple record-style output. However, several blog engines support editing and publishing entries using Atom documents. There are also a number of registered format extensions to handle things like paging and archiving (RFC5005), threads (RFC4685), and licensing content (RFC4946). I don’t often see Atom used to support read/write APIs on the WWW but still see it used in enterprise cases for handling outputs from queues and other transaction-style APIs.

Atom is interesting because it is an XML-based format that was designed specifically to add read/write semantics to the format. In other words, like HTML, it describes rules for adding, editing, and deleting server data.
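For instance, a bare-bones Atom entry (adapted from the examples in the Atom RFCs; the URLs here are placeholders) shows this read/write flavor—the edit link identifies the URI a client can use to update or delete the entry:

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  <title>Atom-Powered Robots Run Amok</title>
  <updated>2003-12-13T18:30:02Z</updated>
  <link rel="edit" href="http://example.org/entries/1"/>
  <content>Some text.</content>
</entry>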

And, since the release of the Atom specifications, a handful of other formats have been published.

Other hypermedia formats

There was a rush of text-based hypermedia-style formats published and registered with the IANA starting around 2011. They all share a similar set of assumptions even though each has unique strengths and focuses on different challenges for API formats. I’ll cover some of these at length in the book and wanted to mention them here to provide a solid background for dealing with the challenge of selecting and supporting formats for APIs.

Hypermedia Application Language (HAL)

The HAL format was registered with the Internet Assigned Numbers Authority (IANA) in 2011 by Mike Kelly. Described as “a simple format that gives a consistent and easy way to hyperlink between resources,” HAL’s design focus is on standardizing the way links are described and shared within messages. HAL does not describe write semantics but does leverage the URI Templates specification (RFC6570) to describe query details inline. We’ll spend an entire chapter exploring (and using) this very popular hypermedia type.

Collection+JSON format (Cj)

I published the Collection+JSON hypermedia format the same year as Mike Kelly’s HAL (we had been sharing ideas back and forth for quite a while before that). Unlike HAL, Cj supports detailed descriptions of the common Create-Read-Update-Delete (CRUD) semantics inline along with a way to describe input metadata and errors. It is essentially a JSON-formatted fork of the Atom Publishing Protocol that is focused on common list-management use cases. We’ll spend time coding for Cj formats later in the book.
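To give a quick sense of the format (this is a hand-built fragment, not output from this book's services), a Cj document wraps everything in a collection object that can carry items, queries, a write template, and error details:

{
  "collection" : {
    "version" : "1.0",
    "href" : "http://api.example.org/user/",
    "items" : [
      {
        "href" : "http://api.example.org/user/mamund",
        "data" : [
          { "name" : "userName", "value" : "mamund", "prompt" : "User" }
        ]
      }
    ],
    "template" : {
      "data" : [
        { "name" : "userName", "value" : "", "prompt" : "User" }
      ]
    }
  }
}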

The Structured Interface for Representing Entities (Siren)

The Siren format was created by Kevin Swiber and registered at the IANA in 2012. Siren “is a hypermedia format for representing entities with their associated properties, children, and actions.” It has a very rich semantic model that supports a wide range of HTTP verbs and is currently used as the default format for the Zetta Internet of Things platform. We’ll get a chance to dig into Siren later in the book.
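As a quick, hand-built illustration (not taken from the Zetta platform), a Siren entity bundles properties, actions, and links together:

{
  "class" : [ "user" ],
  "properties" : { "userName" : "mamund" },
  "actions" : [
    {
      "name" : "change-password",
      "method" : "POST",
      "href" : "http://api.example.org/user/changepw/mamund",
      "fields" : [
        { "name" : "old-password", "type" : "text" },
        { "name" : "new-password", "type" : "text" }
      ]
    }
  ],
  "links" : [
    { "rel" : [ "self" ], "href" : "http://api.example.org/user/mamund" }
  ]
}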

The Universal Basis for Exchanging Representations (UBER)

I released a working draft of the UBER format in 2014. Unlike the other hypermedia formats listed here, UBER does not have a strong message structure, but instead has just one element (called “data”) used for representing all types of content in a document. It also has both a JSON and XML variant. UBER has not yet been registered with the IANA and will not be covered in this book.

Other formats

There are a number of other interesting hypermedia-style formats that have recently appeared that won’t be covered in this book. They include Jorn Wildt’s Mason, the JSON API spec from Yehuda Katz, Cross-Platform Hypertext Language by Mike Stowe, and several others.

Currently none of these new formats are a clear leader in the market and that, I think, is a good thing. In my experience, it is not common that an important, universally valuable message format appears “out of the blue” from a single author. It is more likely that many formats from several design teams will be created, published, and tested in real-world scenarios before any possible “winner” will emerge. And the eventual solution will likely take several years to evolve and take several twists and turns along the way.

So, even though many people have said to me “I wish someone would pick just one format so I would know what to use,” I don’t think that will happen any time soon. It may seem like a good thing that you don’t have a choice to make, but that’s rarely true in the long run.

We need to get used to the idea that there is no “one API format to rule them all.”

The Fallacy of The Right One

So, despite all the new hypermedia formats out there (and the continued use of XML in enterprises), with the current trend pointing toward JSON as the common output format, it would seem an easy decision, right? Any time you start implementing an API, just use JSON and you’re done. Unfortunately, that’s almost never the way it goes.

First, some industry verticals still rely on XML and SOAP-based formats. If you want to interact with them, you’ll need to support SOAP or some other custom XML-based formats. Examples might be partner APIs that you work with on a regular basis, government or other standards-led efforts that continue to focus on XML as their preferred format, and even third-party APIs that you use to solve important business goals.

Second, many companies invested heavily in XML-based APIs over the last decade and are unwilling to rewrite these APIs just to change the output format. Unless there is a clear advantage to changing the message format (e.g., increased speed, new functionality, or some other business metric), these XML-based services are not likely to change any time soon.

Finally, some data storage systems are XML-native or default to outputting data as XML documents (e.g., dbXML, MarkLogic, etc.). While some of these services may offer an option to output the data in a JSON-based format, many continue to focus on XML and the only clear way of converting this data to JSON is to move it to JSON-native data storage systems like MongoDB, CouchDB, and others.

So, deciding on a single format for your team’s service may not be feasible. And, as your point of view widens from a single team to multiple teams within your company, multiple products within an enterprise, on up to the entire WWW itself, getting everyone to agree to both produce and consume a single output format is not a reasonable goal.

As frustrating as this may be for team leaders and enterprise-level software architects, there is no single “Right One” when it comes to message formats. It may be possible to control a single team’s decision (either through consensus or fiat) but one’s ability to exert this control wanes as the scope of the community grows.

And that means the way forward is to rethink the problem, not work harder at implementing the same solution.

Reframing the Problem

One way to face a challenge that seems insurmountable is to apply the technique of reframing—to put the problem in a different light or from a new point of view. Instead of working harder to come up with a solution to the perceived problem, reframing encourages us to step outside the frame and change our perspective. Sometimes this allows us to recognize the scenario as a completely different problem—one that may have an easier or simpler solution.

Cognitive Reframing

The current use of the term reframing came from the cognitive therapy work of Aaron T. Beck in the 1960s. As he was counseling patients experiencing depression, he hit upon the idea that patients could be taught to become aware of negative thoughts as they arose and to “examine and evaluate them,” even turning them into positive thoughts. Initially called cognitive reframing, the term is now used to describe any technique that helps us reflect on our thoughts and situation and take a new perspective.

In our case (the challenge of selecting a single format for your APIs), it can help to ask “Why do we need to decide on a single format?” or, to put it another way, “Why not support many formats for a single API?” Asking these questions gives us a chance to lay out some of the reasons for and against supporting multiple formats. In this way, we’ve side-stepped the challenge of picking one format. Now we’re focused on a new aspect of the same problem. Why not support multiple formats?

Why is supporting one format “better”?

The common pattern is to assume that selecting a single format is the preferred solution. To that end, there are some typical justifications for this point of view:

One format is easier

Usually people make the case for supporting a single format for API output because it is thought to be easier than supporting multiple formats. It may not be ideal to select just one format, but it is preferable to the cost of supporting more than one. And this is a valid consideration. Often we work with programming tools and libraries that make supporting multiple output formats costly in some way (additional programming time, testing difficulty, runtime support issues, etc.).

Multiple formats are anarchy

There are other times when making the case for supporting more than one format is perceived as making the case for supporting any format. In other words, once you open the door for one additional format, you MUST support any format that might be thought of at some point in the future.

The format you prefer is “bad”

Sometimes, even in cases where multiple formats might be possible, some start to offer value judgments for one or more of the suggested formats by saying they are (for any number of reasons) “bad” or in some other way insufficient and should not be included. This can turn into a “war of attrition” that can prompt leadership to just pick one and be done with the squabbling.

We can’t know what people will like in the future anyway

Another reason to argue for just one format is that selecting any group of formats is bound to result in not selecting one or more formats that, at some future point, will become very popular. If you can’t accurately predict which ones to pick for the future, it’s a waste of time to pick any of them.

You Aren’t Gonna Need It (YAGNI)

Finally, most API projects start with a single client in mind—often a client built by the same group of people building the service API. In these cases, it seems to make sense to invoke this famous maxim from the Extreme Programming (XP) community. Another version of this idea is to “Do the simplest thing that could possibly work.” And, in cases where your API is unlikely to gain much reach or will have a limited lifetime, this kind of shortcut can make sense. However, if you are creating an API that will last more than a few years (most of them do) or one that will be used in a multichannel (desktop, browser, smartphone, etc.) environment, creating a service that is tightly coupled to a single message format can be costly in the long run. And, when done correctly, you can still abide by the YAGNI rule while keeping your options open for supporting multiple formats in the future.

The list goes on with any number of variations. But the theme is usually the same: you won’t be sure to pick the right formats, so just pick a single one and avoid the costly mistakes of adding other formats no one will like or use in the future. The underlying assumptions in these arguments are also generally the same. They look something like this:

  • Supporting multiple formats is hard.

  • There is no way to safely add formats over time without disrupting production code.

  • Selecting the right formats today is required (and is impossible).

  • Supporting multiple formats is too costly when you don’t have guaranteed uptake.

And it turns out, these assumptions are not always true.

What would it take to support multiple formats?

One way of helping a team reframe the message format challenge is to cast it as a “What would it take…?” question. Essentially, you ask the team to describe a scenario under which the suggested alternative (in our case, supporting multiple output formats) would be a reasonable idea. And it turns out that the assumptions listed before are a great starting point for setting out a scenario under which supporting multiple formats for your API is reasonable.

For example, you might make the following statements:

  • Supporting multiple formats for the same API needs to be relatively easy to do.

  • We need to be able to safely add new format support over time without disrupting production code (or existing clients).

  • We need some kind of consistent criteria for selecting which formats to add both now and in the future.

  • Adding a format needs to be cheap enough so that even if it turns out to be little-used, it is not a big deal.

Even though most of the statements here are qualitative criteria (“relatively easy,” “cheap enough,” etc.) we can use the same patterns of judgment and evaluation on the format challenge that we do when resolving other implementation-related challenges such as “What is an API resource?”, “Which HTTP method should we use?”, and others we face every day.

Luckily, there is a set of well-tested and documented programming patterns that we can use as a test case for implementing multiple format support for our APIs. And they date back to some of the earliest work on software patterns in the 1980s.

The Representor Pattern

To explore what it would take to support multiple output formats for APIs, we need to work on a couple of things. For a start, we should try to make it (1) relatively easy initially, and (2) safe to add new formats after production release. I’ve found that the first task is to clearly separate the work of format support from the actual functionality of the API. Making sure that you can continue to design and implement the basic API functionality (e.g., managing users, editing content, processing purchases, etc.) without tightly binding the code to a single format will go a long way toward making multiple format support safe and easy—even after the initial release of your API into production.

The other challenge for this kind of work is to cast the process of converting internal domain data (e.g., data graphs and action details) into a message format as a consistent algorithm that works well for a wide range of formats. This will require some software pattern implementation as well as an ability to deal with less than 100% fidelity between the domain model and the output model. We deal with this every day with HTML (HTML doesn’t know anything about objects or floating-point types) and we need to adopt a similar approach when dealing with common API message formats, too.

Finally, we’ll need a mechanism for selecting the proper format for each incoming request. This, too, should be an algorithm we can implement consistently. Preferably this will rely on existing information in HTTP requests and will not introduce some new custom metadata that clients will need to support.

OK—separate format processing, implement a consistent way to convert domain data into an output format, and identify request metadata to help us select the proper format. Let’s start with separating the format processing from the domain.

Separating Format from Functionality

All too often, I see service implementations that are bound too tightly to a single format. This is a common problem for SOAP implementations—usually because the developer tooling leads programmers into relying on a tight binding between the internal object model and the external output format. It is important to treat all formats (including SOAP XML output) as independent of the internal object model. This allows some changes in the internal model to happen without requiring changes to the external output format.

To manage this separation, we’ll need to employ some modularity to keep the work of converting the domain model into a message external from the work of manipulating the domain model itself. This is using modularity to split up the assignment of work. Typically modularity is used to collect related functionality in a single place (e.g., all the functionality related to users or customers or shoppingCarts). The notion of using modularity as primarily a work assignment tactic comes from David Parnas’s 1972 paper “On the Criteria to be Used in Decomposing Systems into Modules.” As Parnas states:

[M]odule is considered to be a responsibility assignment rather than a subprogram. The modularizations include the design decisions which must be made before the work on independent modules can begin. [Emphasis in the original]

Viewing the work of converting internal domain data into external output formats as a responsibility assignment leads us to isolate the conversion process into its own module. Now we can manage that module separately from the one(s) that manipulate domain data. A simple example of this clear separation might look like this:

var convert = new ConversionModule(HALConverter);
var output = convert.toMessage(domainData);

In that imaginary pseudo-code, the conversion process is accessed via an instance of ConversionModule that accepts a message-specific converter (in this case, HALConverter) and uses the toMessage function that accepts a domainData instance to produce the desired output. This is all quite vague right now, but at least we have a clear target for implementing safe, cheap, easy support for multiple output formats.
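To make the shape of that module a bit more concrete, here is one minimal sketch of what ConversionModule and a converter plug-in might look like (every name here is an illustrative assumption, not a fixed API):

// a minimal sketch: the module holds a format-specific converter
// (a plain function) and delegates all conversion work to it
function ConversionModule(converter) {
  this.converter = converter;
}

ConversionModule.prototype.toMessage = function(domainData) {
  return this.converter(domainData);
};

// a trivial stand-in converter, for illustration only
function HALConverter(domainData) {
  return JSON.stringify({ _links : {}, data : domainData });
}

Swapping HALConverter for, say, a SirenConverter changes the output format without touching any code that manipulates the domain model.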

Once the functionality of the internal domain model is cleanly separated from the external format, we need some guidance on how to consistently convert the domain model into the desired output format. But before that, we’ll need a pattern for selecting which format is appropriate.

The Selection Algorithm

An important implementation detail when supporting multiple output formats for an API is that of the output selection process. There needs to be some consistent algorithmic way to select the correct output format at runtime. The good news is that HTTP—still the most common application-level protocol for web APIs—has this algorithm already defined: content negotiation.

Section 3.4 of the HTTP specification (RFC7231) describes two patterns of content negotiation for “representing information”:

Proactive

The server selects the representation based on the client’s preferences.

Reactive

The server provides a list of possible representations to the client and the client selects the preferred format.

The most common pattern in use on the Web today is the proactive one and that’s what we’ll implement in our representor. Specifically, clients will send an Accept HTTP header that contains a list of one or more format preferences, and the server will use that list to determine which format will be used in the response (including the selected format identifier in the server’s Content-Type HTTP header).

A typical client request might be:

GET /users HTTP/1.1
Accept: application/vnd.hal+json, application/vnd.uber+json
...

And, for a service that supports HAL but does not support UBER, the response would be:

HTTP/1.1 200 OK
Content-Type: application/vnd.hal+json
...

It’s All About Quality

The content negotiation examples shown in this book are greatly simplified. Client apps may include several media types in their accept list—even the "*/*" entry (which means “I accept everything!”). Also, the HTTP specification for the Accept header includes what is known as the q parameter, which can qualify each entry in the accept list. Valid values for this parameter include a range of numbers from 0.001 (least preferred entry) to 1 (most preferred entry).

For example, this client request shows that, of the two acceptable formats, the HAL format is the most preferred by this client app:

GET /users/ HTTP/1.1
Accept: application/vnd.uber+json;q=0.3, application/vnd.hal+json;q=1
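As a rough sketch of what q-aware selection might look like (this helper is illustrative and is not the representor code used later in this chapter), a server could rank the entries by their q values before matching them against the formats it supports:

// illustrative only: rank Accept entries by q value, then pick the
// first one this service actually supports
function selectType(acceptHeader, supported) {
  var candidates = (acceptHeader || '*/*')
    .split(',')
    .map(function(entry) {
      var parts = entry.trim().split(';');
      var q = 1.0; // per RFC7231, a missing q defaults to 1
      parts.slice(1).forEach(function(param) {
        var pair = param.trim().split('=');
        if (pair[0] === 'q') {
          q = parseFloat(pair[1]);
        }
      });
      return { type : parts[0].trim(), q : q };
    })
    .sort(function(a, b) { return b.q - a.q; });

  for (var i = 0; i < candidates.length; i++) {
    if (candidates[i].type === '*/*') {
      return supported[0];
    }
    if (supported.indexOf(candidates[i].type) !== -1) {
      return candidates[i].type;
    }
  }
  return supported[0]; // no match; fall back to the service default
}

Given the request shown here and a service that supports only HAL, selectType(accept, ['application/vnd.hal+json']) returns the HAL identifier even though UBER appears first in the list.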

So, that’s what it looks like on the “outside”—the actual HTTP conversation. But what pattern is used internally to make this work on the server side? Thankfully, a solution for this kind of selection process was worked out in the 1990s.

Adapting and Translating

Many of the challenges of writing solid internal code can be summed up in a common pattern. And one of the most important books on code patterns is the 1994 book Design Patterns by Gamma, Helm, Johnson, and Vlissides (Addison-Wesley Professional). Those are rather tough names to remember and, over time, this group of authors has come to be known as the Gang of Four (GoF). You’ll sometimes even hear people refer to the Gang of Four book when discussing this important text.

Patterns in Architecture

The notion that architecture can be expressed as a common set of patterns was first written about by Christopher Alexander. His 1979 book The Timeless Way of Building (Oxford University Press) is an easy and thought-provoking read on how patterns play a role in physical architecture. It was his work on patterns that inspired the authors of the Design Patterns book and so many other software patterns books.

There are 23 patterns in the GoF book, categorized into three types:

  • Creational patterns

  • Structural patterns

  • Behavioral patterns

The pattern that will help us in our quest to implement safe, cheap, and easy support for multiple output formats for our API is the Adapter structural pattern.

The Adapter pattern

As established on OODesign.com, the intent of the Adapter pattern is to:

Convert the interface of a class into another interface clients expect. Adapter lets classes work together that couldn’t otherwise because of incompatible interfaces.

And that’s essentially what we need to do—convert an internal class (or model) into an external class (or message).

There are four participants to the Adapter pattern (Figure 3-1):

Target

Defines the domain-specific interface that client uses. This will be the message model or media type we want to use for output.

Client

Collaborates with objects conforming to the target interface. In our case, this is our API service—the app that is using the Adapter pattern.

Adaptee

Defines an existing interface that needs adapting. This is the internal model that needs to be converted into the target message model.

Adapter

Adapts the interface of adaptee to the target interface. This will be the specific media-type plug-in that we write to handle the work of converting the internal model to the message model.

So, we need to write an adapter plug-in for each target media type (HTML, HAL, Siren, Collection+JSON, etc.). That’s not too tough. But the challenge is that each internal object model (the adaptee) is going to be different. For example, you’d write a plug-in to handle Users, then another to handle Tasks, and on and on. That can mean lots of code is needed to write the adapters and—even more disappointing—these adapters may not be very reusable.
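To see why, consider a purely hypothetical sketch in which every pairing of internal model and output format gets its own adapter:

// one adapter per internal model *per* output format...
function userToHAL(user) { /* convert a User into a HAL document */ }
function userToSiren(user) { /* convert a User into a Siren document */ }
function taskToHAL(task) { /* convert a Task into a HAL document */ }
function taskToSiren(task) { /* convert a Task into a Siren document */ }
// ...so n models and m formats means writing n * m adapters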

To try to reduce the need for tightly coupled adapter code, I’m going to introduce another pattern—one based on the Adapter pattern: the Message Translator.

Figure 3-1. The Adapter pattern

That means we need to spend a few minutes on what the Message Translator pattern looks like and how it can be used to standardize the process of converting internal object models into external message models.

The Message Translator pattern

To cut down on lots of custom adapters, I’m going to introduce another pattern—derived from the Adapter pattern—called the Message Translator. This comes from Gregor Hohpe and his book Enterprise Integration Patterns (Addison-Wesley Professional).

Hohpe describes the Message Translator as:

Use a special filter, a Message Translator, between other filters or applications to translate one data format into another.

A message translator is a special form of the adapter class in the GoF set of patterns.

To make this all work, I’ll introduce a general message format—the Web Service Transition Language (WeSTL)—in the next section of this chapter. That will act as a standardized adaptee and make it possible to generalize the way adapter plug-ins can be coded. Now, the process of translating can be turned into an algorithm that doesn’t need to rely on any domain-specific information. As illustrated in Figure 3-2, we can write WeSTL-to-HAL or WeSTL-to-Cj or WeSTL-to-Siren translators and then the only work is to convert the internal model into a WeSTL message. This moves the complexity to a new location, but does so in a way that reduces the amount of custom code needed to support multiple formats.
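In sketch form (all names here are illustrative stubs, not this book's actual modules), the standardized adaptee reduces that n * m adapter explosion to n + m modules:

// one converter per internal model (n of these)...
function userToWSTL(user) {
  return { wstl : { transitions : [], data : [ user ] } };
}

// ...and one general translator per media type (m of these)
function wstlToHAL(doc) {
  return { _links : {}, data : doc.wstl.data };
}

var halDoc = wstlToHAL(userToWSTL({ userName : "mamund" }));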

So, armed with this background, we can now look at a set of concrete implementation details to make it all happen.

Figure 3-2. The Message Translator pattern

A Server-Side Model

In this section, I’ll walk through the high-level details of a working representor implementation: the one that is used in all the services created for this book. Implementing a representor means dealing with the following challenges:

  • Inspecting the HTTP request to identify the acceptable output formats for the current request.

  • Using that data to determine which output format will be used.

  • Converting the domain data into the target output format.

Handling the HTTP Accept Header

The first two items on that list are rather trivial to implement in any WWW-aware codebase. For example, identifying acceptable output formats for a request means reading the Accept HTTP header. Here is a snippet of NodeJS code that does that:

// rudimentary accept-header handling
var contentType = '';
var htmlType = 'text/html';
var contentAccept = req.headers['accept'];
if(!contentAccept || contentAccept==='*/*') {
  contentType = htmlType;
}
else {
  contentType = contentAccept.split(',')[0];
}

Note that the preceding code example makes two key assumptions:

  1. If no Accept header is passed, or the Accept header is set to “anything” (*/*), the content type will be set to text/html.

  2. If the Accept header lists more than one acceptable format, this service will just grab the first one listed.

This implementation is very limited. It does not support the use of q values to help the server better understand client preferences, and this service defaults to the text/html type for API responses. Both of these assumptions can be altered and/or improved through additional coding, but I’ve skipped over that for this book.

Implementing the Message Translator Pattern

Now that we have the requested format value—the output context for this request—we can move on to the next step: implementing the Message Translator pattern in NodeJS. For this book, I’ve created a simple module that uses a switch … case element that matches the request context string (the accepted format) with the appropriate translator implementation.

The code looks like this:

// load representors 1
var html = require('./representors/html.js');
var haljson = require('./representors/haljson.js');
var collectionJson = require('./representors/cj.js');
var siren = require('./representors/siren.js');

function processData(domainData, mimeType) {
  var doc;

  // clueless? assume HTML 2
  if (!mimeType) {
    mimeType = "text/html";
  }

  // dispatch to requested representor 3
  switch (mimeType.toLowerCase()) {
    case "application/vnd.hal+json":
      doc = haljson(domainData);
      break;
    case "application/vnd.collection+json":
      doc = collectionJson(domainData);
      break;
    case "application/vnd.siren+json":
      doc = siren(domainData);
      break;
    case "text/html":
    default: // 4
      doc = html(domainData);
      break;
  }
  return doc;
}

In the preceding code snippet, you can see that a set of representors are loaded at the top (see 1). The code in these modules will be covered in “Runtime WeSTL”. Next (2), if the mimeType value is not passed, it is automatically set to text/html. This is a bit of defensive coding. And then (at 3) the switch … case block checks the incoming mimeType string against known (and supported) mime type strings in order to select the appropriate format processing module. Finally, in case an unknown/unsupported format is passed in, the default statement (4) makes sure that the service runs the html() module to produce valid output.

We now have the basics of the representor outlined. The next step is to actually implement each format-specific translator (HTML, HAL, etc.). To solve this challenge, we need to take a side road on our journey that establishes a general format understood by all translators—the WeSTL format.

General Representor Modules

In the Message Translator pattern, each format module (html(), haljson(), etc.) is an instance of a translator. While implementing these modules as domain-specific converters (e.g., userObjectToHTML, userObjectToHAL, etc.) would meet the needs of our implementation, that approach will not scale over time. Instead, what we need is a general-purpose translator that has no domain-specific knowledge. For example, the translator module used to handle user domain data will be the same translator module used to handle customer, accounting, or any other domain data.

To do that, we’ll need to create a common interface for passing domain data into format modules that is independent of any single domain model.

The WeSTL Format

For this book, I’ve worked up a common interface in the form of a standardized object model, one that service developers can quickly load with domain data and pass to format modules. I also took the opportunity to reframe the challenge of defining interfaces for web APIs. Instead of focusing on defining resources, I chose to focus on defining state transitions. For this reason, I’ve named this interface design the Web Service Transition Language, or WeSTL (pronounced wehs’-tul).

Basically, WeSTL allows API service developers to use a standardized message model for converting internal object models into external message formats. This reduces the cost (in time and effort) to craft new Translators and pushes the complexity of converting internal models into messages to a single instance—converting from the internal model to WeSTL.

Figure 3-3 provides a visual representation of the request/response life cycle of services that use the WeSTL format.

Figure 3-3. The WeSTL transformation cycle

Curious About WeSTL?

I won’t be able to go into too much depth on the design of the WeSTL format in this chapter. I want to focus instead on how we can use WeSTL to drive our general representation module implementation. If you’re curious about the thinking behind the WeSTL, check out the WeSTL Specifications page and the associated online GitHub repo.

When designing and implementing web APIs with WeSTL, the service developer collects all the possible state transitions and describes them in the WeSTL model. By state transitions, I mean all the links and forms that could appear within any service response. For example, every response might have a link to the home page. Some responses will have HTML-style input forms allowing API clients to create new service data or edit existing data. There may even be service responses that list all the possible links and forms (state transitions) for the service!

Why State Transitions?

Focusing on state transitions may seem a bit unusual. First, the transition is the thing between states; it leads from one state to another. For example, State A might be the home page and State B might be the list of users. WeSTL documents don’t describe State A or B. Instead, they describe the action that makes it possible to move from State A to State B. But this is also not quite correct. WeSTL documents do not indicate the starting state (A) or the ending state (B)—just one possible way to move from some state to another. This focus on the actions that enable changes of state makes WeSTL handy for creating message translators.

A simple example of how WeSTL can be used to describe transitions is shown here:

{
  "wstl" : {
    "transitions" : [
      {
        "name" : "home", 1
        "type" : "safe",
        "action" : "read",
        "prompt" : "Home Page"
      },
      {
        "name" : "user-list", 2
        "type" : "safe",
        "target" : "user list",
        "prompt" : "List of Users"
      },
      {
        "name" : "change-password", 3
        "type" : "unsafe",
        "action" : "append",
        "inputs" : [ 4
          {
            "name" : "userName",
            "prompt" : "User",
            "readOnly" : true
          },
          {
            "name" : "old-password",
            "prompt" : "Current Password",
            "required" : true
          },
          {
            "name" : "new-password",
            "prompt" : "New Password (5-10 chars)",
            "required" : true,
            "pattern" : "^[a-zA-Z0-9]{5,10}$"
          }
        ]
      }
    ]
  }
}

As you can see from this WeSTL document, it contains three transition descriptions named home (1), user-list (2), and change-password (3). The first two transitions are marked safe. That means they don’t write any data, only execute reads (e.g., HTTP GET). The third one, however (change-password), is marked unsafe since it writes data to the service (à la HTTP POST). You can also see several input elements described for the change-password transition (4). These details will be used when creating an API resource for the User Manager service.

There are a number of details left out in this simple example, but you can see how WeSTL works; it describes the transitions that can be used within the service. What’s important to note is that this document does not define web resources or constrain where (or even when) these transitions will appear. That work is handled by service developers elsewhere in the code.

So, this is what a WeSTL model looks like at design time, before the service is up and running. Typically, a service designer uses WeSTL in this mode. There is also another mode for WeSTL documents: runtime. That mode is typically used when implementing the service.

Runtime WeSTL

At runtime, an instance of the WeSTL model is created that contains only the valid transitions for a particular resource. This runtime instance also includes any data associated with that web resource. In other words, at runtime, WeSTL models reflect the current state of a resource—both the available data and the possible transitions.

Creating a runtime WeSTL model in code might look like this:

var transitions = require('./wstl-designtime.js');
var domain = require('./domain.js');

function userResource(root) {
  var doc, coll, data, tran;

  data = [];
  coll = [];

  // pull data for this resource 1
  data = domain.getData('user', root.getID());

  // add transitions for this resource 2
  tran = transitions("home");
  tran.href = root +"/home/";
  tran.rel = ["http:"+root+"/rels/home"];
  coll.splice(coll.length, 0, tran);

  tran = transitions("user-list");
  tran.href = root +"/user/";
  tran.rel = ["http:"+root+"/rels/collection"];
  coll.splice(coll.length, 0, tran);

  tran = transitions("change-password");
  tran.href = root +"/user/changepw/{id}";
  tran.rel = ["http:"+root+"/rels/changepw"];
  coll.splice(coll.length, 0, tran);

  // compose wstl model 3
  doc = {};
  doc.wstl = {};
  doc.wstl.title = "User Management";
  doc.wstl.transitions = coll;
  doc.wstl.data = data;

  return doc;
}

As the preceding code sample shows, the userResource() function first pulls any associated data for the current resource—in this case, a single user record based on the ID value in the URL (as seen in 1) then pulls three transitions from the design-time WeSTL model (2) and finally composes a runtime WeSTL model by combining the data, transitions, and a helpful title string (3).


It should be pointed out that the only constraint on the wstl.data element is that it must be an array. It can be an array of JSON properties (e.g., name–value pairs), an array of JSON objects, or even an array of one JSON object that is, itself, a highly nested graph. The WeSTL document may even include a property that points to a schema document (JSON Schema, RelaxNG, etc.) describing the data element. The related schema information can be used by the format module to help locate and process the contents of the data element.
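For example, a runtime WeSTL document might carry a single user record as its data array along with a pointer to a schema describing it. Note that dataSchema is a hypothetical property name used here purely for illustration; see the WeSTL specifications for the actual details:

{
  "wstl" : {
    "title" : "User Management",
    "data" : [
      { "userName" : "mamund", "webUrl" : "http://amundsen.com/blog/" }
    ],
    "dataSchema" : "http://api.example.org/schemas/user-schema.json"
  }
}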

So, WeSTL documents allow service developers to define web resources in a general way. First, service designers can create design-time WeSTL documents that describe all the possible transitions for the service. Second, service developers can use the design-time document as source material for constructing runtime WeSTL documents that include selected transitions plus associated runtime data.

Now we can finally write our general format modules.

A Sample Representor

Now that resources are represented through a generic interface (the WeSTL model), we can build a general format module that converts that standardized model into an output format. Basically, the code accepts a runtime WeSTL document and then translates domain data, element by element, into the target message format for an API response.

To see how this might look, here is a high-level look at a simplified implementation of a general HAL representor.

Warning

The example representor shown here has been kept to the bare minimum to help illustrate the process. A fully functional HAL representor will be covered in Chapter 4, HAL Clients.

function haljson(wstl, root, rels) { // 1
  var hal;

  hal = {};
  hal._links = {};

  for(var segment in wstl) {
    hal._links = getLinks(wstl[segment], root, segment, rels);
    if(wstl[segment].data && wstl[segment].data.length===1) {
      hal = getProperties(hal, wstl[segment]);
    }
  }
  return JSON.stringify(hal, null, 2); // 4
}

// emit _links object 2
function getLinks(wstl, root, segment, relRoot) {
  var coll, items, links, item, i, x;

  links = {};

  // list-level actions
  if(wstl.transitions) {
    coll = wstl.transitions;
    for(i=0,x=coll.length;i<x;i++) {
      links = getLink(links, coll[i], relRoot);
    }

    // list-level objects
    if(wstl.data) {
      coll = wstl.data;
      items = [];
      for(i=0,x=coll.length;i<x;i++) {
        item = {};
        item.href = coll[i].meta.href;
        item.title = coll[i].title;
        items.push(item);
      }
      links[checkRel(segment, relRoot)] = items;
    }
  }
  return links;
}

// emit root properties 3
function getProperties(hal, wstl) {
  var props;

  if(wstl.data && wstl.data[0]) {
    props = wstl.data[0];
    for(var p in props) {
      if(p!=='meta') {
        hal[p] = props[p];
      }
    }
  }
  return hal;
}

/* additional support functions (e.g., getLink, checkRel) appear here */

While this code example is just a high-level view, you should be able to figure out the important details. The top-level function (haljson()) accepts a WeSTL model as its first argument, along with some runtime request-level data (1). That function “walks” the WeSTL runtime instance and (a) processes any links (transitions) in the model (2) and then (b) deals with any name–value pairs in the WeSTL instance (3). Once all the processing is done, the resulting JSON object (now a valid HAL document) is returned to the caller (4). An example of what this code might produce follows:

{
  "_links" : { 1
    "self" : {
      "href": "http://localhost:8282/user/mamund"
    },
    "http://localhost:8282/rels/home": {
      "href": "http://localhost:8282/",
      "title": "Home",
      "templated": false
    },
    "http://localhost:8282/rels/collection": {
      "href": "http://localhost:8282/user/",
      "title": "All Users",
      "templated": false
    },
    "http://localhost:8282/rels/changepw": {
      "href": "http://localhost:8282/user/changepw/mamund",
      "title": "Change Password"
    }
  },
  "userName": "mamund", 2
  "familyName": "Amundsen",
  "givenName": "Mike",
  "password": "p@ss",
  "webUrl": "http://amundsen.com/blog/",
  "dateCreated": "2015-01-06T01:38:55.147Z",
  "dateUpdated": "2015-01-25T02:28:12.402Z"
}

Now you can see how the WeSTL document has led from design-time mode to runtime instance and finally (via the HAL translator module) to the actual HAL document. The WeSTL transitions appear in the HAL _links section (1) and the related data for this user appears as name–value pairs called properties in HAL documents (starting at 2).

Of course, HAL is just one possible translator implementation. Throughout this book, you’ll find general message translators for a handful of formats. Hopefully, this short overview will give enough guidance to anyone who wishes to implement their own (possibly better) general representors for HAL and many other registered formats.

Summary

This chapter has been a bit of a diversion. I focused on the server-side representor even though the primary aim of this book is to explore client-side hypermedia. But the representor pattern is an important implementation approach and it will appear many times in the code examples throughout the book. We’ve built up a working example of a representor by taking lessons from Parnas’s “responsibility assignment” approach to modularity, the content negotiation features of HTTP, and Gregor Hohpe’s message translator pattern.

Not too shabby for a diversion.

References

  1. Standard Generalized Markup Language (SGML) is documented in ISO 8879:1986. It is, however, based on IBM’s GML format from the 1960s.

  2. The “Simple Object Access Protocol (SOAP) 1.1” specification was published as a W3C Note in May 2000. In the W3C publication model, Notes have no standing as a recommendation of the W3C. It wasn’t until the W3C published the SOAP 1.2 Recommendation in 2003 that SOAP, technically, was a “standard.”

  3. Crockford’s 50-minute talk “The JSON Saga” was described as “The True Story of JSON.” An unofficial transcript of this talk is available online.

  4. CSV was first specified by the IETF in RFC4180 “Common Format and MIME Type for Comma-Separated Values (CSV) Files” in 2005. This was later updated by RFC7111 (to add support for URI fragments) and additional CSV-related efforts have focused on supporting additional semantics.

  5. You can learn about the binary formats mentioned in this chapter by visiting Fast Infoset, Avro, Thrift, and ProtoBuf.

  6. The “Atom Syndication Format” (RFC4287) and the “Atom Publishing Protocol” (RFC5023) form a unique pair of specifications that outline both the document format and read/write semantics in different RFCs. There are also a handful of RFCs defining Atom format extensions.

  7. The YAGNI maxim is described in Ron Jeffries’s blog post.

  8. Mike Kelly’s Hypertext Application Language (HAL) has proven to be one of the more popular of the hypermedia type formats (as of this writing).

  9. RFC6570 specifies URI Templates.

  10. The Collection+JSON format was registered with the IANA in 2011 and is “a JSON-based read/write hypermedia type designed to support management and querying of simple collections.”

  11. Both Siren and Zetta are projects spearheaded by Kevin Swiber.

  12. As of this writing, the Universal Basis for Exchanging Representations (UBER) is in stable draft stage and has not been registered with any standards body.

  13. Beck’s 1997 article “The Past and Future of Cognitive Therapy” describes his early experiences that led to what is now known as cognitive reframing.

  14. Parnas’s “On the Criteria to be Used in Decomposing Systems into Modules” is a very short (and excellent) article written for the ACM in 1972.

  15. Details on HTTP content negotiation are covered in Section 3.4 of RFC7231. One of a series of HTTP-related RFCs (7230 through 7240).

  16. The full title of the “Gang of Four” book is Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison-Wesley Professional).

  17. A good source for learning more about Christopher Alexander and his work can be found at the Pattern Language website.

  18. Gregor Hohpe covers message translator in his book Enterprise Integration Patterns (Addison-Wesley Professional).

Image credits

  • Diogo Lucas, Figures 3-1 and 3-2
