Chapter 10. Deploying and Releasing from the Pipeline

With firm foundations in automating builds and continuously integrating code, we can now focus on the delivery of valuable software to various environments, including production. A key lesson that you will learn is that today’s business requirements and popular software architecture practices strongly encourage you to separate the processes of deployment (a technical activity, as you will see) and release (a business activity); in fact, we will talk about deploying an application, but releasing a feature. This has ramifications for the way you design, test, and continuously deliver software.

One aspect of software development that becomes critical after you start to consider different environments is configuration. The need to track different configuration values depending on the environment that you are using (development, test, pre-production, production...) isn’t new, but tracking all these has become much harder with the advent of cloud-based platforms, since you may not know a priori where your application might be running. On top of this, a continuous delivery process might require frequent changes in configuration, meaning configuration management has to be at the heart of your deployment and release strategies.

Deploying, releasing, and managing configuration are some of the most challenging aspects of continuous delivery, and there is a lot of ground to cover. To help you go through all of it, we have created the Extended Java Shop application, which will demonstrate many of the concepts outlined in this chapter.

Introducing the Extended Java Shop Application

The example application presented in “Introducing the “Docker Java Shop” Sample App” was used to demonstrate how to work locally using Docker containers and Kubernetes. In this chapter and in Chapter 11, we will use an extended version of this application, called the Extended Java Shop, together with a small external library known as java-utils. The Extended Java Shop also includes three prebuilt Jenkins pipelines that demonstrate how the different phases of testing are linked together, and how deployments can be made to different platforms.

The particulars of each part of the Extended Java Shop and its supporting libraries are explained in detail in the relevant sections that follow, but a general overview is illustrated in Figure 10-1. Note that the purpose of this sample application is to demonstrate concepts related to deploying, releasing, testing, and managing configuration, but this is not necessarily a production-ready application. Shortcuts have been taken for simplicity; wherever possible, these shortcuts will be highlighted, indicating how a production-ready application would be constructed.

Figure 10-1. Architecture of the Extended Java Shop

The repository for the Extended Java Shop is a monorepo that includes the following:

Owned services
  • Shopfront: The website that the user visits. This serves the same function as in the Docker Java Shop, except that here it also communicates with a Feature Flags service and an Adaptive Pricing service.

  • Product Catalogue: Holds information about each different product. It holds data in an in-memory database (in a real case scenario, the database would be real). Similar to its counterpart in the Docker Java Shop.

  • Stock Manager: Holds available amounts for each product. It holds data in an in-memory database (in a real case scenario, the database would be real). Similar to its counterpart in the Docker Java Shop.

  • Feature Flags: Holds information about the different feature flags and their activation levels. It stores its data in a real PostgreSQL database. New with regards to the Docker Java Shop.

“Third-party” services
  • Adaptive Pricing: This is meant to represent a service provided by a third-party entity that the Shopfront service communicates with; in a real scenario, it wouldn’t be part of our repository, nor would we have control over it. New with regards to the Docker Java Shop.

  • Fake Adaptive Pricing: This is a “fake” Adaptive Pricing service, the sort a team would create to be able to test integration with a third party. This therefore would be part of our repository, and we’d have control over it. New with regards to the Docker Java Shop.

Databases
  • Feature Flags DB: The production database used by the Feature Flags service. This is a PostgreSQL database running in a Docker container, although in a real-case scenario, the database wouldn’t run in a container. New with regards to the Docker Java Shop.

  • Test Feature Flags DB: The test database used by the Feature Flags service, also a PostgreSQL database running in a Docker container, but using different credentials. New with regards to the Docker Java Shop.

Acceptance tests

A set of tests that puts all the owned services together and verifies that they work correctly. More on this in Chapter 11. New with regards to the Docker Java Shop.

Pipelines
  • Jenkins Base: A prebuilt pipeline that automatically builds all the services and databases mentioned previously upon code changes, and then runs acceptance tests where needed. It includes a dummy deployment job (it doesn’t really deploy anywhere). New with regards to the Docker Java Shop.

  • Jenkins Kubernetes: An extension of Jenkins Base, where the deployment job has been overwritten to deploy services to a Kubernetes cluster. New with regards to the Docker Java Shop.

  • Jenkins AWS ECS: A different extension of Jenkins Base, where the deployment job has been overwritten to deploy services to Amazon’s Elastic Container Service. New with regards to the Docker Java Shop.

Exploring Deploy and Release for Serverless and IaaS

Because of the static nature and size limitations of a print book, this chapter focuses exclusively on the deployment and release of applications by using the technologies that appear to be the most popular at the time of writing: Docker, Kubernetes, and AWS ECS. The accompanying GitHub repository contains more practical examples using serverless and IaaS technologies that have been talked about in previous chapters.

Separating Deployment and Release

Deployment and release are concepts that many people use as synonyms. However, in the context of continuous delivery, they mean different things:

Deployment

A technical term that refers to the act of making a new binary package of a service available in production

Release

A business term that refers to the act of making a particular functionality available to users

Deployments and releases many times happen at the same time—typically, when a new piece of functionality is implemented and then released into production as part of a deployment—but you can have one without the other. For instance, when a developer refactors a particular section of code without altering its functionality, and deploys this new version of code to production, we have a deployment without a release. You could also add new functionality but hide it away under a feature flag (see “Feature Flags”), in which case you also have a deployment without a release. On the other hand, if a feature is hidden behind a feature flag, and feature flags can be modified without deploying, then by altering the feature flag, you could be releasing a new functionality without the need of a deployment.
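To make the last scenario concrete, releasing in that case could be as simple as an API call to the Feature Flags service (which in the Extended Java Shop listens on port 8040). The endpoint and payload below are purely hypothetical, sketched only to illustrate the idea; the real API may look quite different:

# Hypothetical call: release a feature to all users by updating its flag,
# with no deployment involved. Endpoint and payload are illustrative only.
curl -X PUT http://featureflags:8040/flags/new-pricing \
     -H "Content-Type: application/json" \
     -d '{"portionIn": 100}'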

Understanding that deploying and releasing are two related but independent activities is crucial in order to create an environment for continuous delivery: it gives the development team the freedom to deploy new versions of the software as they need to, while it allows the product owners to keep full control over the features that are available to users. Because they are indeed different activities, they require different tools and techniques to make them happen in an effective manner; this is the focus of the next sections.

Deploying Applications

Although some aspects of the deployment will differ, depending on the platform that you are deploying services to, some concerns will be the same, regardless of said platform. In general, the deployment of an application will be influenced by the following activities:

Creating the releasable artifact

This is the binary file that will contain your application code (and potentially configuration—more on this in “Managing Configuration and Secrets”), and that will be sent to the machines where the application will be running. The releasable artifact can take many shapes (fat JAR, WAR, EAR, Buildpack, Docker container image, etc.), and the most suitable one will depend on the platform where your application is being deployed. In this book, we have decided to focus on Docker container images because of their flexibility and popularity, but you are free to try out others if you so wish.

Automating deployments

Years ago, when deployments happened only once a month, you could have a person manually copying your applications to the target server and restarting the application. With continuous delivery, you are potentially deploying a dozen times a day, and this is no longer practical.

Setting up health checks

An important drawback of the microservices architecture and cloud platforms is the increase in the number of moving parts. And more moving parts means a higher probability that something will go wrong. Your applications will fail, and you need to be able to detect it and fix it.

Choosing a deployment strategy

Whenever you need to make a new version of your application available to the public, there is the conundrum of how you are going to coordinate that with the removal of the existing version, especially if you run multiple parallel instances for reliability. A trade-off between complexity and functionality will need to be struck.

Implementing your deployment strategy

The latest cloud platforms can manage a cluster of machines and the deployment of your applications across them transparently, meaning you have to worry only about picking the strategy. However, if you are not on one of these platforms, you will have to implement the strategy yourself.

Working with databases

Continuous delivery affects everything, even databases, and changes will have to be brought about there, too. The process cannot be stopped for anything, so schema changes and data migrations will have to be executed from the pipeline as well.

Daunting as it may sound, managing the preceding factors is the key to unlocking the benefits of continuous delivery. Each activity has its own complexities and decision points; during the rest of this section, we will go through them and give you everything you need to set up your pipeline.

Creating a Container Image

Although packaging applications in a Docker image is certainly not the only way to deploy services into production, it is among the most popular ones. It is therefore useful to know how to create and publish a Docker image for your applications as part of the build pipeline, because then you will be able to use that image to deploy services. The image creation process is no different from the one outlined in “Creating Container Images with Docker”, although in the case of the build pipeline, you also have the option to wrap the command-line orders into a plugin.

This is the option that has been taken in the sample pipelines available in the Extended Java Shop repository, more particularly in the jenkins-base, jenkins-kubernetes, and jenkins-aws-ecs folders. (See the relevant README.md files to execute these examples locally.) The creation and publication of the container image can then be performed with the following steps.

Installing the plugin

In this case, we are making use of the CloudBees Docker Build and Publish plugin, which you can install either using the graphical interface (from the main page, choose Manage Jenkins and then Manage Plugins), or using the command line while logged into the Jenkins server:

/usr/local/bin/install-plugins.sh docker-build-publish

You need to restart Jenkins after this to activate the plugin.

Creating the DockerHub credentials

Unless you are publishing your Docker container images to a private repository that doesn’t require authentication, you need to provide Jenkins with some kind of authentication mechanism when pushing each new container image. In fact, the CloudBees Docker Build and Publish plugin uses Docker Hub by default, so you need a Docker Hub account to proceed. Here are basic steps to do this in Jenkins:

  1. Go to Credentials and then Add Credentials, as shown in Figure 10-2.

    Figure 10-2. Creating new credentials in Jenkins
  2. Select “Username with password” from the Kind drop-down menu.

  3. Leave the scope as Global or restrict it to something more specifically aligned to your own security policies.

  4. Give your credentials a meaningful name in the ID field, like DockerHub, and optionally a description.

  5. Enter the username used to publish images.

  6. Enter the password and then click OK. Figure 10-3 shows the selection of these options.

    Figure 10-3. Adding new Docker Hub credential to Jenkins

Building and publishing

Finally, you can now create a step within your job definition to build the Docker container image and publish it into Docker Hub (or any other registry):

  1. When adding the new step, select the type Docker Build and Publish to open the window shown in Figure 10-4.

  2. Indicate the name that you want to give to the published image; this is equivalent to using the -t option in the command line when building the Docker image.

  3. Indicate the credentials to use when pushing the image to Docker Hub.

    Figure 10-4. Creating a new build step to build and publish a Docker image
  4. If the necessary Dockerfile isn’t in the root folder of your repository, indicate the folder where it is to be found, as shown in Figure 10-5.

    Figure 10-5. Indicating the folder where the Dockerfile resides

With this, your build pipeline can now create Docker images for your applications in an automated manner, and you are ready to deploy them to your platform of choice. 
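If you are curious about what the plugin actually automates, the equivalent command-line steps are roughly the following (the tag is illustrative, and the credentials are assumed to be available as environment variables):

# Roughly the commands the plugin wraps: build the image, authenticate
# against Docker Hub, and push the result.
docker build -t quiram/featureflags:1.0 .
echo "${DOCKERHUB_PASSWORD}" | docker login -u "${DOCKERHUB_USER}" --password-stdin
docker push quiram/featureflags:1.0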

Deployment Mechanisms

The first point to cover is the mechanism by which the service makes its way into production—in other words, the mechanism by which you communicate to the relevant platform and indicate that a new version is available. Once you know this, you can then configure your CD pipeline to do this automatically for you.

Most platforms out there, including Kubernetes, Amazon, and Cloud Foundry, have a RESTful API as their main means of communication that you can use to manage your deployments. However, this is not what most people use. Alongside the RESTful API, these platforms also provide the following tools, built on top of it, that make interaction easier:

A graphical interface

Usually in the form of a website, this is where deployments can be performed and managed. This is useful to get comfortable with the way a new platform works or to check the status of the platform at a quick glance, but not for deployment automation.

A command-line interface

Frequently including Bash completion, this interface acts as a wrapper for the RESTful API. The advantage of the CLI is that commands can easily be scripted, making it suitable for build automation.

On top of this, some platform providers, or sometimes even third parties, can develop plugins for different build automation tools that leverage either the RESTful API or the CLI to provide common use cases in a pipeline-friendly manner.
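As a rough illustration of how the CLI relates to the underlying API, the following two commands retrieve the same information from a Kubernetes cluster, first through the CLI and then by calling the RESTful API directly (the certificate files and master address are placeholders for whatever your cluster administrator provides):

# CLI: list the pods in the default namespace of the current context.
kubectl get pods

# RESTful API: the same query, addressed directly to the Kubernetes master.
curl --cacert ca.crt --cert client.crt --key client.key \
     https://${KUBERNETES_MASTER_IP}:8443/api/v1/namespaces/default/pods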

The plugin example: Kubernetes

As explained in Chapter 4, Kubernetes is an orchestrator platform for deploying immutable services encapsulated in containers. In Chapter 8, we already indicated how Docker and Kubernetes could be used to allow developers to work locally, mimicking a production environment. Now we will indicate how this deployment can be automated to deploy to the production Kubernetes cluster by using a plugin in Jenkins. A fully functional example is available at the Extended Java Shop repository—more precisely, at the jenkins-kubernetes folder. You can follow the instructions at the README.md file in that folder to run the example locally. (Incidentally, this example will also make use of a locally running minikube instance, but the process will be no different from a real-life Kubernetes cluster.)

Installing the plugin

First, you need to install the Kubernetes CD Jenkins plugin. Once again, this can be done using the graphical interface in Jenkins (from the main page, choose Manage Jenkins and then Manage Plugins) or via the command line at the Jenkins server:

/usr/local/bin/install-plugins.sh kubernetes-cd

You need to restart Jenkins after this to activate the plugin.

Kubernetes Plugin Versus Kubernetes CD Plugin

The Jenkins plugin repository includes two similarly named but fundamentally different plugins: Kubernetes and Kubernetes CD. The former allows you to create additional Jenkins nodes in an existing Kubernetes cluster to assist with build execution, while the latter allows you to deploy your applications into a Kubernetes cluster. We are referring to the latter here.

Preparing the configuration files

Before diving into details, it’s necessary to clarify some extra Kubernetes concepts:

Cluster

A cluster is a particular collection of nodes, driven by a single master, where Kubernetes can choose to deploy containers. You can have multiple clusters, for instance, if you want to completely separate test and production.

User

The Kubernetes cluster will expect some form of authentication to make sure that the requested operation (deploy, undeploy, rescale, etc.) is allowed for the specific actor; this is managed through users.

Namespace

A named subsection of the cluster, almost like a “virtual cluster.” Users can have different permissions to execute operations, depending on the namespace. It’s an optional parameter, and if unspecified, the default namespace will be used.

Context

A particular combination of cluster, user, and namespace.

Kubeconfig

A file indicating the cluster(s) that are available, the namespaces within them, the user(s) that can access them, and the known combinations of access, named contexts. The Kubeconfig file can also have information regarding how the user is to be authenticated.

With this, and assuming the creation and publication of Docker images is already set up as indicated in the previous section, you can configure an automatic step to deploy to a Kubernetes cluster by following these steps:

  1. Ask the Kubernetes administrator to create a user specific for deployment. Users can be authenticated by providing username and password, or by means of certificates; for the deployment user, the second option will be preferred to facilitate automatic deployment.

  2. Obtain the certificate and key files from the Kubernetes administrator for the deployment user; these will typically be ca.crt, client.crt, and client.key.

  3. Copy these files into the Jenkins server and place them in a recognizable location; for instance, /var/jenkins_home/kubernetes/secrets.

  4. Prepare the kubeconfig file; the details may depend upon your particular installation, but the simplest form can be as follows:

    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority: %PATH_TO_SECRETS%/ca.crt
        server: https://%KUBERNETES_MASTER_IP%:8443
      name: %CLUSTER_NAME%
    contexts:
    - context:
        cluster: %CLUSTER_NAME%
        user: %DEPLOYMENT_USER%
      name: %CONTEXT_NAME%
    current-context: %CONTEXT_NAME%
    kind: Config
    preferences: {}
    users:
    - name: %DEPLOYMENT_USER%
      user:
        client-certificate: %PATH_TO_SECRETS%/client.crt
        client-key: %PATH_TO_SECRETS%/client.key
  5. Copy the kubeconfig file into the Jenkins server and place it in a recognizable location; for instance, /var/jenkins_home/kubernetes.
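Before wiring the file into Jenkins, it is worth checking from the Jenkins server that the kubeconfig actually grants access to the cluster. A quick smoke test, assuming the file was saved as /var/jenkins_home/kubernetes/kubeconfig, could be:

# If the kubeconfig and certificates are correct, this lists the nodes
# of the cluster without prompting for any further credentials.
kubectl --kubeconfig=/var/jenkins_home/kubernetes/kubeconfig get nodes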

Registering Kubernetes credentials

Now that all the relevant configuration is available in the Jenkins server, it is important to let Jenkins know how to use it.  For this, you will create a Kubernetes Credentials record, similar to the DockerHub credentials that you created previously:

  1. Go to Credentials and then Add Credentials.

  2. Select Kubernetes Configuration (Kubeconfig) from the drop-down menu.

  3. Leave the scope as Global or restrict it to something more specifically aligned to your own security policies.

  4. Give your credentials a meaningful name, like kubernetes, and optionally a description.

  5. Indicate that the kubeconfig is in a file on the Jenkins master, and indicate the path where you previously saved the file; then click OK. Figure 10-6 illustrates these settings.

    Figure 10-6. Adding new Kubernetes credentials to Jenkins

Creating service definitions

You will need to create service definitions for all your services, much in the same way that you did in “Deploying into Kubernetes” to deploy to the local Kubernetes; in fact, you can probably add those files to your version-control system and reuse them. As an example, the service definition file for the Feature Flags service in the Extended Java Shop is replicated in Example 10-1; further examples can be found in jenkins-kubernetes/service-definitions.

Example 10-1. Kubernetes service definition sample for Feature Flags service in Extended Java Shop
---
apiVersion: v1
kind: Service
metadata:
  name: featureflags
  labels:
    app: featureflags
spec:
  type: NodePort
  selector:
    app: featureflags
  ports:
  - protocol: TCP
    port: 8040
    name: http

---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: featureflags
  labels:
    app: featureflags
spec:
  replicas: 1
  selector:
    matchLabels:
      app: featureflags
  template:
    metadata:
      labels:
        app: featureflags
    spec:
      containers:
      - name: featureflags
        image: quiram/featureflags
        ports:
        - containerPort: 8040
        livenessProbe:
          httpGet:
            path: /health
            port: 8040
          initialDelaySeconds: 30
          timeoutSeconds: 1

Creating the deployment job

Finally, your Jenkins server is ready to configure the deployment job. For this, you can create a new Freestyle element as indicated in “Jenkins”, configure the repository location, and add a build step of type Deploy to Kubernetes, as shown in Figure 10-7.

Figure 10-7. Adding a build step to deploy to Kubernetes

The configuration of the Deploy to Kubernetes step will need references to only two of the elements previously constructed: the Kubeconfig, for which you can select the Kubeconfig Credentials that was created previously (“kubernetes” in Figure 10-8), and the path to the service definition within the repository. You can now save this job, and you will have an automated way to deploy your services to Kubernetes.

Figure 10-8. Configuring the build step to deploy to Kubernetes

You can then repeat this step for each of the services, or you can tweak the existing one to take a parameter that indicates the service to deploy; this is the course that was chosen in the example available at the jenkins-kubernetes folder.
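Incidentally, the plugin is not the only way to perform this step: the same deployment can be scripted with kubectl, which also makes parameterizing by service name trivial. A minimal sketch, assuming the kubeconfig location used earlier and that the service definition files follow a <service>-service.yaml naming convention:

# Apply the service definition for whichever service the job parameter names.
# SERVICE_NAME is assumed to be supplied by the parameterized Jenkins job.
kubectl --kubeconfig=/var/jenkins_home/kubernetes/kubeconfig \
    apply -f "jenkins-kubernetes/service-definitions/${SERVICE_NAME}-service.yaml"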

The CLI example: Amazon ECS

One of the advantages of using plugins in your build pipeline of choice is that they are usually nicer and more user-friendly to operate. On the other hand, one of the disadvantages is the lack of portability: if you ever want to change to a different automated build platform, you'll probably have to start from scratch. This is one of the reasons that you may choose to leave plugins aside and use command-line tools instead.

This is the option that has been chosen in the second of our examples: deploying to Amazon Elastic Container Service (ECS). Amazon ECS is in many ways similar to Kubernetes, in the sense that Amazon provides a managed cluster of computers and orchestrates the deployment of containers across them; this way, you just need to provide Amazon ECS with the Docker image information, and the platform will do the rest. The main difference is that in Kubernetes the nodes in the cluster can be either physical or virtual machines, whereas in Amazon ECS the computers must be Amazon EC2 instances, keeping everything within the Amazon ecosystem. This means that adding EC2 instances to, or removing them from, an Amazon ECS cluster is a streamlined operation, although it adds the risk of vendor lock-in.

Combining Kubernetes and Amazon EC2

You can build a Kubernetes cluster by using Amazon EC2 instances as nodes, which might be a good middle step if you are already using one of these platforms and considering switching to the other one.

The setup and management of an Amazon ECS cluster is beyond the scope of this book (just as setting up and managing a Kubernetes cluster also is), so in this section we will assume the Amazon ECS cluster is already available and we will just focus on how to deploy services to it. The fully functional example available at the Extended Java Shop repository (more precisely, at the jenkins-aws-ecs folder) does include scripts to create and configure a minimal Amazon ECS cluster; readers can follow the instructions at the README.md to run the example locally and check the relevant scripts to know more.

Going Serverless with ECS

Although we have been talking about using ECS on top of EC2 instances, there is also the option to go serverless with ECS Fargate. This option eliminates the need to manage a cluster, choose computer instance characteristics, and so on: the platform will manage all of this, and you'll just need to provide the container images. There is a catch, though: as of the time of writing this book, ECS Fargate is still being rolled out and is available in only a handful of regions (Northern Virginia, Ohio, Oregon, and Ireland). Depending on when you are reading this, ECS Fargate may be an option to explore.

Installing and configuring the CLI

Installing the AWS CLI is relatively straightforward, and can be done either by getting the packages from the official AWS CLI installation page, or by using a package manager of choice (yum, apt-get, homebrew, etc.). Configuring it requires a couple more steps, which are explained in detail on the AWS CLI configuration page but essentially boil down to the following:

  1. Create an AWS user to execute deployments.

  2. Obtain an AWS Access Key and an AWS Secret Access Key for that user.

  3. Log into your build server (Jenkins, for instance).

  4. Execute aws configure and provide the previous details as requested.

After the preceding steps have been performed, every aws command that is introduced at the command line will run against your AWS environment.
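On a build server you may prefer the non-interactive form of the same configuration, which is easier to script; a minimal sketch (the region is just an example) would be:

# Non-interactive equivalent of `aws configure`, handy on a build server.
aws configure set aws_access_key_id "${AWS_ACCESS_KEY_ID}"
aws configure set aws_secret_access_key "${AWS_SECRET_ACCESS_KEY}"
aws configure set default.region eu-west-1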

Getting the Latest AWS CLI

Many package managers like homebrew or yum include AWS CLI in their repositories, making it easy to install. However, there will be a necessary lag between the latest existing version as provided by AWS and the latest available via these package managers. If you use only relatively established features, this will be fine, but if you need the latest, then you’ll have to install AWS CLI from the official source.
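At the time of writing, one straightforward way to obtain the very latest version is through Python's package manager:

# Installs, or upgrades to, the latest released version of the AWS CLI.
pip install --upgrade awscli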

Amazon ECS concepts

At this point, we need to define some of the terms used in the Amazon ECS environment. These shouldn’t be difficult to grasp, though, as they are quite similar to their Kubernetes counterparts:

Cluster

The collection of all the computers in the cluster and the services that run on them.

Instance

Each of the EC2 computers that have been included into a cluster.

Service

An application that has been deployed to a cluster.

Task

Each of the individual running copies of an application's Docker container. The same service may have multiple identical tasks across instances, typically with a limit of at most one per instance, although tasks of different services can share an instance.

Task definition

The template from which a task is created; this includes details such as the location of the Docker image, but also the amount of memory or CPU that the task is allowed to use, the ports that it needs to expose, etc. Task definitions are referred to by using their family (a name) and their version.

There can be multiple task definitions for the same service, but a running service can be associated with only a single task definition at a time. This way, deploying a new version of an application is done by creating a new task definition that refers to the new version, and then updating the service to point to that new task definition.

Note also that, while services and tasks run in a specific cluster, task definitions are cluster-independent, and can, in fact, be used across clusters. This allows you to have different clusters for test and production and make sure that the task definitions are consistent across environments.
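You can see the family-and-version scheme in action with a couple of read-only commands (the shopfront family matches the task definition shown in Example 10-2; the revision number is illustrative):

# List every registered version of the "shopfront" task definition family.
aws ecs list-task-definitions --family-prefix shopfront

# Inspect one specific revision of that family.
aws ecs describe-task-definition --task-definition shopfront:3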

Creating tasks, deploying services

Now that you know the basic nomenclature of Amazon ECS, you can explore the minimum commands that will allow you to deploy services to a cluster. The AWS CLI reference documentation is the best place to investigate how to go further.

Before you can even create a service, you need to create a task definition. This can be done using the subcommand register-task-definition:

# --family: the family of the task, i.e., the name
# --cli-input-json: the file with the actual task definition
# --region: the region for the task definition; default if omitted
aws ecs register-task-definition \
    --family ${FAMILY} \
    --cli-input-json file://${PATH_TO_JSON_FILE} \
    --region ${REGION}

The file at ${PATH_TO_JSON_FILE} is what contains the actual definition, and may look like the one used for the Shopfront service in the Extended Java Shop application, displayed in Example 10-2.

Example 10-2. Amazon ECS task definition for Shopfront service in Extended Java Shop
{
  "family": "shopfront",
  "containerDefinitions": [
    {
      "image": "quiram/shopfront",
      "name": "shopfront",
      "cpu": 10,
      "memory": 300,
      "essential": true,
      "portMappings": [
        {
          "containerPort": 8010,
          "hostPort": 8010
        }
      ],
      "healthCheck": {
        "command": [ "CMD-SHELL", "curl -f http://localhost:8010/health || exit 1" ],
        "interval": 10,
        "timeout": 2,
        "retries": 3,
        "startPeriod": 30
      }
    }
  ]
}

Once you have created the first task definition, you can then create your service by using the create-service subcommand:

# --service-name: the name for the service
# --desired-count: the desired number of tasks when this service is running
# --task-definition: the family of the task definition to use
# --cluster: the cluster where the service is to be created
# --region: the region where the cluster is; default if omitted
aws ecs create-service \
    --service-name ${SERVICE_NAME} \
    --desired-count 1 \
    --task-definition ${FAMILY} \
    --cluster ${CLUSTER_NAME} \
    --region ${REGION}

Finally, to deploy a new version, you can create a new task definition that points to the new version of the Docker image and then update the running service to use the new version of the task definition with the subcommand update-service; note that, since you can have multiple versions of a task definition, this has to be specified using both family and version:

# --service: the name of the service to update
# --task-definition: the task definition to use, as family:version
# --cluster: the cluster where the service currently runs
# --region: the region where the cluster is; default if omitted
aws ecs update-service \
    --service ${SERVICE_NAME} \
    --task-definition ${FAMILY}:${VERSION} \
    --cluster ${CLUSTER_NAME} \
    --region ${REGION}

Once your scripts are clear, you can create a job in your automated build platform of choice that simply runs them. Ideally, you will also store this script in your version-control system, so you can track changes as your needs evolve. A full example, including some conditional logic to decide when to create and when to update a service, is available at the Extended Java Shop repository, located at jenkins-aws-ecs/deploy-to-aws-ecs.sh.
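The conditional logic itself can be fairly small. The following sketch is not a copy of that script, and the variable names are assumptions, but it captures the essence of the decision:

# Create-or-update decision: if the service is already ACTIVE in the cluster,
# roll it onto the latest task definition; otherwise, create it from scratch.
if aws ecs describe-services --cluster "${CLUSTER_NAME}" --services "${SERVICE_NAME}" \
      --region "${REGION}" | grep -q '"status": "ACTIVE"'; then
    aws ecs update-service --cluster "${CLUSTER_NAME}" --service "${SERVICE_NAME}" \
        --task-definition "${FAMILY}" --region "${REGION}"
else
    aws ecs create-service --cluster "${CLUSTER_NAME}" --service-name "${SERVICE_NAME}" \
        --task-definition "${FAMILY}" --desired-count 1 --region "${REGION}"
fi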

It All Starts (and Ends) with Health Checks

Before continuous delivery became a standardized practice, when organizations still deployed their applications to production manually, there was a clear way to check that everything was working correctly after deployment: a manual check. But now, with automated deployments that happen typically several times a day, and each service being deployed to potentially multiple instances to provide horizontal scalability, checking services manually is not an option.

What’s more, in a world of autoscaling, deployments may be happening at any time without our awareness: a peak in demand might signal to the orchestrating platform the need for additional resources, which the platform may respond to by deploying new copies of a particular service. Once again, you cannot be expected to manually check that these newly deployed services are working as expected.

Finally, there is another reason for an automated means to check that your services are running correctly. Modern microservices architectures provide you with unprecedented levels of flexibility, but at the cost of having to manage additional moving parts. With the increase in the number of components, the probability of a failure occurring anywhere in the system increases until it becomes an inevitability: a hardware failure, a lost communication link, a deadlock in a kernel, etc., are just some events that could bring a node down, and with it, all the services that are running within. You could be losing services at any point, and you need to detect when this happens and repair it.

This all leads to the concept of the health check. A health check is a purpose-built interface in a service, for instance a /health endpoint in a RESTful API, that is used by the service to indicate its internal status. When invoked, this interface may run some quick checks to verify that everything is working fine, and then provide either a positive or negative response. The orchestrating platform can then be configured to regularly consult the health checks of all the different instances of a service and act accordingly:

  • If the service responds with a positive outcome, the instance is healthy.

  • If the service responds with a negative outcome, the instance is unhealthy.

  • If the service doesn’t even respond to the health check, the instance is unhealthy.

One needs to be careful with health checks, though. An instance appearing to be unhealthy at a particular moment in time is not necessarily indicative of an issue; after all, short glitches happen all the time. However, if an instance appears to be unhealthy too many times in a row, or too many times within a time window, the orchestrating platform can then deduce that the instance is faulty, bring it down, and re-create another one somewhere else. This self-healing mechanism adds resiliency to your platform, compensating for the added uncertainty of the continuous redeployments and the increased number of moving parts.

Providing health-check endpoints

Health-check endpoints have become so ubiquitous that multiple tools and frameworks can automatically attach them to your services, so you don’t even need to create one yourself. This is the path that has been taken in the Extended Java Shop sample application, with two available variants.

Most of the web services in the Extended Java Shop are based on Spring Boot, which automatically adds a /health endpoint to any service without further action. This can be noticed in the log as the application starts up, and can be verified by simply contacting /health in the desired service after it has been deployed. For more details, see Example 10-3, an extract (edited and broken into multiple lines for readability) from the log of the Stock Manager service, where you can see that /health and /health.json have been automatically registered.

Example 10-3. Partial extract of the startup log for Stock Manager service
2018-05-02 11:49:37.487  INFO 56166 --- [main] o.s.b.a.e.mvc.EndpointHandlerMapping:
	Mapped "{[/health || /health.json],methods=[GET],
	produces=[application/vnd.spring-boot.actuator.v1+json || application/json]}"
	onto public java.lang.Object org.springframework.boot.actuate.endpoint.mvc.
	HealthMvcEndpoint.invoke(
	javax.servlet.http.HttpServletRequest,java.security.Principal)
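Once the service is up, the endpoint can be exercised with a plain HTTP call; for the Stock Manager service, which listens on port 8030, that looks roughly as follows (the exact response body depends on the Spring Boot version and the health indicators in use):

# A healthy instance answers with HTTP 200 and a small JSON body,
# typically something like {"status":"UP"}.
curl -i http://localhost:8030/health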

A different example is shown in the Product Catalogue service, which is a web service based on Dropwizard, as opposed to Spring Boot. Setting up a health check with Dropwizard requires a couple of steps, but it also comes with greater flexibility. The first step is to create a class that extends HealthCheck, implementing the check() method; this is done in the BasicHealthCheck class, which is shown in Example 10-4. (In this case, the check simply returns the version number of the running application.)

Example 10-4. A basic health check in Dropwizard
public class BasicHealthCheck extends HealthCheck {
    private final String version;

    public BasicHealthCheck(String version) {
        this.version = version;
    }

    @Override
    protected Result check() throws Exception {
        return Result.healthy("Ok with version: " + version);
    }
}

Once the health check has been created, you need to register it in your application. This is shown in the ProductServiceApplication class, with the specific line required for registration copied next for reference:

final BasicHealthCheck healthCheck = new BasicHealthCheck(config.getVersion());
environment.healthChecks().register("healthCheck", healthCheck);

The advantage of this approach is that multiple health checks can be created and registered, all of them consulted when calling /healthcheck.

Dropwizard Exposes Health Checks at a Different Port

Dropwizard differentiates between normal user traffic and what it considers admin traffic; health checks are registered as admin traffic. Endpoints for admin don't listen on the default port, but on what is called the admin port. By default, the admin port is the normal port +1 (e.g., if the application is listening on 8020, the admin port will by default be 8021), but this can be overridden. See product-catalogue.yml for an example of how to define your own admin port.
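Putting the two previous points together, and using the example ports above (application on 8020, admin port defaulting to 8021), the registered health checks of a Dropwizard service can be consulted like this:

# Dropwizard serves health checks on the admin port, not the application port.
curl http://localhost:8021/healthcheck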

Needless to say, one doesn’t need to be constrained to whatever framework is in use, and can choose to build health-check endpoints manually. At the end of the day, the health check is just an ordinary endpoint, which can be created just like any other endpoint in the application.

Consulting health-check endpoints

Creating health-check endpoints in your services is one side of the coin, and configuring the orchestrating platform to use them is the other one. This, again, is pretty straightforward with most orchestrating platforms.

Let’s check our two examples: Kubernetes and Amazon ECS. With Kubernetes, the health check is configured in the service definition itself. In fact, keen readers might have noticed the following section in Example 10-1:

livenessProbe:
  httpGet:
    path: /health
    port: 8040
  initialDelaySeconds: 30
  timeoutSeconds: 1

The parameters indicated here, together with others that have been omitted and for which default values are being used, tell Kubernetes the strategy to follow when checking the health of our service instances. Let’s check these values in detail:

initialDelaySeconds

It is understood that services need time to become fully operational after they’ve been deployed. This value represents how long Kubernetes waits before it starts checking the health of a service instance.

timeoutSeconds

The maximum time we expect the health check to take.

periodSeconds

The time between two consecutive health checks (the default value is 10).

failureThreshold

The number of consecutive times a health check has to fail for Kubernetes to give up on the service instance and restart it (the default value is 3).

Configuring Amazon ECS to use health checks is similar. Let’s look at the following extract from the task definition for the Stock Manager service (located at /jenkins-aws-ecs/task-definitions/stockmanager-task.json):

"healthCheck": {
  "command": [ "CMD-SHELL", "curl -f http://localhost:8030/health || exit 1" ],
  "interval": 10,
  "timeout": 2,
  "retries": 3,
  "startPeriod": 30
}

Amazon ECS leverages the HEALTHCHECK command in Docker to verify whether the service instance is healthy (which is why the command parameter follows that particular syntax), but other than that, the parameters are analogous:

interval

The time between two consecutive health checks

timeout

The maximum time a health check is expected to take

retries

The number of consecutive times a health check has to fail for ECS to consider the task unhealthy and restart it

startPeriod

The wait time after the task has been deployed before it starts performing health checks

What these examples show is that, regardless of the technology in place, creating and configuring health checks is an easy and powerful task. Every orchestrating platform will offer the ability to perform health checks in one form or another and follow similar parameters, which means that you should always be able to count on them.

Keep Your Health Checks as Simple as Possible

Ideally, your health checks will have a simple, hardcoded answer. After all, they are just a way to check whether your service is fundamentally operational. You may consider doing some mild checks, but be careful not to overdo it.

Above all, never make a health-check endpoint call another health-check endpoint in another service; the health check applies to your service only, not to dependencies. Your infrastructure will be calling your health-check endpoints regularly; if the implementation of them calls for more health checks, you can have significant traffic dedicated to a “health-check storm.” Moreover, if you ever happen to have a cyclical dependency among your services (which is not so uncommon), your health checks might enter into an infinite loop that can bring your entire system down, and you’ll have to suffer the irony of having an unhealthy system because of badly designed health checks. 

Deployment Strategies

Now that you know how to deploy services to production, and how to verify that those services are working as expected (or restarting them if they are not), it is time to decide how you coordinate the removal of an old version of a service and its replacement with a new version.

This is another of those concerns that didn’t exist before continuous delivery. At the time when services were deployed manually, organizations chose a time when the application was meant to have the least active users, typically during the weekend or overnight, and informed everyone with sufficient time that the system would be unavailable due to maintenance during a particular time window. During this window, the operations team would bring down the old version, deploy the new one, and check that everything was working correctly.

Again, now that you are continuously deploying new versions into production, you cannot simply assume that a deployment will imply downtime, since this will leave you with a system that has one section or another down at almost any given time. You need to come up with new deployment strategies that take into account how much impact to the system you can tolerate during a deployment, and how many resources you are willing to dedicate to keep that impact at bay.

This section introduces six strategies to accomplish this in different ways. The philosophy behind each strategy is hinted at by its name: single target, all-at-once, minimum in-service, rolling, blue/green, and canary. However, in order to describe them and compare them with ease, a common set of terms will be introduced first:

Desired number of instances

This is the number of service replicas that are expected to be running whenever the service is fully operational. If you take this number to be n, this means that in any deployment, you will go from having n instances of an old version of a service, to having n instances of the new version of that service. We will refer to this simply as desired.

Minimum number of healthy instances

As old service instances are taken down and new ones brought up, you may want to state that there is always a minimum number of them, either old or new, in a healthy state. This can be done to ensure a minimum level of service. We will refer to this as simply minimum, and it can usually be expressed as either a percentage of the total or an absolute number, depending on the platform.

Maximum number of instances

Sometimes you may want to start the new service instances before taking out the old ones so you can limit the gap in the service. This implies a higher utilization of resources. By setting a hard limit on the maximum number of instances, you also set a maximum on the utilization of resources during a deployment. We will refer to this as simply maximum, and, again depending on the platform, it can also be expressed as either a percentage, indicating how many additional instances are allowed (e.g., if maximum was set to 100%, that would mean that you allow the number of instances to double during deployment), or an absolute number (this would indicate how many extra instances the platform is allowed to create).

Graphical representation

For each strategy, we will show a diagram depicting the succession of events while the deployment is taking place. Light squares will represent old versions of the service instances, while dark ones will represent new ones; striped squares indicate new version instances that are in the process of starting up, and therefore not available yet. Each row will represent a snapshot at a particular point in time, with older snapshots at the top and newer ones at the bottom.

Most orchestrating platforms provide some means to indicate these values in one way or another. In the case of Kubernetes, you can add an additional strategy section to the service definition, where replicas indicates desired, maxUnavailable indicates the opposite of minimum (minimum = 100% - maxUnavailable), and maxSurge indicates maximum:

spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  [...]

In the case of Amazon ECS, you can specify this whenever you create or update a service that is based on a task definition:

# Creation (other parameters omitted)
aws ecs create-service \
    --desired-count 5 \
    --deployment-configuration \
        'maximumPercent=25,minimumHealthyPercent=25'

# Update (other parameters omitted)
aws ecs update-service \
    --desired-count 5 \
    --deployment-configuration \
        'maximumPercent=25,minimumHealthyPercent=25'

Regardless of your platform, once you know how to set up the ergonomics of your deployment, you can then explore the different strategies and pick the one that suits you best. Let’s go through all of them.

Single target deployment

This is the simplest of the strategies and the one that requires the fewest resources. In this case, you assume the service has only one running instance, and whenever you need to update it, you will have to take it down and then deploy the new one. This implies that there will be a gap in the service, but no additional resources will be needed. The values that define this strategy therefore are as follows:

  • desired: 1

  • minimum: 0%

  • maximum: 0%

This is represented in Figure 10-9 with the following steps:

  • Beginning: One instance of the previous version exists.

  • Step 1: The one instance has been substituted by another of the new version; the service is effectively unavailable while this new instance starts up.

  • End: The new instance is now up and running, available for requests.

Figure 10-9. Single target deployment: the old instance is killed and replaced by a new one

All-at-once deployment

This strategy is similar to the single target deployment, the only difference being that, instead of having a single instance, you may have any fixed number of them. When a deployment is needed, all the current instances are taken down, and once they are down, all the new ones are brought up. As in the previous case, there is no additional need for resources during the upgrade, but there will also be a gap in the service. The parameters for this case are as follows:

  • desired: n

  • minimum: 0%

  • maximum: 0%

This is represented in Figure 10-10 with the following steps:

  • Beginning: Five instances of the previous version exist.

  • Step 1: All five instances are taken down at the same time and substituted with five instances of the new version of the service; the service is effectively unavailable while the new instances start up.

  • End: The new instances are now up and running.

Figure 10-10. All-at-once deployment: all old instances are killed at the same time, and new ones are brought in as the old ones are gone

Minimum in-service deployment

The two previous strategies have one major drawback: they both imply a service gap. You can improve this by tweaking your strategy and ensuring that there is always a minimum number of healthy instances. This way, instead of bringing down all the old instances at once, you bring down only a number of them and create new instances when they are gone. Once the new instances are up and running, you can remove another batch of old instances, substituting them with new ones. You can repeat this process until all old instances have been replaced by new ones.

This process prevents a gap in the service without the need of additional resources, but it does mean that the minimum in-service instances must take an additional hit of traffic to compensate for the fact that there are fewer of them; make sure not to set this limit too low, or the remaining instances may not be able to cope with the load.

For the case represented in Figure 10-11, the parameters are as follows:

  • desired: 5

  • minimum: 40% (or 2, if expressed as absolute number)

  • maximum: 0%

The process can be described as follows:

  • Beginning: Five instances of the previous version exist.

  • Step 1: Since at least two instances have to be operational at all times, only three are taken down and substituted with new instances; the deployment remains in this step while the new instances start up.

  • Step 2: Once the new instances are operational, the remaining two old versions can be taken down and substituted with new versions.

  • End: All the new instances are now operational.

Figure 10-11. Minimum in-service deployment: at least two instances, either new ones or old ones, have to be operational at any given point

Rolling deployment

A rolling deployment can be seen as a different form of minimum in-service deployment: the focus isn’t placed on the minimum number of healthy instances, but on the maximum number of absent instances. The most typical case of a rolling deployment sets this maximum at one, meaning only one instance can be in the process of being updated at any given time. This means that one instance will be brought down, and then a new one brought up; and only when the new one is operational will we continue the process with the next one. In some cases, a rolling deployment may set higher limits, allowing for two, three, or more instances to be in transition at any given time.

As a variant of the minimum in-service deployment, the rolling deployment presents pretty much the same characteristics: it prevents service gaps without the need for additional resources. The main advantage with regards to minimum in-service is that, by limiting the number of absent instances, you limit the extra strain that the remaining instances may have to endure; the main disadvantage is that deployments will take longer and, depending on the number of instances, the startup time, and the rate of redeployment, the platform could end up with a batch of queued redeployments.

For the case of a rolling deployment, the defining parameters are as follows:

  • desired: 5

  • minimum: 80% (or 4, if expressed as an absolute number)

  • maximum: 0%

Note that the rolling deployment is equivalent to a minimum in-service deployment where minimum = desired - 1.

This process is shown step-by-step in Figure 10-12:

  • Beginning: Five instances of the previous version exist.

  • Step 1: One instance is taken down and substituted with a new one; the deployment remains in this step while the new instance starts up.

  • Step 2: After the new instance created in step 1 is operational, another old instance is taken down and substituted with a new one; the deployment remains in this step while the new instance starts up.

  • Steps 3, 4, and 5: The same process is repeated for the remaining old instances.

  • End: All the new instances are now operational.

Figure 10-12. Rolling deployment, where the maximum number of absent instances is one.
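In Kubernetes, for instance, this is the default behavior when you update a Deployment's image, and the rollout can be followed, or backed out of, from the command line. A brief sketch using the deployment from Example 10-1 (the image tag is illustrative):

# Trigger a rolling update by pointing the deployment at a new image tag,
# then watch its progress and, if something looks wrong, undo it.
kubectl set image deployment/featureflags featureflags=quiram/featureflags:1.1
kubectl rollout status deployment/featureflags
kubectl rollout undo deployment/featureflags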

Blue/green deployment

Blue/green deployment is one of the most popular strategies in the realm of microservices deployments, and perhaps for this reason is one whose meaning is not entirely set. The available literature seems to refer to two slightly different strategies under the same name, each solving a slightly different problem.

The first version of the blue/green deployment aims at fixing one of the disadvantages of both minimum in-service and rolling deployments: during the upgrade, there will be fewer healthy instances to share the total load, potentially causing a strain on them. To address this, a blue/green deployment creates the new instances first, and only when these are available does it start to take down the old ones. The number of healthy instances never goes below the desired total, but it does mean that we will need extra resources during the upgrade. Considering the standard parameters, a blue/green deployment described this way would have the following:

  • desired: n

  • minimum: 100%

  • maximum: m% (where 0 < m ≤ 100)

Setting m to a higher value will exacerbate the peak use of resources, but will also shorten the length of the deployment.

The second version of the blue/green deployment goes beyond this, though, and cannot simply be obtained by a combination of the desired/minimum/maximum parameters. There is another disadvantage in the previous strategies: during the deployment, there will be a mix of old and new versions of the application in production. This might be OK in many scenarios, particularly if one is being careful with the release of functionality (see “Releasing Functionality”). However, when this mix is to be avoided, the second version of blue/green deployment adds a twist: no new instances will be available to the users until all of them are ready, and at that moment all old instances will be made immediately unavailable.

The way to achieve this is by manipulating the routing of requests, in addition to the orchestration of services; this is displayed graphically in Figure 10-13 via the following steps:

  • Beginning: A number of instances of the previous version exist (two in this example). The load balancer/router, represented in the graph by a cylinder, is configured to send incoming requests to the old versions of the service.

  • Step 1: New instances are created, aside from the old ones. These instances are not visible to the public, and the load balancer/router still sends all incoming traffic to the old instances. Deployment remains in this step while the new instances start up.

  • Step 2: The new instances have finished starting up and are now operational, but no traffic is being directed to them yet.

  • Step 3: The load balancer/router is reconfigured so all incoming traffic is directed to the new versions of the service. This switch is meant to be almost instantaneous. No new requests are sent to the old versions, although existing ones are allowed to finish.

  • End: After the old instances are no longer useful, they are taken down.

As can be deduced, the second version of the blue/green deployment provides the best user experience, but at the cost of added complexity and high resource utilization.

Figure 10-13. A blue/green deployment: new instances are not visible by the user until all of them are ready; at that moment, the router changes to point to the new instances, making the old ones inaccessible to users
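One common way to implement the router switch in Kubernetes is to run the new instances under an extra label (for example, a version label, which the service definitions in this chapter do not use; it is an assumption made purely for this sketch) and then repoint the Service selector at them:

# Traffic switch: once the "green" deployment is fully up, repoint the
# service selector from the old (blue) pods to the new (green) ones.
kubectl patch service featureflags \
    -p '{"spec":{"selector":{"app":"featureflags","version":"green"}}}'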

Canary deployment

Canary deployment is another case that cannot be achieved by tweaking the combination of desired/minimum/maximum parameters. The idea of the canary deployment is to try out a new version of the service without fully committing to it. This way, instead of completely replacing the old version of an application with the new one, you simply add an instance to the mix with the new version. The load balancer placed on top of the service instances can divert some traffic to the canary instance, and by inspecting logs, metrics, etc., you can understand how this behaves. If the result is satisfactory, you can then perform a full deployment of the new version. A canary deployment is performed in two steps, as shown in Figure 10-14:

  • Beginning: Multiple instances of the previous version exist (four in this example).

  • Step 1: An instance of the new version is created, without removing any of the old ones.

  • End: The new instance is now up and running and can serve requests together with the old ones.

Canary deployments involve a number of challenges when it comes to orchestration. For instance, when the platform is checking the health status of the different instances, it needs to treat the canary one differently from the rest: if the canary instance is unhealthy, it needs to be replaced with another canary instance, but if any of the other instances is unhealthy, it needs to be replaced with a noncanary one. Also, canary instances sometimes need to run for relatively long periods to properly understand the effects of the change under study, during which time you will probably be releasing new versions of the application, triggering redeployments that must update the regular instances while leaving the canary one intact.

Fortunately, the use case for canary deployments is rather slim. If you just want to expose a new functionality to a subset of users, you can do this by using feature flags (see “Feature Flags”). The added value of a canary deployment is testing out deeper changes that cannot be hidden via feature flags, like a change in the logging or metrics framework, an alteration of the garbage-collection parameters, or the trial of a newer version of the JVM.

Figure 10-14. Canary deployment, where a new instance is simply added to the group of current-version instances without replacing any

Which deployment type should I choose?

As has been shown, the different deployment strategies provide solutions to different problems at the cost of additional resources and/or complexity, meaning each of them will be better suited to different scenarios. Teams will have to analyze what their needs are and the investments they are willing to make. Table 10-1 provides a quick look at the different factors that may need to be taken into account to aid in this decision.

Table 10-1. Summary of the characteristics and costs of the deployment strategies
  Strategy: Single target | All-at-once | Minimum in-service | Rolling | Blue/green | Canary
  Overall complexity: Low | Low | Medium | Medium | Medium | High
  Service downtime: Yes | Yes | No | No | No | No
  Mix of old and new versions: No | No | Yes | Yes | No | Yes
  Rollback procedure: Redeploy previous version | Redeploy previous version | Halt rollout, redeploy previous version | Halt rollout, redeploy previous version | Switch traffic back to previous version | Kill canary instance
  Infrastructure support during deployment: Health checks | Health checks | Routing alteration, health checks | Routing alteration, health checks | Routing alteration, health checks | Routing alteration, weighted routing, health checks
  Monitoring requirements: Basic | Basic | Simple | Simple | Simple | Advanced

Beware of Longer Warm-up Times

Java technology has evolved for many years under the paradigm of the big monolithic application, and the performance of the JVM has adapted to it: sections of code that run frequently are detected and compiled to native code from bytecode by the JIT (just-in-time) compiler, object creation and destruction patterns are recognized and the ergonomics of the garbage collector adjusted accordingly, etc. However, this can work only when the application has been running for a certain period of time and the JVM has had the chance to gather statistics and make calculations.

After adopting continuous delivery, you will deploy your applications much more often, and any statistical information that the JVM may have gathered during previous executions will be lost. Upon each new deployment, the JVM will have to start from scratch, which might affect the overall performance behavior of your application.

Moreover, if you have too many instances running in parallel, each of them will receive only a small portion of the total traffic, which means it will take longer for each instance to run enough cycles for the JVM to detect patterns. If you combine this with frequent redeployments, you might have applications that never reach peak performance. If warm-up time is a problem for you, consider using advanced features like CDS (Class Data Sharing) or AOT (ahead-of-time) compilation; Matthew Gilliard has written a good article about it.

Working with Unmanaged Clusters

So far, we have assumed that we are working with a managed cluster—a cloud platform that keeps track of the totality of servers where our applications run, as well as the different instances of our applications within those servers. This is certainly the recommended approach, for it removes all the burden from the team, allowing it to focus on building useful functionality; unfortunately, teams don’t always have this option.

If your cloud platform doesn’t manage the cluster for you, or if you don’t have a cloud platform as such but rather a set of machines, virtual or physical, that you can use for your production environment, then you need to keep track of what application is running where. The mechanism to do this will be different depending on the technology at hand and on the deployment strategy of your choice, but some general guidelines can be applied.

First, you will need a dedicated database to record what is running where. This database will have at least the following tables:

Servers

Indicating the actual machines (virtual or physical) that exist, together with available resources in each of them (memory, CPU share, etc.).

Applications

Details of all the different applications, together with the parameters that indicate their running configuration (number of instances, memory to be allocated to each instance, CPU share per instance, etc.).

Instances

Details of each single running instance, including the application it belongs to, the server where it is deployed, its health status (deploying, running, failed, etc.), and the parameters used when it was deployed, as these might be different from the new parameters for the application (memory, CPU share, etc.).

Accounting for Resources Once

You might be able to specify only the total resources in the Servers table, and deduce the available resources by subtracting the resources used by all the instances in that server from the total. Whether you want to precalculate that value or calculate it on the fly every time is up to you.

Second, you need control over the routing logic. It doesn’t matter which mechanism you use for this, whether you have a configurable load balancer, or edit DNS entries on the fly, or anything else. What matters is that you can decide, whenever an external request comes for your application, the actual instance or instances that the request can be directed to.

Finally, you will need to write your own application to manage deployments. Ideally, this will take the form of one or more command-line scripts, where you just need to indicate the application to be deployed, a pointer to the new version (a version number, for instance), and the new parameters for deployment (if any):

deploy <application-id> <application-version> [deployment-params]

Designing your deployment application this way will allow you to hide the complexity from the CI/CD pipeline and from the development activity itself, while also letting you create more and more sophisticated deployment strategies over time.
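
To make this more concrete, the following is a minimal sketch of what the entry point of such a deployment application might look like in Java; everything here (the DeploymentStrategy interface, the placeholder lambda) is hypothetical and stands in for whatever orchestration logic you actually build:

import java.util.Arrays;

// Hypothetical sketch of the entry point behind
// "deploy <application-id> <application-version> [deployment-params]"
public class Deploy {

    // The strategy encapsulates the orchestration logic (rolling, blue/green, etc.)
    interface DeploymentStrategy {
        void deploy(String applicationId, String applicationVersion, String... params);
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("Usage: deploy <application-id> <application-version> [deployment-params]");
            System.exit(1);
        }
        final String applicationId = args[0];
        final String applicationVersion = args[1];
        final String[] deploymentParams = Arrays.copyOfRange(args, 2, args.length);

        // Plug in whichever strategy you have implemented; shown here as a placeholder
        final DeploymentStrategy strategy = (app, version, params) ->
                System.out.printf("Deploying %s, version %s%n", app, version);
        strategy.deploy(applicationId, applicationVersion, deploymentParams);
    }
}

Keeping the command-line contract this thin means the pipeline only ever needs to know about the three arguments, while the strategy behind them can evolve freely.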

Deployment Scripts Are Code

Because the deployment scripts aren’t part of the core business functionality that your team is providing, it may be easy to forget that these scripts are also code, and that they need to be treated as such. Make sure changes to deployment scripts are appropriately recorded in your version control system, apply the same coding standards (such as pair programming or peer review), prioritize changes through your backlog, and, above all, test your changes! Consider having an entirely separate cloud environment where you can try out new versions of your scripts without impacting production.

A generic strategy

Regardless of your deployment strategy, a deployment will be composed of a combination of three types of action, each potentially executed more than once and not necessarily in this order:

  • Deploying new instances

  • Bringing down old instances

  • Rerouting

For instance, the “Rolling deployment” strategy will imply a succession of the following:

  1. Update routing to exclude one of the existing instances.

  2. Bring down that instance.

  3. Deploy the new instance.

  4. Update routing to include the new instance.

  5. Repeat for all other existing instances.

Similarly, the “Blue/green deployment” strategy will imply the following:

  1. Deploy all the new instances.

  2. Update routing to point to the new instances.

  3. Bring down all the old instances.

By breaking down the deployment logic into these three types of action, you can focus on each of them separately, knowing that you just need to combine them in different ways to achieve the different deployment strategies. Let’s now cover the three actions in detail.
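
Before we do, here is a minimal sketch of how those combinations might look in code; the Instance type and the three abstract actions are placeholders for whatever your deployment application and routing layer actually provide:

import java.util.ArrayList;
import java.util.List;

// Sketch: deployment strategies expressed as combinations of the same three actions
public abstract class DeploymentActions {

    public interface Instance {}

    protected abstract Instance deployNewInstance(String applicationId, String version);
    protected abstract void bringDownInstance(Instance instance);
    protected abstract void reroute(List<Instance> targets); // atomically points the router at these instances

    // Rolling deployment: replace existing instances one at a time
    public void rollingDeploy(String applicationId, String version, List<Instance> existing) {
        final List<Instance> active = new ArrayList<>(existing);
        for (Instance old : existing) {
            active.remove(old);
            reroute(active);                                                  // 1. exclude the old instance
            bringDownInstance(old);                                           // 2. bring it down
            final Instance fresh = deployNewInstance(applicationId, version); // 3. deploy the new instance
            active.add(fresh);
            reroute(active);                                                  // 4. include the new instance
        }
    }

    // Blue/green deployment: deploy everything first, switch routing once, then clean up
    public void blueGreenDeploy(String applicationId, String version, List<Instance> existing) {
        final List<Instance> fresh = new ArrayList<>();
        for (int i = 0; i < existing.size(); i++) {
            fresh.add(deployNewInstance(applicationId, version));             // 1. deploy all the new instances
        }
        reroute(fresh);                                                       // 2. switch routing in one go
        existing.forEach(this::bringDownInstance);                            // 3. bring down all the old instances
    }
}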

Deploying a single instance

Deploying a single instance is the act of having a new service up and running at a particular location; it doesn’t concern itself with other instances that may or may not need to be deployed or taken down, and it doesn’t concern itself with making this particular instance available to the world. It’s just a matter of having a new instance up and running and ready to serve requests.

The actual process may be different, depending on your specific needs, but these general steps can work:

  1. Check the resources that are needed for this instance by looking up the relevant application details in the Applications table.

  2. Locate a server with the capacity to run this instance (and ideally one that is not already running another instance of the application) by looking in the Servers table; if none is available, the deployment fails.

  3. Update the entry for the server to account for the resources to be used by this instance.

  4. Create a new record in the Instances table to account for the new instance; mark its health status as Deploying.

  5. Copy the application to the server and start it.

  6. Poll the health endpoint of the new instance regularly until it provides a healthy status; this will let us know that the application is ready to serve requests.

    1. If the health endpoint doesn’t return a healthy status within a configured period of time, abort the deployment. Update the record in the Instances table to Failed (or just delete it) and restore the available resources in the appropriate record of the Servers table.

    2. (Optionally) Retry deploying the instance into a new server up to a maximum number of times.

  7. Update the record in the Instances table to mark the health status as Running.

  8. The deployment is finished.
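
The health-polling part of this process (steps 6 and 7) lends itself well to a small utility; the following is a minimal sketch, where the health URL, timeouts, and the use of a 200 response as the "healthy" signal are assumptions rather than anything prescribed by the Extended Java Shop:

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.time.Duration;
import java.time.Instant;

// Sketch: wait for a newly started instance to report a healthy status
public class HealthPoller {

    // Returns true if the instance became healthy in time (step 7: mark it as Running),
    // false if it didn't (step 6.1: mark it as Failed and free the server's resources)
    public boolean waitUntilHealthy(String healthUrl, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        final Instant deadline = Instant.now().plus(timeout);
        while (Instant.now().isBefore(deadline)) {
            if (isHealthy(healthUrl)) {
                return true;
            }
            Thread.sleep(pollInterval.toMillis());
        }
        return false;
    }

    private boolean isHealthy(String healthUrl) {
        try {
            final HttpURLConnection connection = (HttpURLConnection) new URL(healthUrl).openConnection();
            connection.setConnectTimeout(1_000);
            connection.setReadTimeout(1_000);
            return connection.getResponseCode() == 200;
        } catch (IOException e) {
            return false; // not reachable yet, keep polling
        }
    }
}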

Bringing down a single instance

Analogously to the previous section, here we will just cover the generic steps to bring down a particular instance. Again, your particular needs might vary, but this can be a good base:

  1. Update the relevant record in the Instances table to mark its health status as Undeploying.

  2. Send a signal to the application to gracefully close (most frameworks support this). This will make the application stop accepting new requests, but will wait until existing ones have finished.

  3. Poll the health endpoint until it no longer provides a healthy response; this will indicate that it has successfully closed down.

    1. If the health endpoint continues to provide a healthy response past a configurable time-out, abort the removal and mark the instance in the Instances table as Failed-to-undeploy.

  4. Remove the application binary from the server to avoid clutter building up.

  5. Look up the relevant record in the Instances table to gather how many resources were being used by this particular instance.

  6. Update the relevant record in the Servers table to indicate the extra resources that are now available.

  7. Update the relevant record in the Instances table to mark it as Undeployed (or simply delete it).

Transactions and Data Constraints in Your Deployments Database

We can’t emphasize this enough: your deployment application is a production tool, and it needs to be treated as such. In the previous steps, we’ve described multiple changes to the database; make sure these are executed within a single database transaction. Also, consider adding data constraints to make sure you never over-allocate resources to any server, and that you never allocate two instances of the same application to the same server.

Rerouting

This action is the most dependent on your particular routing technology, and therefore the one that we can say the least about. The key thing to keep in mind is that the configuration change to move from the old to the new routing details needs to be atomic: after you provide a new routing configuration, no request can be directed using the old details. Make sure your load balancer or routing technology of choice can support this. 
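
If you are building the routing layer yourself, one simple way to obtain that atomicity in-process is to hold the active set of targets in a single reference and swap it in one operation, as in the following sketch (the Target type and the random selection are placeholders for your own routing logic):

import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicReference;

// Sketch: the set of instances receiving new requests is swapped atomically
public class Router<Target> {

    private final AtomicReference<List<Target>> activeTargets;

    public Router(List<Target> initialTargets) {
        this.activeTargets = new AtomicReference<>(List.copyOf(initialTargets));
    }

    // Called by the deployment application: once this returns,
    // no new request can be routed using the old configuration
    public void reroute(List<Target> newTargets) {
        activeTargets.set(List.copyOf(newTargets));
    }

    // Called for every incoming request: picks one of the currently active instances
    public Target pickTarget() {
        final List<Target> targets = activeTargets.get();
        return targets.get(ThreadLocalRandom.current().nextInt(targets.size()));
    }
}

Requests that were dispatched before the swap simply finish against the old instances, which is exactly the behavior described for the blue/green switch earlier.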

Changing Databases

Even though services seem to attract most of the attention when talking about continuous delivery, the fact is that information usually ends up being stored in a database. Therefore, if continuous delivery requires the ability to change services at a constant pace to keep up with business needs, it also requires the ability to change your databases accordingly.

If you are working with a NoSQL database, like MongoDB, there isn’t much to worry about: the database isn’t restricted by a data schema as such, and therefore any changes to the data structure or to data itself can be performed in the application code. If, on the other hand, you are working with a standard SQL database, there are a few things that you need to look after.

Managing database deployments

Changing a database needs to be seen in the same light as changing an application: as an automated and tested process within a CI/CD strategy. This implies that things like a schema modification or a data migration cannot be done manually or in isolation, but rather must be done as part of a version-controlled change that triggers a build in our CI/CD pipeline. Moreover, you need to recognize that, as the needs of the business evolve, so does the optimum data schema. Therefore, you need to be comfortable with the idea of refactoring databases, just as you refactor application code.

Multiple tools can help with this, and DBDeploy, probably the first tool designed to ease database changes as part of a CI/CD pipeline, is worth highlighting. Although DBDeploy still does its job, it’s fair to say that development has been somewhat abandoned (the last commit in the DBDeploy repository as of the time of writing this book dates to 2011), so new adopters should probably look at more recent alternatives like Flyway or Liquibase. In either case, all these tools work in a similar way:

  • Changes to the database, either structure or data, are performed via migration scripts. These scripts are standard SQL files.

  • Each migration script must have a unique name and sequence number.

  • The migration tool keeps a table for itself in which it keeps track of which migration scripts have been run in the database and which ones haven’t.

  • Whenever the migration tool is invoked, it will scan all the available migration scripts, compare them with the ones that have already been run against a particular database, identify the ones that haven’t been run, and then run those.

This process allows you to keep track of changes across multiple environments, so you can try out a database schema change in test without affecting production. It also gives you a history of changes, and lets you jump to any particular version of the database at any point in time: you just need a blank database and then run the migration scripts up to your desired point.
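
As an illustration, and assuming you pick Flyway, invoking the migration tool from a pipeline step can be as simple as the following sketch (the connection details and script location are placeholders):

import org.flywaydb.core.Flyway;

// Sketch: apply any migration scripts that haven't yet been run against the target database
public class MigrateDatabase {

    public static void main(String[] args) {
        final Flyway flyway = Flyway.configure()
                .dataSource("jdbc:postgresql://localhost:5432/shop", "shop_user", "changeme") // placeholders
                .locations("classpath:db/migration") // where the versioned SQL scripts live
                .load();

        // Flyway compares the available scripts against its own history table
        // and runs only the ones that this database hasn't seen yet
        flyway.migrate();
    }
}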

Undoing or amending changes is a bit tricky, though. It may be tempting to think that, if you identify a bug after testing a particular migration script in the test environment, you can just amend that script and rerun it. The truth is that migration tools don’t usually understand changes to a migration script; they consider each script only as either “already run” or “not yet run”. If a particular script has already been run against a database (for instance, staging), an amendment to it won’t trigger a rerun: the tool considers the script already processed and skips it. In these situations, your options are either to wipe out the database every time so that all the scripts are always run (definitely not a good idea for production), or simply to add further migration scripts that undo or amend the previous ones.

Separating database and application deployments

One of the requirements for a continuous delivery process is minimizing the impact of potential failures so as to allow for continuous changes. This is why you should always try to avoid multi-application deployments: each application should be deployable independently of others. The same is true for database and application deployments.

Whenever you prepare a migration script, you should make it so it can be deployed without breaking any compatibility with any running applications, for the following reasons:

  • If multiple applications use your database, there is no guarantee that all of them will be deployed at the same time (an individual deployment may fail at any point); if that happens, any application that hasn’t been upgraded to the new version will no longer be able to communicate with the database.

  • Even if only one application uses your database, you could have a mix of old and new instances of your application, depending on your deployment strategy (e.g., rolling deployment); the new instances will work while the old ones won’t, defeating the purpose of a deployment with zero downtime.

  • Finally, even in the extreme case where only one application uses your database and you don’t use a deployment strategy that mixes old and new instances (e.g., all-at-once deployment), there is no guarantee that database and application will be deployed at exactly the same time, providing a window of failure.

Regardless of your case, you need to acknowledge that your database is an independent component that has its own deployment cycle, and respect the relationship with connecting applications. For changes that can potentially break compatibility, take a look at “Multiple-Phase Upgrades”.

A different matter is whether the migration scripts need to live in their own repository. This is a point of debate, and different individuals may lean toward different solutions. In general, if a database is used by multiple applications, and each of these applications has its own repository, then it is advisable that the migration scripts have their own space, too. If, on the other hand, the database is used by only a single application, or it is used by a set of applications that are all managed together in the same repository (a monorepo), then it is OK to include the migration scripts in the same repository.

Communicating via stored procedures: turning the database into just another service

Stored procedures have somewhat fallen out of fashion over the last few years. Many developers today tend to associate them with bureaucratic organizations in which everything database-related is managed by DBAs, and developers are allowed to access data only via a set of rigidly designed stored procedures. This has the unfortunate effect of blaming the messenger: if appropriately managed, stored procedures can be a great ally in providing a faster development environment, and can even let you look at your databases in a new, imaginative way.

The main reason a microservices architecture works is that it provides the right balance between exposure and encapsulation. Each service will hide its internal workings from others, and will allow communication only via a number of known endpoints. In the same way, you can look at your database as just another microservice where the stored procedures are the known endpoints, and where communication is performed via SQL as opposed to HTTP. Knowing that connecting applications access the database only via stored procedures, and assuming that you keep the behavior of the stored procedures constant, you can perform as many refactorings to the internals of the database as you want without affecting any applications.
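
For example, a consuming application could treat a hypothetical get_available_stock stored procedure as the database’s “endpoint”, calling it through plain JDBC and remaining oblivious to whatever tables sit behind it:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch: a stored procedure acting as the database's "endpoint"
public class StockQueries {

    private final String jdbcUrl; // placeholder, e.g. "jdbc:postgresql://localhost:5432/shop"

    public StockQueries(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    public int getAvailableStock(long productId) throws SQLException {
        try (Connection connection = DriverManager.getConnection(jdbcUrl);
             CallableStatement call = connection.prepareCall("{call get_available_stock(?)}")) {
            call.setLong(1, productId);
            try (ResultSet results = call.executeQuery()) {
                // The database internals can be refactored freely as long as the
                // procedure keeps returning a result with this shape
                return results.next() ? results.getInt("amount_available") : 0;
            }
        }
    }
}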

This, of course, requires that your stored procedures are appropriately tested, just as you test the endpoints of your services. Testing databases is outside the scope of this book, but from a Java point of view, the best tools to consider are DbUnit and Unitils.

Releasing Functionality

The previous section covered the mechanisms and strategies that allow code changes to make the journey from the pipeline to production. However, as we said before, that’s only one side of the story. The fact that you are now able to continuously push new changes to production doesn’t mean that you should expose your users to a constant stream of changes; users tend to like a rather stable and predictable experience, and there is only so much change they can tolerate. On the other hand, some changes may make sense only when grouped together with others, but you don’t want to revert to an old-style big-bang deployment where all these changes are introduced in one go. You need a mechanism to decide which features you expose to users that is orthogonal to the deployment mechanism.

At the same time, you must not forget that it’s not just users that you provide functionality to. In the world of microservices, there is a lot of service-to-service communication in the form of RESTful API calls, and sometimes you may need to make changes to these APIs to enable new functionality. This is another instance where change cannot be introduced without further thought, since modifying the way an endpoint works may impact some of the client applications using it, and this, in turn, could create cascading effects on downstream services.

It is therefore essential that you adopt a set of practices, independent from the ritual of deploying services, that allow you to control the way in which changes that might affect other entities are introduced, so you can communicate with the teams or organizations responsible for those entities and make the necessary arrangements. That’s what we will cover in this section.

Feature Flags

Feature flags are essentially configuration options that determine whether a particular functionality or feature should be exposed to the user during a given request. Since they are just configuration options, they can have different values in different environments, meaning you can give access to all the new features in the test environment while hiding them in the production environment until you are ready for full rollout. Also, like any other configuration option, you can construct them so they can be modified without redeploying the service (see “Managing Configuration and Secrets”).

There are several ways to implement feature flags, but they mostly come in one of three flavors:

Binary flags

The flag can have a value of true or false, effectively enabling or disabling the functionality. This is the simplest of flags.

Throttle flags

The flag represents the percentage of requests that should use the new feature, with 0% being equivalent to a disabled binary flag, and 100% equivalent to an enabled binary flag. For values in the middle, you can generate a random number between 0 and 100 for every request, and provide access to only those where the generated number is below the flag value. Implementing throttle flags carries a little more complexity than binary flags but allows for the new feature to be released gradually.

Category flags

While throttle flags give control over the number of users exposed to the new feature, they don’t give control over which particular population is exposed; this can be achieved with category flags. Given a particular property attached to each incoming request, a category flag includes the subset of possible values for that property that should grant access. In other words, access to the feature is provided if the value of the target property in the incoming request is within the configured set of values of the category flag, and rejected otherwise. Category flags are a bit harder to implement but provide the finest form of control. For example, if you are offering some kind of commercial deal for which you need legal approval, you could be opening the feature only to visitors from countries where approval has already been obtained. Similarly, if you have some kind of beta-tester user program, you could open experimental features only to affiliated users.
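
The Extended Java Shop implements a throttle flag (shown later in Examples 10-5 and 10-6); for completeness, a category flag check could look like the following sketch, where the flag holds the set of country codes that should be granted access (all names here are hypothetical):

import java.util.Set;

// Sketch of a category flag: grant access based on a property of the incoming request
public class CategoryFlag {

    private final Set<String> allowedCountryCodes; // e.g., countries where legal approval exists

    public CategoryFlag(Set<String> allowedCountryCodes) {
        this.allowedCountryCodes = Set.copyOf(allowedCountryCodes);
    }

    public boolean shouldApplyFeature(String requestCountryCode) {
        return allowedCountryCodes.contains(requestCountryCode);
    }
}

A request would then be granted the feature only if its property falls within the configured set: new CategoryFlag(Set.of("GB", "DE")).shouldApplyFeature("GB") returns true, whereas the same check for "FR" returns false.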

The Extended Java Shop includes a fully implemented example of feature flags. As mentioned before, this application represents a digital shop where different mechanical parts can be purchased. Prices are set and managed statically at the Product Catalogue service.

Let’s assume that we want to trial a new Adaptive Pricing service, provided by a third party. This Adaptive Pricing service promises to calculate in real time the optimum price for a product, taking into account overall stock at different providers, demand, etc. The idea is that, by using the Adaptive Pricing service, we might be able to automatically adjust the price of our products and increase profit margins. The third party charges us a fee every single time they successfully provide a price in response to one of our requests, so we’d like to limit the number of calls that we make until we are sure that this service is worth the cost. Also, we are unsure of how users will react to these variable prices, so we want to limit any potential discontent.

The best way to tackle this dilemma is with a throttle flag. The Product Catalogue service provides the Shopfront service with the static prices as managed in our inventory, and then the Shopfront service decides whether to use that price or query the Adaptive Pricing service for a new one. In our case, we have created a service to manage feature flags, so the Shopfront service has to query the Feature Flags service on every request to get the current value of the flag, and then decide whether the current request falls within the configured portion.

The Feature Flags service can be found under the folder /featureflags of the Extended Java Shop repository. The section within Shopfront that uses this flag to determine the price to use can be found in the ProductService class, although the most relevant parts are displayed in Examples 10-5 and 10-6 for convenience.

Example 10-5. Using a feature flag to decide on-the-fly whether an adaptive price should be used instead of the original one
// Check value of flag, if it applies, attempt to get adaptive price
private BigDecimal getPrice(ProductDTO productDTO) {
    Optional<BigDecimal> maybeAdaptivePrice = Optional.empty();
    if (featureFlagsService.shouldApplyFeatureWithFlag(ADAPTIVE_PRICING_FLAG_ID))
        maybeAdaptivePrice = adaptivePricingRepo.getPriceFor(productDTO.getName());
    return maybeAdaptivePrice.orElse(productDTO.getPrice());
}
Example 10-6. Mechanism to decide whether a given throttle feature flag should be applied
// Get value of flag and check if a randomly generated value falls within
public boolean shouldApplyFeatureWithFlag(long flagId) {
    final Optional<FlagDTO> flag = featureFlagsRepo.getFlag(flagId);
    return flag.map(FlagDTO::getPortionIn).map(this::randomWithinPortion)
        .orElse(false);
}

private boolean randomWithinPortion(int portionIn) {
    return random.nextInt(100) < portionIn;
}

Semantic Versioning (semver)

In the current world of microservices, the most common way to make shared functionality available across codebases is quickly becoming the creation of a new service for that functionality. Sometimes, however, you still might find it useful to create libraries of shared functionality, especially for syntactic sugar constructs, and you need to be careful about how these libraries evolve.

As introduced in Chapter 5, Semantic Versioning, or semver, is a set of rules that let you know the extent of a change in a library just by checking its version number. In its simplest form, a version number that follows semver has three numbers separated by dots: MAJOR.MINOR.PATCH. When a new version of a library, framework, or tool is released, only one of these three numbers is allowed to increase, typically by a single unit. The number that increases will tell you the scope of the change:

MAJOR

The new version introduces backward-incompatible changes; using the new version might break client code at compilation and/or runtime. When the MAJOR is updated, MINOR and PATCH are commonly set to zero.

MINOR

The new version introduces some new, backward-compatible features; existing clients should be able to adopt the new version without any impact. When the MINOR is updated, PATCH is commonly set to zero.

PATCH

No new functionality has been added; this new version corrects an existing bug.

Semver allows clients to decide when they are ready to adopt a new version, or even whether they want to adopt new versions automatically. For instance, Maven allows you to provide dependency information indicating a fixed-value version or a range of versions. If you know that the maintainers of a particular library use semver, and you are currently using version v5.0.0 of their library, it would be advisable to write your dependency like this:

<dependency>
    <groupId>com.github.quiram</groupId>
    <artifactId>java-utils</artifactId>
    <!-- square bracket includes the value, curved bracket excludes it -->
    <!-- this is equivalent to v5.0.x -->
    <version>[v5.0.0,v5.1.0)</version>
</dependency>

If you feel adventurous enough, you could even configure your dependency to automatically update to the latest minor version by using [v5.0.0,v6.0.0). This shouldn’t ever break your client code (barring mistakes) and would always give you the latest available features. It is not a good idea to automatically upgrade to new major versions, though.

An example of semantic versioning in action can be seen in the external library java-utils. If you take java-utils and explore versions v4.0.0 through v4.6.0, you’ll notice that each new version simply adds methods to helper classes, which are evidently backward-compatible changes. The next version after v4.6.0 is v5.0.0, which represents a backward-incompatible change. If you inspect the changes in this new version, you will see that the meaning of the method ArgumentChecks.ensure(Callable<Boolean>, String) has changed: in version v4.6.0, this method expected a fail condition as the first argument, but in version v5.0.0 it expects a pass condition—it’s exactly the opposite!

The next version after v5.0.0 is v5.0.1, which indicates a bug fix. Indeed, if you inspect the changes, you will see that v5.0.0 inverted the meaning of the aforementioned ensure method, although it didn’t update all the locations within the library where this method was called, breaking some functionality; v5.0.1 fixed this.

What’s with the v?

Sometimes different organizations will push for different sets of best practices for slightly different but often interchangeable aspects of programming; versioning is one of them. While semver advocates a purely numerical version number, GitHub advocates prepending versions with the letter v. The two approaches aren’t incompatible, since technically GitHub’s style refers to a tag that points to a specific version, not to the version itself. So in GitHub speak, v1.2.3 is a tag that refers to version 1.2.3 of the code. As a matter of fact, even the semver repository uses the v prefix in its releases.

Some systems and tools may not make a distinction between the two styles, and treat either of them as a version number. This is usually OK, but depending on the build system that you use, this might have the effect of confusing the build tool when using ranges for the MAJOR component of the version (e.g., it might consider that version v10.x.y is earlier than v2.x.y). This is just another reason not to use ranges for the MAJOR component of the version.

Backward Compatibility and Versions in APIs

Semver is an incredibly powerful and yet simple paradigm that can reduce friction between producers and consumers. However, it doesn’t easily apply to APIs. The consumption of libraries and frameworks has a history of using specific version numbers to record snapshots of code, and tools have been adapted to it. As we have shown, you can instruct your dependency management system (for instance, Maven) to grab the latest available version from within a range pattern, but this should not be done trivially when consuming a web service’s API.

The first thing to realize is that, whereas the version number of a library refers to the implementation of the code within the library, the version number of a service refers to the interface. Therefore, a new versioning paradigm is needed, one that doesn’t take into account changes in implementation, but changes in behavior.

Now, the topic of versioning APIs is a rather controversial one, and the development community hasn’t yet agreed on a single best practice. Several options exist, and different people will defend their position passionately. The only thing that can be categorically stated is that some solution is needed. In this section, we present some of the most common options and indicate their pros and cons so that you can make your own informed decision.

Avoid versioning

The first approach to manage versioning in an API is simply to avoid it: if your team needs to make a particular change to the API, make it in a way that keeps backward compatibility. In practice, this means keeping the endpoint URL as it is, and changing the structure of the returned object only to add new fields. Existing clients can continue to use the API, oblivious to the change, while clients who need the new feature have it there available for them.

An example of this approach can be found in the Extended Java Shop—more precisely, in the Feature Flags service. The story goes as follows. The Feature Flags service was initially designed as a throttle flag with three parameters: the flag ID, the flag name, and the rate of requests that should be granted access to the feature (the “portion in”). At some point, the business realized there was a downside to the current way feature flags were managed: since each request was independent from the others, it could happen that the same user was granted access to the feature in one request and rejected in another one, potentially providing an inconsistent experience. Depending on the feature, this might be an acceptable situation, or it may not. To address this, a new functionality was to be added to the Feature Flags service: flags should include a “sticky” parameter, which indicated whether the behavior across requests would be the same for a given user or whether it was allowed to change.

The implementation of the sticky parameter has been done in a backward-compatible way to avoid the need for a new version: a new field has been added to the response. Existing consumers of feature flags, like the Shopfront service, can simply ignore this new field until they are ready and/or willing to make use of it.

Ignoring New Fields May Not Be the Default Behavior

Backward-compatible changes implemented this way work only if the client application that is consuming the service ignores any new or unknown fields, which is not necessarily the default. For instance, in the case of acceptance tests in the Extended Java Shop, which uses Jackson to deserialize JSON objects, this has to be explicitly requested by using @JsonIgnoreProperties(ignoreUnknown = true); see the Flag class in the folder /acceptance-tests for details.
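
In practice, that means annotating the deserialization target roughly as follows (a trimmed-down sketch rather than the exact class from the repository):

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;

// Unknown fields (such as a newly added "sticky" attribute) are silently ignored,
// so the provider can add fields without breaking this consumer
@JsonIgnoreProperties(ignoreUnknown = true)
public class Flag {

    private long flagId;
    private String name;
    private int portionIn;

    // getters and setters omitted for brevity
}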

A similar approach can be used when you need to modify an existing field: instead of modifying it, you can choose to add a new one with the new meaning. This, of course, can work for only so long, and eventually your API will be littered with a mixed bag of old and new fields. If you get to this point, or if you need to make changes that cannot be implemented by simply adding new fields, then you need to create a new version of the API as indicated next.

Version the endpoint

A simple and effective way to create a new version for a backward-incompatible change in the API is to include the version number as part of the endpoint itself. This way, if the current version of the resource is located under /resource, the new version can be located at /v2/resource. This is a common approach, used in well-known services like AWS, and it’s one that is simple to implement and communicate. You can easily switch from one version to the other by quickly editing the URL, and you can give someone a link to easily try out the new version. The pragmatism of this approach is its main advantage.

The Extended Java Shop includes a case of versioning through endpoints in the Product Catalogue service. Let’s say that, in our example, the business has decided to have two different prices for each product: the single price, when the item is purchased in small amounts, and the bulk price, with an implicit discount for large purchases. The Product Catalogue would then also indicate, for each product, the number of units that need to be purchased at the same time to qualify for the bulk price. The developers decide that it would be too messy to implement this new feature simply by adding new fields to the response, and opt to create a new version of the API. Version 1 of the Product Catalogue returns a Product object like this:

GET /products/1

200 OK
{
  "id": "1",
  "name": "Widget",
  "description": "Premium ACME Widgets",
  "price": 1.20
}

Version 2 returns a modified object with extra information:

GET /v2/products/1

200 OK
{
  "id": "1",
  "name": "Widget",
  "description": "Premium ACME Widgets",
  "price": {
    "single": {
      "value": 1.20
    },
    "bulkPrice": {
      "unit": {
        "value": 1.00
      },
      "min": 5
    }
  }
}

You can see how this has been implemented by looking at the two versions of the ProductResource class in the Product Catalogue service.

This is a backward-incompatible change that has been implemented using the version-endpoint pattern. This way, requests to /products would return the first version, while requests to /v2/products would return the second version. Clients using the first version of the API can continue to operate as normal, but those who want to use the new features can make the necessary arrangements to use the second version. Note that, in our sample application, the Shopfront service is still using version 1 of the API.
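
A minimal sketch of how the two versions can coexist side by side is shown below, using JAX-RS annotations (which is what Dropwizard-based services such as the Product Catalogue rely on); the class names and hardcoded values are purely illustrative rather than the actual repository code:

import java.util.Map;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Sketch: the original resource keeps its path, the new version is exposed under /v2
@Path("/products")
@Produces(MediaType.APPLICATION_JSON)
public class ProductResourceV1 {

    @GET
    @Path("/{id}")
    public Map<String, Object> getProduct(@PathParam("id") String id) {
        // version 1: flat structure with a single price
        return Map.of("id", id, "name", "Widget", "price", 1.20);
    }
}

@Path("/v2/products")
@Produces(MediaType.APPLICATION_JSON)
class ProductResourceV2 {

    @GET
    @Path("/{id}")
    public Map<String, Object> getProduct(@PathParam("id") String id) {
        // version 2: structured price with single and bulk values
        return Map.of("id", id, "name", "Widget",
                "price", Map.of("single", Map.of("value", 1.20),
                        "bulkPrice", Map.of("unit", Map.of("value", 1.00), "min", 5)));
    }
}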

Version the content

Detractors of the version-in-endpoint approach usually point out how it breaks the semantics of the RESTful principles: in a pure RESTful API, an endpoint is meant to represent a resource, not a version of a resource, so version numbers should not be included in the URL. Instead of versioning the endpoint, you can version the content by means of the Content-Type header.

Let’s assume your service provides a response in JSON format. The value of the Content-Type header in this situation will most typically be application/json. You can, however, provide a versioned content type using the pattern application/vnd.<resource-name>.<version>+json, where <resource-name> is the name of the resource that this type refers to, and <version> is the version of the resource format. Clients can then indicate the version that they want to be provided by using the Accept header. This way, the same endpoint can serve different versions of the same resource.

An example of this can be found in the Stock Manager service of the Extended Java Shop. In this case, we can say that the business realized that some of the pieces on sale were particularly heavy, and it wanted to impose a limit on the number of them that a customer could buy in a single purchase to ease packaging and delivery. (Whether it is credible that any business would willingly limit the number of products that it sells is something that we will not debate here—just roll with it.) For this, the development team decided that stocks should include both the total number of available units and the number of units that could be purchased at the same time. Once again, the developers decided that it would be better to create a new version of the API than to try to add the changes to the existing one in a backward-compatible manner, and chose to implement this change using the version-in-content approach. This way, the old API still worked as follows:

Accept: application/json
GET /stocks/1

200 OK
Content-Type: application/json
{
  "productId": "1",
  "sku": "12345678",
  "amountAvailable": 5
}

The new API is available by changing the header:

Accept: application/vnd.stock.v2+json
GET /stocks/1

200 OK
Content-Type: application/vnd.stock.v2+json
{
    "productId": "1",
    "sku": "12345678",
    "amountAvailable": {
        "total": 5,
        "perPurchase": 2
    }
}

Details of this implementation can be found in the StockResource class in the Stock Manager service. Note that, in our example, the Shopfront service is still using version 1 of the API.
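
Because the Stock Manager is a Spring Boot service, one way to sketch this is with two handler methods on the same path whose produces attribute matches the incoming Accept header; again, the class shown here is an illustration rather than the exact StockResource from the repository:

import java.util.Map;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Sketch: same endpoint, two representations selected via the Accept header
@RestController
public class StockResource {

    @GetMapping(value = "/stocks/{id}", produces = "application/json")
    public Map<String, Object> getStockV1(@PathVariable String id) {
        // original representation: a plain number of available units
        return Map.of("productId", id, "sku", "12345678", "amountAvailable", 5);
    }

    @GetMapping(value = "/stocks/{id}", produces = "application/vnd.stock.v2+json")
    public Map<String, Object> getStockV2(@PathVariable String id) {
        // new representation: total units plus a per-purchase limit
        return Map.of("productId", id, "sku", "12345678",
                "amountAvailable", Map.of("total", 5, "perPurchase", 2));
    }
}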

Don’t Mix Your API Versioning Strategies

Either of the API versioning strategies mentioned here can work, but we advise you to pick one and stick to it. Having both versioning strategies in the same application or system will make it harder to manage and will confuse your consumers. As Troy Hunt said, the whole point of an API is to provide predictable, stable contracts.

Advanced change management

On the far end of the spectrum lies the argument that the mere fact that a RESTful API needs to be versioned is an antipattern, since it doesn’t strictly follow the rules of hypermedia communication. The URIs of our resources should be immutable, and the provided content should use a language that is parsable according to a well-defined set of rules; this way, changes in the content can simply be reinterpreted by the consumer, and no coordination between provider and consumer is needed.

Although strictly true, most people find this approach harder than it’s worth, and revert back to one of the approaches outlined previously. For this reason, we have decided not to cover it in this book, although readers who want to investigate further are encouraged to check Roy Fielding’s work.

You Don’t Need API Versioning Until You Do

After reading this section, you might be compelled to prepend /v1 to any new endpoint that you create, or even to the endpoints that you already have (or, if you have opted for versioning via content type, to change the content type of all your endpoints). The truth is that, even though you need to be ready to deal with API changes, you might not need to deal with them right from the outset: if you’re creating a new service, and you have a lot of decisions to make, the versioning scheme is one that you might be able to delay. Simply assume that a lack of version means version 1, and when (and if) you need to deal with a change, then you can choose a versioning scheme.

Multiple-Phase Upgrades

In the previous examples, we presented some use cases in which a service is to offer a new piece of functionality before clients are ready to consume it, meaning backward-compatibility has to be preserved in some way: either by making a backward-compatible change, or by creating a new version of the API. You may be tempted to think that if you control both the provider and the consumer of the API, you can skip this trouble and simply change both at the same time, but you would be wrong.

Even if you do change both at the same time, there is no guarantee that those changes will be made available in production at the same time too. On one side, your deployment strategy may imply that both the old and the new version of the provider coexist in production for some time; if your client can cope only with the new version, it will experience significant disruption during the deployment. On the other hand, and as you will see in Chapter 11, your changes will have to go through multiple test phases, and there is always the chance that those tests pass for the provider but not for the consumer (or vice versa), meaning you could end up with mismatched versions of provider and consumer in production.

The moral of the story is that, regardless of whether you control both sides of the interaction or only one, you will need to perform your actions in numerous steps to make sure both sides don’t fall out of sync. This is sometimes referred to as Expand and Contract, and usually boils down to the following steps:

  1. Create a new version of your API or library, and push the changes.

  2. Let the change make its way through the pipeline. If you are changing an API, make sure the deployment is completely finished and that the new API is available in all the running instances.

  3. Change your consumer(s) to use the new API or library.

  4. If you’re updating an API, and once every consumer has been updated to use the new version, you can consider deprecating the old one.

Deprecating old APIs

Keeping every historical version of an API would be a maintenance nightmare. That’s why, even though you want to make it easier for consumers to adopt new APIs at their own pace, you also want to make sure they do move on.

You can track how much each version of your API is being used (if at all) by keeping usage metrics for each of your versioned interfaces (see Chapter 13 for information about metrics). Once you are confident that nobody is using the old versions, either because you know or control all the potential consumers, or because you can see in the metrics reports that there is no usage, you can confidently delete the old versions.

Sometimes you won’t feel in a position to remove the old endpoints straightaway, either because you know there is some usage but you can’t track the owner, or because yours is a public API and you can’t assert with confidence that nobody is using the old API anymore. If this is the case, you might be able to nudge the slow movers by keeping the old interface but removing the implementation: requests to the old API can be replied to with an HTTP redirect instruction:

GET /v1/resource

301 MOVED PERMANENTLY
Location: /v2/resource
Content-Type: text/plain
This version of the API is no longer supported, please use /v2/resource

Chances are, the consumer will still be broken by this, since the request will be redirected to the new version, for which the consumer is probably not ready. However, at least they will be notified of what they need to do to fix the situation.
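
In a Spring-based service, such a tombstone endpoint could be sketched like this:

import java.net.URI;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Sketch: the old endpoint no longer serves data; it only points consumers at the new version
@RestController
public class DeprecatedResource {

    @GetMapping("/v1/resource")
    public ResponseEntity<String> oldResource() {
        return ResponseEntity.status(HttpStatus.MOVED_PERMANENTLY)
                .location(URI.create("/v2/resource"))
                .contentType(MediaType.TEXT_PLAIN)
                .body("This version of the API is no longer supported, please use /v2/resource");
    }
}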

Managing Configuration and Secrets

In the previous sections, we covered how best to manage application deployments and functionality releases as part of a continuous delivery process. There is, however, one last responsibility to take care of whenever we consider the evolution of applications onto newer versions: configuration.

In the past, configuration used to be managed separately from code. Applications would be deployed to servers, assuming that they would be able to find a file at a particular location containing the different configuration options needed by the application. Changes to the configuration would be controlled by a separate process and different tools, typically known as Software Configuration Management (or SCM). Quite frequently, it would even be different people who handled code and configuration.

However, the dynamic environment that we have showcased in this chapter makes managing configuration in this way impractical. New computer instances might be created and added to your environment at any time, and configuration files would have to be copied there as part of the instantiation. A change in configuration would have to be spread across a large number of computers. And the fact that multiple services could be sharing the same computer instance presents us with the real possibility of a configuration clash. A different way is needed.

This section indicates the most common ways to manage configuration in the world of microservices and continuous delivery, indicating the pros and cons of each approach.

“Baked-In” Configuration

The simplest way to configure an application is to pack the configuration file with the application itself. What’s more, you can keep the configuration file in the same repository as the code, which allows you to keep track of changes to configuration. Operating this way means you don’t need to do anything special to ensure your application is configured when deployed into production, which makes it a convenient and appealing option. All the services in the Extended Java Shop make use of baked-in configuration; the Spring Boot-based ones use the file application.properties, while the Dropwizard-based one (the Product Catalogue service) uses product-catalogue.yml.

It might seem that, by following this baked-in configuration approach, you can have only one set of values for the configuration, meaning that you cannot have different configurations for different environments (e.g., test and production). However, your baked-in configuration can include several options or profiles, and then your application can decide to pick one or the other, based on a parameter or variable that the environment in question is making available. An example of this can be seen in the Feature Flags service, which uses Spring Boot’s concept of profiles to keep two sets of configuration values; this is achieved with three baked-in configuration files:

application.properties

Indicates which profile is used by default

application-test.properties

Indicates the configuration values to use according to the test profile

application-prod.properties

Indicates the configuration values to use according to the prod profile

The first file sets the property spring.profiles.active to prod, indicating that Spring Boot should use the contents of application-prod.properties to configure the application. However, this property can be overridden by an environment variable of the same name. This is done in the Acceptance Tests (folder /acceptance-tests). If you inspect the docker-compose.yml file in the Acceptance Tests module, you will notice the following:

featureflags:
  image: quiram/featureflags
  ports:
   - "8040:8040"
  depends_on:
   - test-featureflags-db
  links:
   - test-featureflags-db
  environment:
    - spring.profiles.active=test

The last element, environment, sets the environment variable spring.profiles.active to test. This will override the setting in the file application.properties, and will signal Spring Boot to use the file application-test.properties when bringing up and configuring the Feature Flags service. This way, you can use the baked-in configuration pattern and still keep different configuration sets for different environments.

Externalized Configuration

One of the consequences of baked-in configuration is that, since you are tracking it as just another file (or set of files) within your code, any change to configuration will be treated by the build pipeline like a change in code. This will sometimes be desirable, because a change in configuration might require the execution of your test suite, but in many other cases it will just trigger unnecessary work: for instance, a change in a flag for the test environment will not only trigger the entire suite of tests, but also result in a new deployment to production for something that doesn’t affect the production environment at all.

There is another disadvantage to baked-in configuration: since it is managed as another file in the code repository, only developers can make changes to it. Again, this might be appropriate for some configuration options (like the connection-pooling parameters for a database connection). In other cases, you might want to give that power to different members of the organization. For instance, in the case of feature flags, you may want to let the business decide when and how these are tweaked. Another case might be when the infrastructure team provides some resources for your application—for instance, log pools (see Chapter 13): maybe you want certain details to be managed by the infrastructure team, so they can change details of the infrastructure with flexibility, while your application simply reads the configuration from them. Either way, there will be instances in which you don’t want the configuration to live within your own application, but to be provided from outside.

The solution for externalizing configuration will depend on why you want it externalized; you might even want multiple solutions for different cases. In the Extended Java Shop, we opted for externalized configuration in the case of feature flags. We could have kept feature flags as just another parameter in the application.properties file of the Shopfront service, but we decided to go through the trouble of creating an independent service to make that configuration editable without code changes. If business members are tech-savvy enough, they can make HTTP requests to the Feature Flags service to edit the flags as they need to; if not, you can always create a little GUI that wraps the calls to the service.

Other forms of externalizing configuration could be setting up environment variables in the computer instances where services will run, or creating files in special locations for the application to pick them up.

Remember: You Lose Control of What You Externalize

With baked-in configuration, you can be fairly confident that any configuration item that you need is exactly where you expect it; if not, you can just call it a bug and fix it. However, when you decide to externalize your configuration items, you’re dependent on what other people or teams might decide to do. Feature flags could be deleted without notice; other configuration items may have invalid or unexpected values in them. Make sure your application can cope with all these scenarios.

Handling Secrets

Some configuration items are especially sensitive. We are talking, of course, about secrets of different kinds: database passwords, private keys, OAuth tokens, etc. We obviously can’t include these in the baked-in configuration, but even the externalized configuration may need special treatment to make sure the values are secure.

Keeping secrets private while still making them available where they are needed is a surprisingly difficult task. Our best advice is that you don’t try to create your own solution, at least not from the outset. All orchestrating platforms offer some kind of secret management with different degrees of privacy, depending on how hard your needs are.

For instance, you might not mind if everyone in your team or organization knows the values of these secrets; you just don’t want to record them as plain text anywhere (least of all in your code repository). In this case, you can use tools like Kubernetes Secrets. Kubernetes allows you to create secret keys and give them a name to identify them. These keys are then stored safely by Kubernetes. You can configure your applications to consume these keys, and Kubernetes will make them available as either files or environment variables. When your application reads these files and/or environment variables, the keys will already have been decoded, and your application will be able to use them normally.
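
From the application’s point of view, consuming such a secret is then just a matter of reading an environment variable or a file; a minimal sketch could look as follows (the DB_PASSWORD variable name and the mount path are hypothetical):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Sketch: read a secret that the platform has injected as an environment variable,
// falling back to a file mounted by the orchestrator (both names are hypothetical)
public class SecretReader {

    public static String databasePassword() throws IOException {
        final String fromEnvironment = System.getenv("DB_PASSWORD");
        if (fromEnvironment != null) {
            return fromEnvironment;
        }
        // e.g., a Kubernetes Secret mounted as a volume
        final byte[] bytes = Files.readAllBytes(Paths.get("/etc/secrets/db-password"));
        return new String(bytes, StandardCharsets.UTF_8).trim();
    }
}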

AWS provides a similar way to handle secrets via Parameter Store, which is part of Systems Manager. Parameter Store is integrated with Amazon ECS, meaning your applications can use it easily. Other platforms offer similar features; check the relevant documentation to learn more.

Summary

In this chapter, we covered the aspects surrounding the last section of our continuous delivery pipeline: automated delivery. As often happens, the last mile is the hardest, and extra challenges appear when you want to set up an environment where you deliver changes at a constant pace:

  • Deployments and releases are two different concepts. The former is the technical activity of bringing a new version of an application to production, while the latter is the business activity of allowing users access to functionalities.

  • Different deployment strategies are available, each with a different profile of advantages, resource needs, and complexity. There is no right or wrong strategy, only strategies that are more or less well suited to your needs.

  • You may need to choose when and how to make new features available to your consumers, either end users or other teams. Feature flags allow you to gradually open features to the public, while versioning schemes for libraries (like semver) and for APIs (like version-in-endpoint or version-in-content) allow your consumers to adopt new features at their own pace.

  • With the increase in the number of moving parts, configuration needs to become a first-class citizen. Baked-in configuration is easy to handle and track, but accessible only to developers. Externalized configuration allows other people to manage configuration details, but it can become less reliable. Secrets, like passwords or keys, need special support from the orchestrating platform or sophisticated purpose-built solutions.

Thanks to the last two chapters, you can now build an end-to-end automated pipeline that can bridge the gap from the development station to production. In the next chapter, we will explore what you need to add to that pipeline to ensure that your changes don’t introduce any regressions or unintended consequences.
