When implementing any core technology in production, you will often get the most mileage from designing a resilient platform that is forgiving of the unexpected issues that will eventually occur. When used as intended, with close attention to detail, Docker can be an incredibly powerful tool. But because it is a young technology going through very rapid growth cycles, you are bound to trigger frustrating bugs in Docker and in its interactions with the underlying kernel.
If, instead of simply deploying Docker into your environment, you take the time to build a well-designed container platform on top of Docker, you can enjoy the many benefits of a Docker-based workflow while protecting yourself from some of the sharper exposed edges that typically exist in such a high-velocity project.
Like any other technology, Docker doesn’t magically solve all your problems. To reach its true potential, organizations must make conscious decisions about why and how they are going to use it. For small projects, it is possible to use Docker in a very simple manner; however, if you plan to support a large project that can scale with demand, it quickly becomes important to make deliberate decisions about how your applications and platform are designed. This ensures that you maximize the return on your investment in the technology. Taking the time to design your platform with intention also makes it much easier to modify your production workflow over time. A well-designed Docker platform will ensure that your software is running on a dynamic foundation that can easily be upgraded as technology and processes develop over time.
Below we will explore some of the leading thinking about how container platforms should be designed to improve the resiliency and supportability of the overall platform.
In November of 2011, Adam Wiggins, cofounder of Heroku, and his colleagues released an article called “The Twelve-Factor App.” This document describes a series of 12 distilled practices that come from the experiences of Heroku engineers on how to design applications that will thrive and grow in a modern Software-as-a-Service (SaaS) environment.
Although not required, applications built with these 12 steps in mind are ideal candidates for the Docker workflow. Below we explore each of these steps and why they can, in numerous ways, help improve your development cycle.
A single codebase tracked in revision control.
Many instances of your application will be running at any given time, but they should all come from the same code repository. Each and every Docker image of an application should be built from a single source code repository that contains all the code required to build the Docker container. This ensures that the code can easily be rebuilt, and that all third-party requirements are well defined within the repository, if not actually directly included.
What this means is that building your application shouldn’t require stitching together code from multiple source repositories. That is not to say that you can’t have a dependency on an artifact from another repo. But it does mean that there should be a clear mechanism for determining which pieces of code were shipped when you built your application. Docker’s ability to simplify dependency management is much less useful if building your application requires pulling down multiple source code repositories and copying pieces together. And it’s not repeatable unless you know the magic incantation.
Explicitly declare and isolate dependencies.
Never rely on the belief that a dependency will be made available via some other avenue, like the operating system install. Any dependencies that your application requires should be well-defined in the code base and pulled in by the build process. This will help ensure that your application will run when deployed, without relying on libraries being installed by other processes. This is particularly important within a container since the container’s processes are isolated from the rest of the host operating system and will usually not have access to anything outside the container image’s filesystem.
The Dockerfile and language-dependent configuration files like Node’s package.json or Ruby’s Gemfile should define every nonexternal dependency required by your application. This ensures that your image will run correctly on any system to which it is deployed. Gone will be the days when you deploy and run your application only to find out that important libraries are missing or installed with the wrong version. This has huge reliability and repeatability advantages, and very positive ramifications for system security. If you update the OpenSSL or libyaml libraries that your Dockerized application uses to fix a security issue, you can be assured that it will always be running with that version wherever you deploy that particular application.
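As a minimal sketch of what this looks like in practice, here is a Dockerfile for a hypothetical Node.js service (the base image tag, file names, and start command are placeholders) that pulls in every runtime dependency during the build, so nothing is assumed to already exist on the host:

FROM node:4

# All third-party libraries are declared in package.json and installed at build time.
COPY package.json /app/package.json
WORKDIR /app
RUN npm install

# The application code itself is the only other thing the image needs.
COPY . /app

CMD ["node", "index.js"]

Nothing in this image depends on libraries being preinstalled on the Docker host; everything the process needs travels with it.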
It is also important to note that most Docker base images are actually much larger than they need to be. Remember that your application process will be running on a shared kernel, and the only files that you actually need inside your base image are ones that the process will require to run. It’s good that the container makes this repeatable, but it can sometimes mask hidden dependencies. Although people often start with a minimal install of Ubuntu or CentOS, these images still contain a lot of operating system files that your process almost certainly does not need, or possibly some that you rely on without realizing it. You need to be in charge of your dependencies, even when containerizing your application.
A good way to shed light on the depth of your application’s dependency tree is to compare it to a container for a statically linked program written in a language like Go or C. They don’t need any libraries or command-line binaries. To explore what one of these ultra-light containers might look like, let’s run a statically linked Go program in a container by executing the following command:
$ docker run --publish=8085:8080 --detach=true --name=static-helloworld adejonge/helloworld:latest
365cc5ddb0c40a50763217c66be26959933028631ef24a60a7da9944971587a3
Keep a copy of the long ID hash for your container, because you will need it in a moment. If you now point a local web browser at port 8085 on your Docker server (i.e., http://172.17.42.10:8085/), you should see the message:
Hello World from Go in minimal Docker container
Contrary to everything we’ve looked at in this book so far, this very minimal container does not contain a shell or SSH. This means we can’t use ssh, nsenter, or docker exec to examine it. Instead, we can examine the container’s filesystem by logging directly into the Docker server via ssh, and then looking at the container’s filesystem itself. To do that, we need to find the filesystem on the server’s disk. We do this by first running docker info to determine the root directory for the storage driver.
$ docker info
...
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
...
The Docker Root Dir and the Root Dir are not the same thing. We specifically want the Root Dir listed under Storage Driver.
By combining the Docker root directory and the container hash into a file path, it is possible to view the container’s filesystem from the Docker server. You might need to poke around in the storage driver’s root directory a bit to determine the exact location of the container filesystems. In our case, it is under the additional directory called mnt.
If we now list the files in that directory, we will discover that the number of files in this container is incredibly small:
$ ls -R /mnt/sda1/var/lib/docker/aufs/mnt/36...a3

/mnt/sda1/var/lib/docker/aufs/mnt/36...a3:
dev/  etc/  helloworld  proc/  sys/

/mnt/sda1/var/lib/docker/aufs/mnt/36...a3/dev:
console  pts/  shm/

/mnt/sda1/var/lib/docker/aufs/mnt/36...a3/dev/pts:

/mnt/sda1/var/lib/docker/aufs/mnt/36...a3/dev/shm:

/mnt/sda1/var/lib/docker/aufs/mnt/36...a3/etc:
hostname  hosts  mtab  resolv.conf

/mnt/sda1/var/lib/docker/aufs/mnt/36...a3/proc:

/mnt/sda1/var/lib/docker/aufs/mnt/36...a3/sys:
You can see that, in addition to the console device and basic /etc files, the only other file is the helloworld binary, which contains everything our simple web application needs to run on a modern Linux kernel, and hence from within a container.
In addition to the filesystem layers used by Docker, keeping your containers stripped down to the bare necessities is another great way to keep your images slim and your docker pull commands fast. It’s much harder to do with interpreted languages living in a container, but the point is that you should try to keep your base layer as minimal as possible so that you can reason about your dependencies. Docker helps you package them up, but you still need to be in charge of them.
Store configuration in environment variables, not in files checked into the code base.
This makes it simple to deploy the exact same code base to different environments, like staging and production, without maintaining complicated configuration in code or rebuilding your container for each environment. This keeps your code base much cleaner by keeping environment-specific information like database names and passwords out of your source code repository. More importantly, though, it means that you don’t bake deployment environment assumptions into the repository, and because of that it is extremely easy to deploy your application anywhere it might be useful. You also need to be able to test the same image you will ship to production. You can’t do that if you have to build an image for each environment with all of its configuration baked in.
As discussed in Chapter 5, this can be achieved by launching docker run commands that leverage the -e command-line argument. Using -e APP_ENV=production tells Docker to set the environment variable APP_ENV to the value “production” within the newly launched container.
For a real-world example, let’s assume we pulled the image for the chat robot hubot with the HipChat adapter installed. We’d issue something like the following command to get it running:
docker run \
    -e BIND_ADDRESS="0.0.0.0" \
    -e ENVIRONMENT="development" \
    -e SERVICE_NAME="hubot" \
    -e SERVICE_ENV="development" \
    -e EXPRESS_USER="hubot" \
    -e EXPRESS_PASSWORD="Chd273gdExAmPl3wlkjdf" \
    -e PORT="8080" \
    -e HUBOT_ADAPTER="hipchat" \
    -e HUBOT_ALIAS="/" \
    -e HUBOT_NAME="hubot" \
    -e HUBOT_HIPCHAT_JID="[email protected]" \
    -e HUBOT_HIPCHAT_PASSWORD='SOMEEXAMPLE' \
    -e HUBOT_HIPCHAT_NAME="hubot" \
    -e HUBOT_HIPCHAT_ROOMS="[email protected]" \
    -e HUBOT_HIPCHAT_JOIN_ROOMS_ON_INVITE="true" \
    -e REDIS_URL="redis://redis:6379" \
    -d --restart="always" --name hubot hubot:latest
Here we are passing a whole set of environment variables into the container when it is created. When the process is launched in the container, it will have access to these environment variables so that it can properly configure itself at runtime. These configuration items are now an external dependency that we can inject at runtime.
In the case of a Node.js application like hubot, you could then write the following code to make decisions based on these environment variables:
switch (process.env.ENVIRONMENT) {
  case 'development':
    console.log('Running in development');
    break;
  case 'staging':
    console.log('Running in staging');
    break;
  case 'production':
    console.log('Running in production');
    break;
  default:
    console.log('Assuming that I am running in development');
}
Keeping specific configuration information out of your source code makes it very easy to deploy the exact same container to multiple environments, with no changes and no sensitive information committed into your source code repository. Crucially, it supports testing your container images thoroughly before deploying to production by allowing the same image to be used in both environments.
Treat backing services as attached resources.
Local databases are no more reliable than third-party services, and should be treated as such. Applications should handle the loss of an attached resource gracefully. By implementing graceful degradation in your application and ensuring that you never assume that any resource, including filesystem space, is available, your application will continue to perform as many of its functions as it can, even when external resources are unavailable.
This isn’t something that Docker helps you with directly, and although it is always a good idea to write robust services, it is even more important when you are using containers. High availability is most often achieved through horizontal scaling and rolling deployments when using containers, instead of relying on the live migration of long-running processes, as with traditional virtual machines. This means that specific instances of a service will often come and go over time, and your service should be able to handle this gracefully.
Additionally, because Docker containers have limited filesystem resources, you can’t simply rely on having some local disk available. You need to plan that into your application’s dependencies and handle it explicitly.
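To make the idea concrete, here is a minimal Node.js sketch of graceful degradation, assuming a hypothetical cache client and database accessor (cache, db, and findProfile are illustrative names, not a specific library’s API). If the attached cache is unreachable, the handler falls back to a slower code path instead of failing the request:

// Hypothetical lookup: prefer the cache, but degrade gracefully if it is unavailable.
function getProfile(cache, db, userId, callback) {
  cache.get('profile:' + userId, function (err, cached) {
    if (!err && cached) {
      return callback(null, JSON.parse(cached));
    }
    // The cache errored out or missed; serve from the database instead of failing.
    db.findProfile(userId, callback);
  });
}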
Strictly separate build and run stages.
Build the code, release it with the proper configuration, and then deploy it. This ensures that you maintain control of the process and can perform any single step without triggering the whole workflow. By ensuring that each of these steps is self-contained in a distinct process, you can tighten the feedback loop and react more quickly to any problems within the deployment flow.
As you design your Docker workflow, you want to ensure that each step in the deployment process is clearly separated. It is perfectly fine to have a single button that builds a container, tests it, and then deploys it, assuming that you trust your testing processes—but you don’t want to be forced to rebuild a container simply in order to deploy it to another environment.
Docker supports the twelve-factor ideal well in this area because there is a clean hand-off point between building an image and shipping it to production: the registry. If your build process generates images and pushes them to the registry, then deployment can simply be pulling the image down to servers and running it.
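A rough sketch of that hand-off, with placeholder image names and registry host: the build stage produces and pushes an image once, and each deployment only pulls and runs it.

# Build stage (on the build server): create the image and push it to the registry.
$ docker build -t registry.example.com/myapp:1.0.0 .
$ docker push registry.example.com/myapp:1.0.0

# Deploy stage (on each production host): no rebuild, just pull and run.
$ docker pull registry.example.com/myapp:1.0.0
$ docker run -d --name myapp -e APP_ENV=production registry.example.com/myapp:1.0.0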
Execute the app as one or more stateless processes.
All shared data must be accessed via a stateful backing store, so that application instances can easily be redeployed without losing any important session data. You don’t want to keep critical state on disk in your ephemeral container, nor in the memory of one of its processes. Containerized applications should always be considered ephemeral. A truly dynamic container environment requires the ability to destroy and recreate containers at a moment’s notice. This flexibility helps enable the rapid deployment cycle and outage recovery demanded by modern, Agile workflows.
As much as possible, it is preferable to write applications that do not need to keep state longer than the time required to process and respond to a single request. This ensures that the impact of stopping any given container in your application pool is very minimal. When you must maintain state, the best approach is to use a remote datastore like Redis, PostgreSQL, Memcache, or even Amazon S3, depending on your resiliency needs.
Export services via port binding.
Your application needs to be addressable by a port specific to itself. Applications should bind directly to a port to expose the service and should not rely on an external daemon like inetd to handle that for them. You should be certain that when you’re talking to that port, you’re talking to your application. Most modern web platforms are quite capable of directly binding to a port and servicing their own requests.
Exposing a port from your container, as discussed in Chapter 4, can be achieved by launching docker run commands that use the -p command-line argument. Using -p 80:8080 tells Docker to map port 80 on the host to port 8080 in the container.
The statically linked Go hello world container that we discussed in “Dependencies” is a great example of this because the container contains nothing but our application to serve its content to a web browser. We did not need to include any additional web servers, which would require additional configuration, add additional complexity, and increase the number of potential failure points in our system.
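As an illustration of binding directly to a port, a Node.js process can serve its own traffic with nothing but the standard library; the port number here is arbitrary:

var http = require('http');

// Bind directly to a port; no inetd, reverse proxy, or other external daemon is required.
http.createServer(function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello from a directly bound port\n');
}).listen(8080);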
Scale out via the process model.
Design for concurrency and horizontal scaling within your applications. Increasing the resources of an existing instance can be difficult and hard to reverse. Adding and removing instances as scale fluctuates is much easier and helps maintain flexibility in the infrastructure. Launching another container on a new server is incredibly inexpensive compared to the effort and expense required to add resources to an underlying virtual or physical system. Designing for horizontal scaling allows the platform to react much faster to changes in resource requirements.
This is where tools like swarm, mesos, and kubernetes really begin to shine. Once you have implemented a Docker cluster with a dynamic scheduler, it is very easy to add three more instances of a container to the cluster as load increases, and then to later remove two instances of your application from the cluster as load starts to decrease again.
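The exact commands depend on the scheduler you choose. As one illustration, assuming a Kubernetes cluster with a deployment named myapp (the name is hypothetical), scaling out and back in is a single command in each direction:

$ kubectl scale --replicas=5 deployment/myapp
$ kubectl scale --replicas=3 deployment/myapp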
Maximize robustness with fast startup and graceful shutdown.
Services should be designed to be ephemeral. We already touched on this when discussing external state. But dynamic horizontal scaling, rolling deploys, and responding to unexpected problems require applications that can quickly and easily be started or shut down. Services should respond gracefully to a SIGTERM signal from the operating system and even handle hard failures with aplomb. Most importantly, we shouldn’t need to care if any given container for our application is up and running. As long as requests are being served, the developer should be freed from being concerned about the health of any given single component within the system. If an individual node is behaving poorly, turning it off or redeploying it should be an easy decision that doesn’t entail long planning sessions and concerns about the health of the rest of the cluster.
As discussed in Chapter 8, Docker sends standard Unix signals to containers when it is stopping or killing them, so it is possible for any containerized application to detect these signals and take the appropriate steps to shut down gracefully.
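A minimal Node.js sketch of a graceful shutdown, assuming a simple HTTP server like the one in the port-binding example (the port and exit code are arbitrary): on SIGTERM the process stops accepting new connections, lets in-flight requests finish, and then exits.

var http = require('http');

var server = http.createServer(function (req, res) {
  res.end('ok\n');
}).listen(8080);

// docker stop sends SIGTERM; close the listener, drain in-flight requests, then exit.
process.on('SIGTERM', function () {
  server.close(function () {
    process.exit(0);
  });
});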
Keep development, staging, and production as similar as possible.
The same processes and artifacts should be used to build, test, and deploy services into all environments. The same people should do the work in all environments, and the physical nature of the environments should be as similar as reasonably possible. Repeatability is incredibly important. Almost any issue discovered in production points to a failure in the process. Every area where production diverges from staging is an area where risk is being introduced into the system. These inconsistencies ensure that you are blind to certain types of issues that could occur in your production environment until it is too late to proactively deal with them.
In many ways, this repeats the essence of a few of the earlier recommendations. However, the specific point here is that any environment divergence introduces risk, and although these differences are common in many organizations, they are much less necessary in a containerized environment. Docker servers can normally be created so that they are identical in all of your environments, and environment-based configuration changes should typically only affect which endpoints your service connects to, without changing the application’s behavior.
Treat logs as event streams.
Services should not concern themselves with routing or storing logs. Instead, events should be streamed, unbuffered, to STDOUT for handling by the hosting process. In development, STDOUT can be easily viewed, while in staging and production the stream can be routed to anything, including a central logging service. Different environments have different expectations for log handling, and this logic should never be hard-coded into the application. By streaming everything to STDOUT, it is possible for the top-level process manager to handle the logs via whichever method is best for the environment, and this allows the application developer to focus on core functionality.
In Chapter 6, we discussed the docker logs command, which collects the output from your container’s STDOUT and records it as logs. If you write logs to random files within the container’s filesystem, you will not have easy access to them. It is also possible to send logs to a local or remote logging system using tools like rsyslog, heka, or fluentd.
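For example, with a reasonably recent Docker engine you can tail the stream locally or tell the daemon to route it elsewhere with a logging driver (the syslog host below is a placeholder):

# Tail the container's STDOUT/STDERR stream locally.
$ docker logs -f hubot

# Or route the stream to a remote syslog server instead of the local JSON log file.
$ docker run -d --log-driver=syslog \
    --log-opt syslog-address=udp://logs.example.com:514 hubot:latest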
If you use a process manager or init system like upstart, systemd, or supervisord with the remote-logging plug-in, it is usually very easy to direct all process output to STDOUT and then have your process monitor capture it and send it to a remote logging host.
Run admin/management tasks as one-off processes.
One-off administration tasks should be run via the exact same code base and configuration that the application uses. This helps avoid synchronization and code/schema drift problems. Oftentimes, management tools exist as one-off scripts or live in a completely different code base. It is much safer to build management tools within the application’s code base and to utilize the same libraries and functions to perform required work. This can significantly improve the reliability of these tools by ensuring that they leverage the same code paths that the application relies on to perform its core functionality.
What this means is that you should never rely on random cron-like scripts to perform administrative and maintenance functions. Instead, include all of these scripts and functionality in your application code base. Assuming that these don’t need to be run on every instance of your application, you can launch a special short-lived container whenever you need to run a maintenance job, which simply executes the one job, reports its status somewhere, and then exits.
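For instance, a database migration can be run from the very same image that serves production traffic; the image tag, script path, and connection string below are placeholders:

$ docker run --rm \
    -e APP_ENV=production \
    -e DATABASE_URL="postgres://appuser:secret@db.example.com/myapp" \
    myapp:1.0.0 node scripts/migrate.js

The --rm flag removes the container as soon as the job exits, so nothing is left behind once the task has reported its status.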
While it wasn’t written as a Docker-specific manifesto, almost all of this can be applied to writing and deploying applications on a Docker platform. This is in part because “The Twelve-Factor App” document heavily influenced the design of Docker, and in part because the manifesto itself codified many of the best practices promoted by modern software architects.
Riding alongside “The Twelve-Factor App,” another pertinent document was released in July of 2013 by Jonas Bonér, cofounder and CTO of Typesafe: “The Reactive Manifesto.” Jonas originally worked with a small group of contributors to solidify a manifesto that discusses how the expectations for application resiliency have evolved over the last few years, and how applications should be engineered to react in a predictable manner to various forms of interaction, including events, users, load, and failures.
The manifesto states that “Reactive Systems” are responsive, resilient, elastic, and message-driven.
The system responds in a timely manner if at all possible.
In general, this means that the application should respond to requests very quickly. Users simply don’t want to wait, and there is almost never a good reason to make them. If you have a containerized service that renders large PDF files, design it so that it immediately responds with a job-submitted message so that the user can go about their day, and then provide a message or banner that informs the user when the job is finished and where they can download the resulting PDF.
The system stays responsive in the face of failure.
When your application fails for any reason, the situation will always be worse if the application becomes unresponsive. It is much better to handle the failure gracefully, and dynamically reduce the application’s functionality or even display a simple but clear problem message to the user while reporting the issue internally.
The system stays responsive under varying workload.
With Docker, this is achieved by dynamically deploying and decommissioning containers as requirements and load fluctuate, so that your application is always able to handle requests quickly, without deploying a lot of underutilized resources.
Reactive systems rely on asynchronous message-passing to establish a boundary between components.
Although not directly addressed by Docker, the idea here is that there are times when an application can become busy or unavailable. If you utilize asynchronous message-passing between your services, you can help ensure that your service will not lose requests and that these will be processed as soon as possible.
All four of these design features require application developers to design graceful degradation and a clear separation of responsibilities into their applications. By treating all dependencies as attached resources, properly designed, dynamic container environments allow you to easily maintain n+2 status across your application stack, reliably scale individual services in your environment, and quickly replace unhealthy nodes.
The core ideas in “The Reactive Manifesto” merge very nicely with “The Twelve-Factor App” and the Docker workflow. These documents frame many of the most important discussions about the way you need to think and work if you want to be successful in meeting new expectations in the industry. The Docker workflow provides a practical way to implement many of these ideas in any organization in a completely approachable way.