Containers

Recent years have shown a lot of interest in Linux container technology. Technically, this approach is not new: operating systems such as Solaris supported containers a long time ago. However, Docker made the breakthrough in this technology by providing features to build, manage, and ship containers in a uniform way.

What is the difference between containers and virtual machines (VMs), and what makes containers so interesting?

Virtual machines act like a computer in a computer. They allow the runtime environment to be easily managed from the outside: creating, starting, stopping, and distributing machines in a fast and ideally automated way. If new servers need to be set up, a blueprint, an image, of the required type can be deployed without installing software from scratch every time. Snapshots of running environments can be taken to easily back up the current state.

In many ways containers behave like virtual machines. They are isolated from the host as well as from other containers, and run with their own network, their own file system, and potentially their own resources. The difference is that virtual machines run on a hardware abstraction layer, emulating a full computer including its operating system, whereas containers run directly on the host's kernel. Container processes are isolated from the rest of the system using operating system functionality and manage their own file system. Containers therefore behave like separate machines, but with native performance, since there is no overhead of an abstraction layer; the performance of virtual machines is naturally decreased by their abstraction. Whereas virtual machines provide full flexibility in choosing the operating system, containers always run on the same kernel, and therefore the same kernel version, as the host operating system. Containers consequently do not ship their own Linux kernel and can be minimized to their required binaries.

Container technologies such as Docker provide functionality to build, run, and distribute containers in a uniform way. Docker defines building container images as infrastructure as code, which again enables automation, reliability, and reproducibility. Dockerfiles define all the steps that are required to install the application including its dependencies, for example, the Java runtime and an application server. Each step in the Dockerfile corresponds to a command that is executed at image build time. Once a container is started from an image, it should contain everything that is required to fulfill its task.
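As an illustration, a minimal Dockerfile could look like the following; the base image, tag, and file paths are merely examples and would differ per project:

    # base image providing a Java runtime (example image and tag)
    FROM eclipse-temurin:17-jre

    # add the packaged application to the image
    COPY target/my-application.jar /opt/my-application.jar

    # command that is executed when a container starts from this image
    ENTRYPOINT ["java", "-jar", "/opt/my-application.jar"]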

Containers usually contain a single Unix process which represents a running service, for example an application server, a web server, or a database. If an enterprise system consists of several running servers, they run in individual containers.
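For example, an application server and a database could be started as two individual containers, one process each; the container names and the application image are illustrative:

    docker run -d --name app-server my-application:1.0.1
    docker run -d --name database postgres:16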

One of the biggest advantages of Docker containers is that they make use of a copy-on-write file system. Every build step, as well as every running container later on, operates on a layered file system, which does not change its layers but only adds new layers on top. Built images therefore comprise multiple layers.
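The layers of a built image can be inspected using the docker history command, which lists every layer together with the build step that created it; the image name is an example:

    docker history my-application:1.0.1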

Containers that are created from images are always started with the same initial state. Running containers potentially modify files, as new, temporary layers in the file system, which are discarded as soon as the containers are removed. By default, Docker containers are therefore stateless runtime environments. This encourages the idea of reproducibility. All persistent state needs to be defined explicitly.
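State that needs to survive the container lifecycle, such as database files, is typically declared as a volume. A sketch, assuming a PostgreSQL container and an example volume name:

    # the named volume db-data persists the data directory
    # beyond the lifecycle of individual containers
    docker run -d --name database \
        -v db-data:/var/lib/postgresql/data postgres:16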

The multiple layers are beneficial when rebuilding and redistributing images. Docker caches intermediate layers and only rebuilds and retransmits what has changed.

For example, an image build may consist of multiple steps. System binaries are added first, then the Java runtime, an application server, and finally our application. When changes are made to the application and a new build is required, only the last step is re-executed; the previous steps are cached. The same is true for transmitting images over the wire. Only the layers that have changed and that do not yet exist on the target registry are actually retransmitted.
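A sketch of this ordering in a Dockerfile; the image, paths, and directories are examples. When only the application archive changes, solely the last layer is rebuilt:

    # cached: system binaries and Java runtime
    FROM eclipse-temurin:17-jre

    # cached: application server installation
    COPY server/ /opt/server/

    # deliberately the last step: the application changes most often
    COPY target/my-application.war /opt/server/deployments/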

[Figure: the layers of a Docker image and their individual distribution]

Docker images are either built from scratch, that is, from an empty starting point, or built upon an existing base image. There are tons of base images available: for all major Linux distributions including their package managers, for typical environment stacks, as well as for Java-based applications. Base images are a way to build upon common ground and to provide basic functionality for all resulting images. For example, it makes sense to use a base image that includes a Java runtime installation. If this image needs to be updated, for example, to fix security issues, all dependent images can be rebuilt and receive the new contents by updating the base image version. As said before, software builds need to be repeatable; therefore, we always need to specify explicit versions for software artifacts such as images.
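A sketch of pinning an explicit base image version instead of a floating tag; the exact tag shown is illustrative:

    # an explicit version tag keeps the build repeatable
    FROM eclipse-temurin:17.0.10_7-jre

    # avoid floating tags such as eclipse-temurin:latest,
    # which may resolve to different contents over time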

Containers that are started from previously built Docker images need access to these images. They are distributed using Docker registries, such as the publicly available Docker Hub, or company-internal registries for distributing a company's own images. Locally built images are pushed to these registries and retrieved on the environments that will start new containers later on.
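As a sketch of this flow, assuming a hypothetical company-internal registry at registry.example.com:

    # tag the locally built image for the registry and push it
    docker tag my-application:1.0.1 registry.example.com/my-application:1.0.1
    docker push registry.example.com/my-application:1.0.1

    # on the target environment: retrieve the image and start a container
    docker pull registry.example.com/my-application:1.0.1
    docker run -d registry.example.com/my-application:1.0.1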
