Now that we have some experience working with Docker containers and images, we can explore some of Docker’s other capabilities. In this chapter, we’ll continue to use the docker command-line tool to talk to the running docker daemon that you’ve configured, while visiting some of the other fundamental commands.
Docker provides commands to do a number of additional things easily:
Printing the Docker version
Viewing the server information
Downloading image updates
Inspecting containers
Entering a running container
Returning a result
Viewing logs
Monitoring statistics
Let’s take a look at some of those and some community tooling that augments Docker’s native capabilities.
If you completed the last chapter, you have a working Docker daemon on a Linux server or virtual machine, and you’ve started a base container to make sure it’s all working. If you haven’t set that up already and you want to try out the steps in the rest of the book, you’ll want to follow the installation steps in Chapter 3 before you move on with this section.
The absolute simplest thing you can do with Docker is print the versions of the various components. It might not sound like much, but this is a useful tool to have in your belt because the server and API are often not backwards compatible with older clients. Knowing how to show the version will help you troubleshoot certain types of connection issues. Note that this command actually talks to the remote Docker server. If you can’t connect to the server for any reason, the client will complain. If you find that you have a connectivity problem, you should probably revisit the steps in the last chapter.
You can always directly log in to the Docker server and run docker
commands from a shell on the server if you are troubleshooting issues or simply do not want to use the docker
client to connect to a remote system.
Since we just installed all of the Docker components at the same time, when we run docker version, we should see that all of our versions match:
$ docker version
Client version: 1.3.1
Client API version: 1.15
Go version (client): go1.3.3
Git commit (client): 4e9bbfa
OS/Arch (client): linux/amd64
Server version: 1.3.1
Server API version: 1.15
Go version (server): go1.3.3
Git commit (server): 4e9bbfa
Notice how we have different lines representing the client, server, and API versions. It’s important to note that different versions of the command-line tools might use the same Docker API version. Even when they do, sometimes Docker won’t let you talk to a remote server that doesn’t exactly match. Now you know how to verify this information.
In versions of Docker prior to 1.10, the docker client would return an error whenever you tried to connect to a server using an older version of the API. You can now work around this by setting the DOCKER_API_VERSION environment variable to the server’s API version, as shown in this example:
$ docker ps
Error response from daemon: client is newer than server (client API version: 1.22, server API version: 1.21)
$ export DOCKER_API_VERSION="1.21"
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
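When a script has to talk to servers of differing ages, one option is to pin the client to the lower of the two API versions. The following is a minimal sketch of that comparison; the helper name lower_api_version is hypothetical, and it assumes versions of the form MAJOR.MINOR:

```shell
# Print the lower of two Docker API versions of the form MAJOR.MINOR.
# lower_api_version is a hypothetical helper name, not part of Docker.
lower_api_version() {
  # Compare each dot-separated field numerically, so 1.9 sorts before 1.21.
  printf '%s\n%s\n' "$1" "$2" | sort -t . -k 1,1n -k 2,2n | head -n 1
}

# Example use, with the versions from the error message shown earlier:
#   export DOCKER_API_VERSION="$(lower_api_version 1.22 1.21)"
```

Comparing the fields numerically matters here, because a plain string comparison would place 1.21 before 1.9.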
We can also find out a little bit about the Docker server itself via the Docker client. Later we’ll talk more about what all of this means, but you can find out which filesystem backend the Docker server is running, which kernel version it is on, which operating system it is running on, and how many containers and images are currently stored there. If you run docker info, you will see something similar to this:
$ docker info
Containers: 22
Images: 180
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Dirs: 224
Execution Driver: native-0.2
Kernel Version: 3.8.0-29-generic
Operating System: Ubuntu precise (12.04.3 LTS)
Depending on how your Docker daemon is set up, this might look somewhat different. Don’t be concerned about that; this is just to give you an example. Here we can see that our server is an Ubuntu 12.04.3 LTS release running the 3.8.0 Linux kernel and backed with the AUFS filesystem driver. We also have a lot of images! With a fresh install, this number should be zero.
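If you want to use one of these values in a script, you can filter the docker info output rather than reading it by eye. This is a rough sketch that assumes the "Key: value" layout shown above; info_field is a hypothetical helper name:

```shell
# Print the value of one "Key: value" field from `docker info` output.
# Reads the output on stdin; info_field is a hypothetical helper name.
info_field() {
  awk -v key="$1" '
    {
      line = $0
      sub(/^[ \t]+/, "", line)        # nested fields are indented
      idx = index(line, ": ")
      if (idx > 0 && substr(line, 1, idx - 1) == key) {
        print substr(line, idx + 2)
        exit
      }
    }'
}

# Example use:
#   docker info | info_field "Storage Driver"
```

Because the layout of docker info differs between Docker versions, treat the field names as something to verify against your own server’s output.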
In most installations, /var/lib/docker will be the default root directory used to store images and containers. If you need to change this, you can edit your Docker startup scripts to launch the daemon with the --graph argument pointing to a new storage location. To test this by hand, you could run something like this:
$ sudo docker -d -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375 --graph="/data/docker"
We’re going to use an Ubuntu base image for the following examples. Even if you have already grabbed the Ubuntu base image once, you can pull it again and it will automatically pick up any updates that have been published since you last ran it. That’s because latest is a tag that, by convention, is always moved to the most recent version of the image that has been published to the image registry. Invoking the pull will look like this:
$ docker pull ubuntu:latest
Pulling repository ubuntu
5506de2b643b: Download complete
511136ea3c5a: Download complete
d497ad3926c8: Download complete
ccb62158e970: Download complete
e791be0477f2: Download complete
3680052c0f5c: Download complete
22093c35d77b: Download complete
That command pulled down only the layers that have changed since we last ran the command. You might see a longer or shorter list, or even an empty list, depending on when you ran it and what changes have been pushed to the registry since then.
It’s good to remember that even though you pulled latest, docker won’t automatically keep the local image up-to-date for you. You’ll be responsible for doing that yourself. However, if you deploy an image based on a newer copy of ubuntu:latest, Docker will download the missing layers during the deployment just like you would expect.
As of Docker 1.6, it is now possible to pull a specific version of an image from Docker Hub or any registry based on Docker’s Registry 2.0 codebase by using the digest attached to the desired image. This is useful when you want to ensure that you are pulling a very specific image build and don’t want to rely on a tag, which can potentially be moved.
$ docker pull ubuntu@sha256:2f9a...82cf
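In a deployment script, you might want to refuse any image reference that relies on a movable tag. The sketch below checks only the shape of the reference (a name, then @sha256: and 64 hexadecimal characters); is_digest_ref is a hypothetical helper name, and it does not contact a registry:

```shell
# Succeed if the image reference pins a sha256 digest (name@sha256:<64 hex>),
# fail if it relies on a tag. is_digest_ref is a hypothetical helper name.
is_digest_ref() {
  printf '%s\n' "$1" | grep -Eq '^[^@]+@sha256:[0-9a-f]{64}$'
}

# Example use, with IMAGE_REF standing in for whatever reference you deploy:
#   is_digest_ref "$IMAGE_REF" || echo "refusing to deploy a tag-based reference" >&2
```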
Once you have a container created, running or not, you can use docker inspect to see how it was configured. This is often useful in debugging, and the output also includes other information that can be useful when identifying a container.
For this example, let’s go ahead and start up a container.
$ docker run -d -t ubuntu /bin/bash
3c4f916619a5dfc420396d823b42e8bd30a2f94ab5b0f42f052357a68a67309b
We can list all our running containers with docker ps to ensure everything is running as expected, and to copy the container ID.
$ docker ps
CONTAINER ID  IMAGE          COMMAND      ... STATUS        ... NAMES
3c4f916619a5  ubuntu:latest  "/bin/bash"  ... Up 31 seconds ... angry_mestorf
In this case, our ID is 3c4f916619a5. We could also use angry_mestorf, which is the dynamic name assigned to our container. Underlying tools all need the unique container ID, though, so it’s useful to get into the habit of looking at that first. As is the case in many revision control systems, this hash is actually just the prefix of a much longer hash. Internally, Docker identifies the container by a 64-character hexadecimal ID. But that’s painful for humans to use, so Docker supports the shortened hash.
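To make the relationship between the two forms concrete, the short ID is simply the first 12 characters of the full one. A small sketch, using a hypothetical helper named short_id:

```shell
# Print the 12-character short form of a full 64-character container ID.
# short_id is a hypothetical helper name.
short_id() {
  printf '%s\n' "$1" | cut -c 1-12
}

# To see full IDs in the listing instead, docker ps accepts --no-trunc:
#   docker ps --no-trunc
```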
The output of docker inspect is pretty verbose, so we’ll cut it down in the following code block to a few values worth pointing out. You should look at the full output to see what else you think is interesting:
$ docker inspect 3c4f916619a5
[{
    "Args": [],
    "Config": {
        "Cmd": [
            "/bin/bash"
        ],
        "Env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
        ],
        "Hostname": "3c4f916619a5",
        "Image": "ubuntu"
    },
    "Created": "2014-11-07T22:06:32.229471304Z",
    "Id": "3c4f916619a5dfc420396d823b42e8bd30a2f94ab5b0f42f052357a68a67309b",
    "Image": "5506de2b643be1e6febbf3b8a240760c6843244c41e12aa2f60ccbb7153d17f5"
}]
Note that long "Id" string. That’s the full unique identifier of this container. Luckily we can use the shortened ID, even if that’s still not especially convenient. We can also see the exact time when the container was created in a much more precise way than docker ps gives us.
Some other interesting things are shown here as well: the top-level command in the container, the environment that was passed to it at creation time, the image on which it’s based, and the hostname inside the container. All of these are configurable at container creation time if you need to do so. The usual method for passing configuration to containers, for example, is via environment variables, so being able to see how a container was configured via docker inspect
can reveal a lot when debugging.
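When you only care about one setting, docker inspect can print selected fields through its --format option, which takes a Go template. The following sketch looks up a single environment variable; container_env_var is a hypothetical helper name, and the template assumes the Config.Env layout shown in the output above:

```shell
# Print the value of one environment variable from a container's configuration.
# Usage: container_env_var <container-id> <VARIABLE_NAME>
# container_env_var is a hypothetical helper name.
container_env_var() {
  # The template prints each NAME=value entry of Config.Env on its own line;
  # sed then keeps only the requested variable and strips the "NAME=" prefix.
  docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$1" |
    sed -n "s/^$2=//p"
}

# Example use:
#   container_env_var 3c4f916619a5 PATH
```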
You can pretty easily get a shell running in a new container as we demonstrated above with docker run
. But it’s not the same as getting a new shell inside an existing container that is actively running your application. Every time you use docker run
, you get a new container. But if you have an existing container that is running an application and you need to debug it from inside the container, you need something else.
Because Docker originally used the LXC backend by default, the Linux lxc-attach command was once the easiest way to enter a running container. But now that Docker uses libcontainer by default, lxc-attach is no longer useful for most people. Since Docker containers are Linux namespaces, however, tools like the docker exec command and nsenter support this functionality more broadly.
First, let’s look at the newest and best way to get inside a running container. From Docker 1.3 and up, the docker daemon and docker command-line tool support remotely executing a shell into a running container via docker exec. So let’s start up a container in background mode, and then enter it using docker exec.
We’ll need our container’s ID, like we did above when we inspected it. I just did that, and my container’s ID is 589f2ad30138. We can now use that to get inside the container. The command line to docker exec, unsurprisingly, looks a lot like the command line to docker run. We request a pseudo-tty and an interactive command:
$ docker exec -t -i 589f2ad30138 /bin/bash
root@589f2ad30138:/#
Note that we got a command line back that tells us the ID of the container we’re running inside. That’s pretty useful for keeping track of where we are. We can now run a ps to see what else is running inside our container. We should see our other bash process that we backgrounded earlier.
root@589f2ad30138:/# ps -ef
UID  PID PPID C STIME TTY TIME     CMD
root   1    0 0 23:13 ?   00:00:00 /bin/bash
root   9    0 1 23:14 ?   00:00:00 /bin/bash
root  17    9 0 23:14 ?   00:00:00 ps -ef
You can also run additional processes in the background via docker exec. You use the -d option just like with docker run. But you should think hard about doing that for anything but debugging because you lose the repeatability of the image deployment if you depend on this mechanism. Other people would then have to know what to pass to docker exec to get the desired functionality. If you’re tempted to do this, you would probably reap bigger gains from rebuilding your container image to launch both processes in a repeatable way.
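The pseudo-tty requested with -t only makes sense when a terminal is attached. If you call docker exec from a script or a scheduled job, one approach is to request the pseudo-tty only when standard input is a terminal. This is a sketch under that assumption; exec_flags is a hypothetical helper name:

```shell
# Choose docker exec flags: request a pseudo-TTY only when stdin is a terminal.
# exec_flags is a hypothetical helper name.
exec_flags() {
  if [ -t 0 ]; then
    echo "-i -t"
  else
    echo "-i"
  fi
}

# Example use (left unquoted on purpose so the flags split into words):
#   docker exec $(exec_flags) 589f2ad30138 ps -ef
```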
Part of the core util-linux
package from kernel.org is nsenter
, short for “Namespace Enter,” which allows you to enter any Linux namespace. In Chapter 10, we’ll go into more detail on namespaces. But they are the core of what makes a container a container. Using nsenter
, therefore, we can get into a Docker container from the server itself, even in situations where the Docker daemon is not responding and we can’t use docker exec
. nsenter
can also be used to manipulate things in a container as root on the server that would otherwise be prevented by docker exec
, for example. This can be really useful when debugging. Most of the time, docker exec
is all you need. But you should have nsenter
in your tool belt.
Most Linux distributions ship with the util-linux package. But few ship with a version that is new enough to include nsenter, because it’s a recent addition to the package. So the easiest way to get ahold of nsenter is to install it via a third-party Docker container. This works by pulling a Docker image from the Docker Hub registry and then running a specially crafted Docker container that will install the nsenter command-line tool into /usr/local/bin. This might seem strange at first, but it’s a clever way to allow you to install nsenter to any Docker server remotely using nothing more than the docker command.
The following code shows how we install nsenter to /usr/local/bin on your Docker server:
$ docker run --rm -v /usr/local/bin:/target jpetazzo/nsenter
Unable to find image 'jpetazzo/nsenter' locally
Pulling repository jpetazzo/nsenter
9e4ef84f476a: Download complete
511136ea3c5a: Download complete
71d9d77ae89e: Download complete
Status: Downloaded newer image for jpetazzo/nsenter:latest
Installing nsenter to /target
Installing docker-enter to /target
You should be very careful about doing this! It’s always a good idea to check out what you are running, and particularly what you are exposing part of your filesystem to, before you run a third-party container on your system. With -v
, we’re telling Docker to expose the host’s /usr/local/bin directory into the running container as /target
. When the container starts, it is then copying an executable into that directory on our host’s filesystem. In Chapter 10, we will discuss some security frameworks and commands that can be leveraged to prevent potentially nefarious container activities.
Unlike docker exec
, which can be run remotely, nsenter
requires that you run it on the server itself. The README
in the GitHub repo explains how to set this up to work over SSH automatically if you want to do that. For our purposes, we’ll log in to our Docker server via SSH and then invoke the command from the server. In any case, like with docker exec
, we need to have a container running. You should still have one running from above. If not, go back and start one, and then ssh
into your server.
docker exec
is pretty simple, but nsenter
is a little inconvenient to use. It needs to have the PID of the actual top-level process in your container. That’s less than obvious to find and requires a few steps. Luckily there’s a convenience wrapper installed by that Docker container we just ran, called docker-enter
, which takes away the pain. But before we jump to the convenience wrapper, let’s run nsenter
by hand so you can see what’s going on.
First we need to find out the ID of the running container, because nsenter needs to know that to access it. This is the same as previously shown for docker inspect and docker exec:
$ docker ps
CONTAINER ID  IMAGE          COMMAND      ... NAMES
3c4f916619a5  ubuntu:latest  "/bin/bash"  ... grave_goldstine
The ID we want is that first field, 3c4f916619a5. Armed with that, we can now find the PID we need. We do that like this:
$ PID=$(docker inspect --format '{{.State.Pid}}' 3c4f916619a5)
This will store the PID we care about in the PID shell variable. We need to have root privilege to do what we’re going to do. So you should either su to root or use sudo on the command line. Now we invoke nsenter:
$ sudo nsenter --target $PID --mount --uts --ipc --net --pid
root@3c4f916619a5:/#
If the end result looks a lot like docker exec
, that’s because it does almost exactly the same thing under the hood!
There are a lot of command-line options there, and what they’re doing is telling nsenter
which parts of the container we need access to. Generally you want all of them, so you might expect that to be the default, but it’s not, so we specify them all.
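The two manual steps, looking up the PID and then entering the namespaces, can be combined into a small wrapper. The following is only a sketch of the idea, not the docker-enter script itself; enter_container is a hypothetical name, and the check relies on Docker reporting a PID of 0 for a container that is not running:

```shell
# Look up the PID of a container's top-level process and enter its namespaces.
# enter_container is a hypothetical wrapper name, not a Docker or nsenter command.
enter_container() {
  target_pid=$(docker inspect --format '{{.State.Pid}}' "$1") || return 1
  # Docker reports a PID of 0 for a container that is not running.
  if [ "$target_pid" -gt 0 ] 2>/dev/null; then
    sudo nsenter --target "$target_pid" --mount --uts --ipc --net --pid
  else
    echo "container $1 is not running" >&2
    return 1
  fi
}

# Example use, run on the Docker server itself:
#   enter_container 3c4f916619a5
```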
Neither nsenter nor docker exec works well for exploring a container that does not contain a Unix shell. In this case you usually need to explore the container from the Docker server by navigating directly to where the container filesystem resides on storage. This will typically look something like /var/lib/docker/aufs/mnt/365c…87a3, but will vary based on the Docker setup, storage backend, and container hash. You can determine your Docker root directory by running docker info.
Back at the beginning of this section we mentioned that there is a convenience wrapper called docker-enter that gets installed by running the installation Docker container. Having seen the mechanism involved with running nsenter, you can now appreciate that if you actually just want to enter all the namespaces for the container and skip several steps, you can do this:
$ sudo docker-enter 3c4f916619a5 /bin/bash
root@3c4f916619a5:/#
In Docker 1.9, a new volume subcommand was added to the docker client. Using this, it is possible to list all of the volumes stored in your root directory and then discover additional information about them, including where they are physically stored on the server.
# docker volume ls
DRIVER VOLUME NAME
local  ca5ee542deefe42ad9004...
local  6680f5dabe4dcd73b89bd...
local  2be661504f3767227fd37...
# docker volume inspect 2be661504f3767227fd37...
[
    {
        "Name": "2be661504f3767227fd37...",
        "Driver": "local",
        "Mountpoint": "/var/lib/docker/volumes/2be661504f3767227fd37.../_data"
    }
]
The volume
subcommand also allows you to create and remove volumes.
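To see where every volume lives without inspecting each one by hand, you can loop over the volume names. This sketch assumes the -q (quiet) option of docker volume ls and the --format option of docker volume inspect; volume_mountpoints is a hypothetical helper name:

```shell
# Print each volume name followed by its mountpoint on the server, tab-separated.
# volume_mountpoints is a hypothetical helper name.
volume_mountpoints() {
  docker volume ls -q | while read -r name; do
    # stdin is redirected so the inner command cannot consume the name list.
    mountpoint=$(docker volume inspect --format '{{.Mountpoint}}' "$name" </dev/null)
    printf '%s\t%s\n' "$name" "$mountpoint"
  done
}

# Example use:
#   volume_mountpoints
```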
With these commands, you should be able to explore your containers in great detail. Once we’ve explained namespaces more in Chapter 10, you’ll get a better understanding of exactly how all these pieces interact and combine to create a container.
One way or another, either by launching a container with a foreground shell or via one of the other mechanisms above, we’ve got a shell running inside a container. So, let’s look around a little bit. What processes are running?
$ ps -ef
UID  PID PPID C STIME TTY TIME     CMD
root   1    0 0 22:12 ?   00:00:00 /bin/bash
root  12    1 0 22:16 ?   00:00:00 ps -ef
Wow, that’s not much, is it? It turns out that when we told docker
to start bash
, we didn’t get anything but that. We’re inside a whole Linux distribution image, but no other processes started for us automatically. We only got what we asked for. It’s good to keep that in mind going forward.
Docker containers don’t, by default, start anything in the background like a full virtual machine would. They’re a lot lighter weight than that and therefore don’t start an init
system. You can, of course, run a whole init
system if you need to, but you have to ask for it. We’ll talk about that in a later chapter.
That’s how we get a shell running in a container. You should feel free to poke around and see what else looks interesting inside the container. Note that you might have a pretty limited set of commands available. You’re in an Ubuntu distribution, though, so you can fix that by using apt-get
to install more packages. Note that these are only going to be around for the life of this container. You’re modifying the top layer of the container, not the base image!
Most people would not think of spinning up a virtual machine to run a single process and then return the result because doing so would be very time consuming and require booting a whole operating system to simply execute one command. But Docker doesn’t work the same way as virtual machines: containers are very lightweight and don’t have to boot up like an operating system. Running something like a quick background job and waiting for the exit code is a normal use case for a Docker container. You can think of it as a way to get remote access to a containerized system and have access to any of the individual commands inside that container with the ability to pipe data to and from them and return exit codes.
This can be useful in lots of scenarios: you might, for instance, have system health checks run this way remotely, or have a series of machines with processes that you spin up via Docker to process a workload and then return. The docker
command-line tools proxy the results to the local machine. If you run the remote command in foreground mode and don’t specify doing otherwise, docker
will redirect its stdin
to the remote process, and the remote process’s stdout
and stderr
to your terminal. The only things we have to do to get this functionality are to run the command in the foreground and not allocate a TTY on the remote. This is actually the default configuration! No command-line options are required. We do need to have a container configured and ready to run.
The following code shows what you can do:
$ docker run 8d12decc75fe /bin/false
$ echo $?
1
$ docker run 8d12decc75fe /bin/true
$ echo $?
0
$ docker run 8d12decc75fe /bin/cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
nobody:x:99:99:Nobody:/:/sbin/nologin
$ docker run 8d12decc75fe /bin/cat /etc/passwd | wc -l
8
Here we executed /bin/false
on the remote server, which will always exit with a status of 1. Notice how docker
proxied that result to us in the local terminal. Just to prove that it returns other results, we also run /bin/true
, which will always return a 0. And there it is.
Then we actually ask docker
to run cat /etc/passwd
on the remote container. What we get is the result of that container’s version of the /etc/passwd
file. Because that’s just regular output on stdout
, we can pipe it into local commands just like we would anything else.
The previous code pipes the output into the local wc
command, not a wc
command in the container. The pipe itself is not passed to the container. If you want to pass the whole command, including the pipes, to the server, you need to invoke a complete shell on the remote side and pass a quoted command, like bash -c "<your command> | <something else>"
. In the previous code, that would be: docker run 8d12decc75fe /bin/bash -c "/bin/cat /etc/passwd | wc -l"
.
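Because the exit code is proxied back to the local shell, a health check can treat docker run like any other command. The following sketch wraps an arbitrary command and reports on its exit status; check is a hypothetical helper name, and nothing in it is specific to Docker:

```shell
# Run the given command with its output hidden, print OK or FAIL with the
# exit status, and return that same status. check is a hypothetical name.
check() {
  status=0
  "$@" >/dev/null 2>&1 || status=$?
  if [ "$status" -eq 0 ]; then
    echo "OK"
  else
    echo "FAIL (exit $status)"
  fi
  return "$status"
}

# Example use, with the image ID from the examples above:
#   check docker run 8d12decc75fe /bin/true
```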
Logging is a critical part of any production application. There are common ways we expect to interact with application logs on Linux systems. If you’re running a process on a box, you might expect the output to go to a local logfile that you could read through. Or perhaps you might expect the output to simply be logged to the kernel buffer where it can be read from dmesg. Because of the container’s restrictions, neither of these will work without some extra effort. But that’s OK, because logging is first class in Docker. First we’ll talk about the simple case, using the default logging mechanism. There are limitations to this mechanism, which we’ll explain in a minute, but for the simple case it works well. The mechanism is docker logs.
The way this works is that anything sent to stdout
or stderr
in the container is captured by the Docker daemon and streamed into a configurable backend, which is by default a JSON file for each container. We’ll cover that first, then talk about other options. The default logging mechanism lets us retrieve logs for any container at any time like this, showing some logs from a container running nginx
:
$ docker logs 3c4f916619a5
nginx stderr | 2014/11/20 00:34:56 [notice] 12#0: using the "epoll" ...
nginx stderr | 2014/11/20 00:34:56 [notice] 12#0: nginx/1.0.15
nginx stderr | 2014/11/20 00:34:56 [notice] 12#0: built by gcc 4.4.7 ...
nginx stderr | 2014/11/20 00:34:56 [notice] 12#0: OS: Linux 3.8.0-35-generic
This is nice because Docker allows you to get the logs remotely, right from the command line, on demand. That’s really useful for low-volume logging.
To limit the log output to more recent logs, you can use the --since
option to only display logs after a specified RFC 3339 date (e.g., 2002-10-02T10:00:00-05:00), Unix timestamp (e.g., 1450071961), or Go duration string (e.g., 5m45s). You may also use --tail
followed by a number of lines to tail.
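The two options can be combined. As a small sketch, the wrapper below shows recent output for a container with defaults of the last five minutes and at most 100 lines; recent_logs is a hypothetical helper name and both defaults are arbitrary:

```shell
# Show recent logs for a container.
# Usage: recent_logs <container-id> [since] [max-lines]
# recent_logs is a hypothetical helper name; the defaults are arbitrary.
recent_logs() {
  docker logs --since "${2:-5m}" --tail "${3:-100}" "$1"
}

# Example use:
#   recent_logs 3c4f916619a5 5m45s 20
```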
The actual files backing this logging are on the Docker server itself, by default in /var/lib/docker/containers/<container_id>/ where the <container_id> is replaced by the actual container ID. If you take a look at one of those files, you’ll see it’s a file with each line representing a JSON object. It will look something like this:
{"log":"2015-01-02 23:58:51,003 INFO success: running. ", "stream":"stdout", "time":"2015-01-02T23:58:51.004036238Z"}
That log field is exactly what was sent to stdout
on the process in question; the stream field tells us that this was stdout
and not stderr
; and the precise time that the Docker daemon received it is provided in the time field. It’s an uncommon format for logging, but it’s structured rather than just a raw stream, which is beneficial if you want to do anything with the logs later.
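Because every line carries its stream name, you can separate stdout from stderr after the fact. The sketch below does this with a plain text match on the stream field, which is a rough shortcut that assumes the field is written without spaces around the colon; a real JSON parser would be more robust. lines_from_stream is a hypothetical helper name:

```shell
# Keep only the log lines that came from one stream.
# Usage: lines_from_stream <stdout|stderr>, with json-file log lines on stdin.
# lines_from_stream is a hypothetical helper name.
lines_from_stream() {
  grep -F "\"stream\":\"$1\""
}

# Example use, where "$logfile" is a JSON log file from the directory above:
#   lines_from_stream stderr < "$logfile"
```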
As with a logfile, you can also tail the Docker logs live with docker logs -f:
$ docker logs -f 3c4f916619a5
nginx stderr | 2014/11/20 00:34:56 [notice] 12#0: using the "epoll" ...
nginx stderr | 2014/11/20 00:34:56 [notice] 12#0: nginx/1.0.15
nginx stderr | 2014/11/20 00:34:56 [notice] 12#0: built by gcc 4.4.7 ...
nginx stderr | 2014/11/20 00:34:56 [notice] 12#0: OS: Linux 3.8.0-35-generic
This looks identical to the usual docker logs, but the client then blocks, waiting on and displaying any new logs to appear, much like the Linux command line tail -f.
By configuring the tag log option similar to --log-opt tag="{{.ImageName}}/{{.ID}}", it is possible to change the default log tag (which every log line will start with) to something more useful. By default, Docker logs will be tagged with the first 12 characters of the container ID.
The default settings do not currently enable log rotation. You’ll want to make sure you specify the --log-opt max-size and --log-opt max-file settings if running in production. Those limit the largest file size before rotation and the maximum number of log files to keep, respectively. max-file does not do anything unless you’ve also set max-size to tell Docker when to rotate the logs. Note that when this is enabled, the docker logs mechanism will only return data from the current log file.
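As an illustration, the wrapper below starts a container with both rotation options set. The name run_with_rotation and the LOG_MAX_SIZE and LOG_MAX_FILE variables are hypothetical, and the default values (10 MB per file, three files) are arbitrary examples rather than recommendations:

```shell
# Start a background container with json-file log rotation enabled.
# run_with_rotation, LOG_MAX_SIZE, and LOG_MAX_FILE are hypothetical names;
# the default values are arbitrary examples.
run_with_rotation() {
  docker run -d \
    --log-opt max-size="${LOG_MAX_SIZE:-10m}" \
    --log-opt max-file="${LOG_MAX_FILE:-3}" \
    "$@"
}

# Example use:
#   run_with_rotation ubuntu /bin/bash
```

With these example values, a single container’s logs are capped at roughly 30 MB on disk.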
For single host logging, this mechanism is pretty good. Its shortcomings are around log rotation, access to the logs remotely once they’ve been rotated, and disk space usage for high-volume logging. When this isn’t enough (and in production it probably isn’t), Docker also supports configurable logging backends. Currently supported are the native json-file backend we described above, along with syslog, fluentd, journald, gelf, awslogs, and splunk, which are used for sending logs to various remote logging frameworks. The simplest supported option for running Docker at scale, though not the best, as we’ll describe, is to send your container logs to syslog directly from Docker. You can specify this on the Docker command line with the --log-driver=syslog option.
If you change the log driver to anything other than the default (json-file
), then you will no longer be able to use the docker logs
command. It is assumed that you will have another means of accessing your logs in that case.
In addition, with the syslog driver the Docker daemon itself will now be writing to /dev/log. This is usually read by the syslog daemon. If that blocks, the logging will buffer into memory in the Docker process. At the time of this writing, further work is being done on this feature to mitigate that effect. As a result of this deficiency, we can’t currently recommend this solution at scale. This shortcoming affects the other remote mechanisms as well.
You can also log directly to a remote syslog server by setting the log option syslog-address similar to this: --log-opt syslog-address=tcp://192.168.42.42:123
.
For a long time after the Docker project began, json-file was the only supported logging method. As a result, there are community contributions offering many alternate ways of logging at scale. Note that many of these mechanisms are also incompatible with docker logs. The most common solution is to use a method to send your logs directly to syslog. There are several mechanisms in use:
Log directly from your application.
Have a process manager in your container relay the logs (e.g., systemd
, upstart
, supervisor
, or runit
).
Run a logging relay in the container that wraps stdout
/stderr
from the container.
Relay the Docker JSON logs themselves to a remote logging framework from the server or another container.
Some third-party libraries and programs, like supervisor
, write to the file system for various (and sometimes unexpected) reasons. If you are trying to design clean containers that do not write directly into the container filesystem, you should consider utilizing the --read-only
and --tmpfs
options to docker run
that we discussed in Chapter 5.
Many of these options share the drawbacks of changing the logging driver in Docker itself: they hide logs from docker logs, so you can’t see them on an individual container during debugging as easily without relying on an external application. So let’s take a look at how these options stack up.
Logging directly from your application to syslog might make sense, but if you’re running any third-party applications, that probably won’t work and it’s inflexible to changes once deployed inside tens of production applications. And unless you also emit logs on stdout
and stderr
, they will not be visible in docker logs
.
Spotify has released a simple, statically linked Go relay to handle logging your stderr
and stdout
to syslog for one process inside the container. Generally you run this from the CMD
line in the Dockerfile. Because it’s statically compiled, it has no dependencies, which makes it very flexible. It swallows the loglines, however, so they are not visible in docker logs
.
Although controversial in the Docker community, running a process manager in the container is a reasonably easy way to capture output and direct it to a central logging service. New Relic released a logging plug-in for supervisor
that does exactly that. This mechanism is nice because there is no additional machinery in the container other than what you’d already include. You do need Python installed, though, and it writes a certain amount of data to the container filesystem which can have deleterious effects on performance and container longevity. You can also emit the logs to stdout
and stderr
, which makes them available both locally with docker logs
and remotely.
If you want to have one system to support all your containers, a popular option is Logspout, which runs in a separate container, talks to docker
, and logs all the containers’ logs to syslog (UDP only). The advantage of this approach is that it does not preclude docker logs
but it does require that you set up log rotation.
Probably the best current option is to run Heka, from Mozilla. This is a mature and very robust logging framework with extensive filtering and routing options for your logs. It also has first-class support for Docker logging. You can run a single Heka daemon in either a container or natively on the Docker host and, like Logspout, it will attach to and follow all the logs from all of your containers. It’s much more flexible than Logspout, however, and supports a wide array of outputs and transformations. It is pretty simple to set up the first time but can scale to the limits of whatever you might want to do. This is what we currently recommend.
Finally, while you really should be capturing your logs somewhere, there are rare situations in which you simply don’t want any logging. You can use the --log-driver=none
switch to turn them off completely.
Running something in production is not a good idea unless you can tell what’s going on. In the modern world, we monitor everything and report as many statistics as we can. Docker supports some nice, basic reporting capabilities via docker stats and docker events. We’ll show you those and then look at a community offering from Google that does some nice graphing output as well.
In version 1.5.0, Docker added an endpoint for viewing stats of running containers. The command-line tool can stream from this endpoint and every few seconds report back on one or more listed containers, giving basic statistics information about what’s happening. docker stats, like the Linux top command, takes over the current terminal and updates the same lines on the screen with the current information. It’s hard to show that in print, so we’ll just give an example, but this updates every few seconds by default.
$ docker stats e64a279663aa
CONTAINER      CPU %   MEM USAGE/LIMIT       MEM %   NET I/O
e64a279663aa   0.00%   7.227 MiB/987.9 MiB   0.73%   936 B/468 B
Here we can see the container ID (but not the name), the amount of CPU it’s currently consuming, the amount of memory it has in use, and the limit of what it’s allowed to use. The percentage of memory utilized is also provided to make it easier to quickly determine how much free memory the container has available. Stats are also provided for network bytes both in and out.
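The MEM % column is simply usage divided by limit. As a quick check, here is a minimal Python sketch using the figures from the sample output above. The function name is ours for illustration and is not part of Docker.

```python
def mem_percent(usage_mib, limit_mib):
    """Return memory usage as a percentage of the limit."""
    return 100.0 * usage_mib / limit_mib


# Figures taken from the sample docker stats output above.
print("%.2f%%" % mem_percent(7.227, 987.9))  # prints 0.73%
```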
These are all of basic interest, but what they provide is not all that exciting. It turns out, though, that the Docker API provides a lot more information on the stats endpoint than is shown in the client. We’ve steered away from hitting the API in this book so far, but in this case the data provided by the API is so much richer that we’ll use curl to call it and see what our container is doing. It’s nowhere near as nice to read, but there is a lot more detail. This is a good intro to calling the API yourself as well.
The /stats/ endpoint that we’ll hit on the API will continue to stream stats to us as long as we keep the connection open. Since we humans can’t usefully parse a continuous stream of JSON, we’ll just ask for one line and then use Python to “pretty print” it. In order for this command to work, you’ll need to have Python installed (version 2.6 or later). If you don’t and you still want to see the JSON output, you can skip the pipe to Python, but you’ll get plain, ugly JSON back.
Here we call localhost, but you’ll want to use the hostname of your Docker server. Port 2375 is usually the right port. Note that we also pass the ID of our container in the URL we send to curl.
You can usually inspect the contents of the DOCKER_HOST environment variable, using something like echo $DOCKER_HOST, to discover the hostname or IP address of the Docker server that you are using.
It is easiest to run through the following example if you are running Docker on a full Linux distribution, with the Docker server bound to the default unencrypted port 2375.
First, let’s go ahead and start up a container that you can read stats from:
$ docker run -d ubuntu:latest sleep 1000
91c86ec7b33f37da9917d2f67177ebfaa3a95a78796e33139e1b7561dc4f244a
Now that the container is running, we can get an ongoing stream of statistics information about the container in JSON format by running something like curl with your container’s hash.
$ curl -s http://localhost:2375/v1/containers/91c8...244a/stats
This JSON stream of statistics will not stop on its own. So for now, we can use the Ctrl-C key combination to stop it.
To get a single group of statistics, we can run something similar to this:
$ curl -s http://localhost:2375/v1/containers/91c8...244a/stats | head -1
And finally, if we have Python or another tool capable of “pretty printing” JSON, we can make this output human-readable, as shown here:
$ curl -s http://localhost:2375/v1/containers/91c8...244a/stats | head -1 | python -m json.tool
{
    "blkio_stats": {
        "io_merged_recursive": [],
        "io_queue_recursive": [],
        "io_service_bytes_recursive": [
            {
                "major": 8,
                "minor": 0,
                "op": "Read",
                "value": 6098944
            },
            ...snip...
        ],
        "io_service_time_recursive": [],
        "io_serviced_recursive": [
            {
                "major": 8,
                "minor": 0,
                "op": "Read",
                "value": 213
            },
            ...snip...
        ],
        "io_time_recursive": [],
        "io_wait_time_recursive": [],
        "sectors_recursive": []
    },
    "cpu_stats": {
        "cpu_usage": {
            "percpu_usage": [
                37320425
            ],
            "total_usage": 37320425,
            "usage_in_kernelmode": 20000000,
            "usage_in_usermode": 0
        },
        "system_cpu_usage": 1884140000000,
        "throttling_data": {
            "periods": 0,
            "throttled_periods": 0,
            "throttled_time": 0
        }
    },
    "memory_stats": {
        "failcnt": 0,
        "limit": 1035853824,
        "max_usage": 7577600,
        "stats": {
            "active_anon": 1368064,
            "active_file": 221184,
            "cache": 6148096,
            "hierarchical_memory_limit": 9223372036854775807,
            "inactive_anon": 24576,
            "inactive_file": 5890048,
            "mapped_file": 2215936,
            "pgfault": 2601,
            "pgmajfault": 46,
            "pgpgin": 2222,
            "pgpgout": 390,
            "rss": 1355776,
            "total_active_anon": 1368064,
            "total_active_file": 221184,
            "total_cache": 6148096,
            "total_inactive_anon": 24576,
            "total_inactive_file": 5890048,
            "total_mapped_file": 2215936,
            "total_pgfault": 2601,
            "total_pgmajfault": 46,
            "total_pgpgin": 2222,
            "total_pgpgout": 390,
            "total_rss": 1355776,
            "total_unevictable": 0,
            "unevictable": 0
        },
        "usage": 7577600
    },
    "network": {
        "rx_bytes": 936,
        "rx_dropped": 0,
        "rx_errors": 0,
        "rx_packets": 12,
        "tx_bytes": 468,
        "tx_dropped": 0,
        "tx_errors": 0,
        "tx_packets": 6
    },
    "read": "2015-02-11T15:20:22.930379289-08:00"
}
There is a lot of information in there. We won’t spend much time going into the details, but you can get quite detailed memory usage information, as well as blkio and CPU usage information. If you are using CPU or memory limits in your containers, this endpoint is very useful for finding out when you are hitting them.
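To make the point about limits concrete, here is a minimal Python sketch that pulls a few headline numbers out of one decoded sample. It assumes the field names shown in the output above; the summarize function and the keys of its result are our own illustration, not part of any Docker tooling.

```python
import json


def summarize(sample):
    """Pull a few headline numbers out of one decoded stats sample."""
    mem = sample["memory_stats"]
    throttling = sample["cpu_stats"]["throttling_data"]
    return {
        "mem_percent": round(100.0 * mem["usage"] / mem["limit"], 2),
        # failcnt counts how many times memory usage hit the limit.
        "hit_memory_limit": mem["failcnt"] > 0,
        # throttled_periods counts scheduler periods in which the
        # container was throttled by a CPU limit.
        "cpu_throttled": throttling["throttled_periods"] > 0,
    }


# A trimmed copy of the sample output above, keeping only the fields we read.
line = """{"cpu_stats": {"throttling_data": {"periods": 0,
  "throttled_periods": 0, "throttled_time": 0}},
  "memory_stats": {"failcnt": 0, "limit": 1035853824, "usage": 7577600}}"""

print(summarize(json.loads(line)))
```

For this sample, the container is using well under 1% of its memory limit and neither flag is set.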
If you are doing your own monitoring, this is a great endpoint to hit as well. Note that one drawback is that it’s one endpoint per container, so you can’t get the stats about all containers from a single call.
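Because the endpoint is per container, a home-grown collector has to make one request for each container it cares about. The following is a rough Python 3 sketch of that loop under the same assumptions as the curl examples: an unencrypted daemon on port 2375 and the URL path used above. The function names are hypothetical, and you may need to adjust the path for your API version.

```python
import json
from urllib.request import urlopen


def stats_url(host, container_id, port=2375):
    """Build the stats URL in the same shape as the curl examples above."""
    return "http://%s:%d/v1/containers/%s/stats" % (host, port, container_id)


def read_one_sample(host, container_id):
    """Open the stream, decode the first line, then close the connection."""
    with urlopen(stats_url(host, container_id)) as stream:
        return json.loads(stream.readline().decode("utf-8"))


def poll(host, container_ids):
    """Make one request per container, since the endpoint is per container."""
    return {cid: read_one_sample(host, cid) for cid in container_ids}
```

Calling poll("localhost", [...]) with a list of full container IDs would return a dictionary of decoded samples keyed by container ID. Closing the connection after the first line plays the same role as head -1 in the shell pipeline.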
The docker daemon internally generates an events stream around the container life cycle. This is how various parts of the system find out what is going on in other parts. You can also tap into this stream to see what life cycle events are happening for containers on your Docker server. This, as you probably expect by now, is implemented in the docker CLI tool as another command-line argument. When you run this command, it will block and continually stream messages to you. Behind the scenes, this is a long-lived HTTP request to the Docker API that returns messages in JSON blobs as they occur. The docker CLI tool decodes them and prints some data to the terminal.
This event stream is useful in monitoring scenarios or in triggering additional actions, like wanting to be alerted when a job completes. For debugging purposes, it allows you to see when a container died even if Docker restarts it later. Down the road, this is a place where you might also find yourself directly implementing some tooling against the API. Here’s how we use it on the command line:
$ docker events
2015-02-18T14:00:39-08:00 1b3295bf300f: (from 0415448f2cc2) die
2015-02-18T14:00:39-08:00 1b3295bf300f: (from 0415448f2cc2) stop
2015-02-18T14:00:42-08:00 1b3295bf300f: (from 0415448f2cc2) start
In this example, we initiated a stop signal with docker stop, and the events stream logs this as a “die” message. The “die” message actually marks the beginning of the shutdown of the container. It doesn’t stop instantaneously. So, following the “die” message is a “stop” message, which is what Docker says when a container has actually stopped execution. Docker also helpfully tells us the ID of the image that the container is running on. This can be useful for tying deployments to events, for example, because a deployment usually involves a new image.
Once the container was completely down, we initiated a docker start to tell it to run again. Unlike the “die/stop” pair, this is a single event that marks the point at which the container is actually running. We don’t get a message telling us that someone explicitly started it. So what happens when we try to start a container that fails?
2015-02-18T14:03:31-08:00 e64a279663aa: (from e426f6ef897e) die
Note that here the container was actually asked to start, but it failed. Rather than seeing a “start” and a “die,” all we see is a “die.”
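If you capture this output for later analysis, the lines are regular enough to parse. Below is a minimal Python sketch that groups events by container. It assumes the exact line format shown in the examples above (other Docker versions may print events differently), and the function names are ours.

```python
import re
from collections import defaultdict

# Matches lines like:
# 2015-02-18T14:00:39-08:00 1b3295bf300f: (from 0415448f2cc2) die
EVENT_RE = re.compile(
    r"^(?P<time>\S+) (?P<container>[0-9a-f]+): "
    r"\(from (?P<image>[^)]+)\) (?P<event>\w+)$"
)


def parse_event(line):
    """Return a dict of fields for one event line, or None if it doesn't match."""
    match = EVENT_RE.match(line.strip())
    return match.groupdict() if match else None


def events_by_container(lines):
    """Map each container ID to the ordered list of events seen for it."""
    history = defaultdict(list)
    for line in lines:
        event = parse_event(line)
        if event:
            history[event["container"]].append(event["event"])
    return dict(history)


sample = [
    "2015-02-18T14:00:39-08:00 1b3295bf300f: (from 0415448f2cc2) die",
    "2015-02-18T14:00:39-08:00 1b3295bf300f: (from 0415448f2cc2) stop",
    "2015-02-18T14:00:42-08:00 1b3295bf300f: (from 0415448f2cc2) start",
    "2015-02-18T14:03:31-08:00 e64a279663aa: (from e426f6ef897e) die",
]
print(events_by_container(sample))
```

In the result, the first container shows die, stop, and start in order, while the container that failed to start shows only a lone die.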
If you have a server where containers are not staying up, the docker events stream is pretty helpful in seeing what’s going on and when. But if you’re not watching it at the time, Docker very helpfully caches some of the events and you can still get at them for some time afterward. You can ask it to display events after a time with the --since option, or before with the --until option. You can also use both to limit the window to a narrow scope of time when an issue you are investigating may have occurred. Both options take ISO time formats like those in the previous example (e.g., 2015-02-18T14:03:31-08:00).
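Typing those timestamps by hand is error-prone, so it can help to generate them. This Python 3 sketch builds the arguments for a window ending at a given time; the events_window_args helper is hypothetical, and the UTC-8 offset simply mirrors the timestamps in the examples.

```python
from datetime import datetime, timedelta, timezone


def events_window_args(end, minutes):
    """Return docker events arguments covering `minutes` up to `end`."""
    # Drop microseconds so the output matches the format shown above.
    end = end.replace(microsecond=0)
    start = end - timedelta(minutes=minutes)
    return [
        "docker", "events",
        "--since", start.isoformat(),
        "--until", end.isoformat(),
    ]


pacific = timezone(timedelta(hours=-8))
end = datetime(2015, 2, 18, 14, 5, 0, tzinfo=pacific)
print(" ".join(events_window_args(end, 5)))
# docker events --since 2015-02-18T14:00:00-08:00 --until 2015-02-18T14:05:00-08:00
```

The datetime passed in needs to be timezone-aware; otherwise isoformat() omits the UTC offset.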
docker stats and docker events are useful but don’t yet get us graphs to look at. And graphs are pretty helpful when trying to see trends. Of course, other people have filled some of this gap. When you begin to explore the options for monitoring Docker, you will find that many of the major monitoring tools now provide some functionality to help you improve the visibility into your containers’ performance and ongoing state.
In addition to the commercial tooling provided by companies like DataDog, GroundWork, and New Relic, there are plenty of options for open source tools like Nagios.
One of the best open source options today comes from Google, which released its own internal container advisor as an open source project on GitHub, called cAdvisor. Although cAdvisor can be run outside of Docker, the easiest implementation is to simply run it as a Docker container.
To install cAdvisor on an Ubuntu-based system, all you need to do is run this code:
$ docker run \
    --volume=/:/rootfs:ro \
    --volume=/var/run:/var/run:rw \
    --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:ro \
    --publish=8080:8080 \
    --detach=true \
    --name=cadvisor \
    google/cadvisor:latest
Unable to find image 'google/cadvisor:latest' locally
Pulling repository google/cadvisor
f0643dafd7f5: Download complete
...
ba9b663a8908: Download complete
Status: Downloaded newer image for google/cadvisor:latest
f54e6bc0469f60fd74ddf30770039f1a7aa36a5eda6ef5100cddd9ad5fda350b
On RHEL- and CentOS-based systems, you will need to add the following line to the docker run command shown above: --volume=/cgroup:/cgroup.
Once you have done this, you will be able to navigate to your Docker host on port 8080 to see the cAdvisor web interface (e.g., http://172.17.42.10:8080/) and the various detailed charts it has for the host and individual containers (see Figure 6-1).
cAdvisor provides a REST API endpoint, which can easily be queried for detailed information by your monitoring systems:
$ curl http://172.17.42.10:8080/api/v1.3/containers/
{
  "name": "/",
  "subcontainers": [
    {
      "name": "/docker"
    }
  ],
  "spec": {
    "creation_time": "2015-04-05T00:05:40.249999996Z",
    "has_cpu": true,
    "cpu": {
      "limit": 1024,
      "max_limit": 0,
      "mask": "0-7"
    },
    "has_memory": true,
    "memory": {
      "limit": 18446744073709551615,
      "swap_limit": 18446744073709551615
    },
    "has_network": true,
    "has_filesystem": true,
    "has_diskio": true
  },
  "stats": [
    {
      "timestamp": "2015-04-05T00:26:50.679218419Z",
      "cpu": {
        "usage": {
          "total": 123375166639,
          "per_cpu_usage": [
            41967365270,
            8589893874,
            11289461032,
            14350545587,
            11866977873,
            13414428349,
            12667210966,
            9229283688
          ],
          "user": 22990000000,
          "system": 43890000000
        },
        "load_average": 0
      },
      "diskio": {},
      "memory": {
        "usage": 394575872,
        "working_set": 227770368,
        "container_data": {
          "pgfault": 91617,
          "pgmajfault": 0
        },
        "hierarchical_data": {
          "pgfault": 91617,
          "pgmajfault": 0
        }
      },
      "network": {
        "rx_bytes": 0,
        "rx_packets": 0,
        "rx_errors": 0,
        "rx_dropped": 0,
        "tx_bytes": 0,
        "tx_packets": 0,
        "tx_errors": 0,
        "tx_dropped": 0
      },
      "filesystem": [
        {
          "device": "/dev/sda1",
          "capacity": 19507089408,
          "usage": 2070806528,
          "reads_completed": 1302,
          "reads_merged": 9,
          "sectors_read": 10706,
          "read_time": 1590,
          "writes_completed": 1283,
          "writes_merged": 1115,
          "sectors_written": 509824,
          "write_time": 4150,
          "io_in_progress": 0,
          "io_time": 590,
          "weighted_io_time": 5670
        }
      ],
      "task_stats": {
        "nr_sleeping": 0,
        "nr_running": 0,
        "nr_stopped": 0,
        "nr_uninterruptible": 0,
        "nr_io_wait": 0
      }
    },
    ...
  ]
}
As you can see, the amount of detail provided here should be sufficient for many of your graphing and monitoring needs.
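As an illustration of feeding this into your own monitoring, here is a small Python sketch that reduces one decoded response to a handful of numbers. It assumes the field names shown in the output above and that the last entry in stats is the most recent sample; summarize_cadvisor is our own name, not part of cAdvisor.

```python
MIB = 1024.0 * 1024.0


def summarize_cadvisor(container):
    """Reduce a decoded cAdvisor container response to a few key figures."""
    latest = container["stats"][-1]  # assumed to be the newest sample
    return {
        "timestamp": latest["timestamp"],
        "memory_usage_mib": round(latest["memory"]["usage"] / MIB, 1),
        "working_set_mib": round(latest["memory"]["working_set"] / MIB, 1),
        # Percentage of each filesystem's capacity currently in use.
        "filesystem_percent": {
            fs["device"]: round(100.0 * fs["usage"] / fs["capacity"], 1)
            for fs in latest.get("filesystem", [])
        },
    }


# A trimmed copy of the response above, keeping only the fields we read.
response = {
    "name": "/",
    "stats": [
        {
            "timestamp": "2015-04-05T00:26:50.679218419Z",
            "memory": {"usage": 394575872, "working_set": 227770368},
            "filesystem": [
                {"device": "/dev/sda1", "capacity": 19507089408, "usage": 2070806528}
            ],
        }
    ],
}
print(summarize_cadvisor(response))
```

For the sample above, this reports roughly 376 MiB of memory in use, a working set of about 217 MiB, and /dev/sda1 at a little over 10% full.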
That gives you all the basics you need to start running containers. It’s probably worth downloading a container or two from the Docker Hub registry and exploring a bit on your own to get used to the commands we just learned. There are many other things you can do with Docker, including:
Copying files in and out of the container with docker cp
Saving a container’s filesystem to a tarball with docker export
Saving an image to a tarball with docker save
Docker has a huge feature set that you will likely grow into over time, and each new release adds more functionality. We’ll get into a lot more detail later on many other commands and features.