Once you’ve shipped an application to production, there will come a day when it’s not working as expected. It’s always nice to know ahead of time what to expect when that day comes. Debugging a containerized application is not all that different from debugging a normal process on a system.
First, we’ll cover one of the easiest ways to see what’s going on inside your containers. By using the docker top
command, you can see the process list as your container understands it. It is also critical to understand that your application is not running in a separate system from the other Docker processes. They share a kernel, likely a filesystem, and depending on your container configuration, they may share network interfaces. That means you can get a lot of information about what your container is doing.
If you’re used to debugging applications in a virtual machine environment, you might think you would need to enter the container to inspect in detail an application’s memory or CPU use, or debug system calls. Not so! Despite feeling in many ways like a virtualization layer, processes in containers are just processes on the Docker host itself. If you want to see a process list across all of the Docker containers on a machine, you can just run ps
with your favorite command-line options right on the server, for example. Let’s look at some things you can do when debugging a containerized application.
Docker has a built-in command for showing what’s running inside a container: docker top <containerID>
. This is nice because it works even from remote hosts as it’s exposed over the Docker Remote API. This isn’t the only way to see what’s going on inside a container, but it’s the easiest to use. Let’s take a look at how that works here:
$ docker ps
CONTAINER ID   IMAGE         COMMAND     ...   NAMES
106ead0d55af   test:latest   /bin/bash   ...   clever_hypatia

$ docker top 106ead0d55af
UID        PID    PPID   C   STIME   TTY   TIME       CMD
root       4548   1033   0   13:29   ?     00:00:00   /bin/sh -c nginx
root       4592   4548   0   13:29   ?     00:00:00   nginx: master process nginx
www-data   4593   4592   0   13:29   ?     00:00:00   nginx: worker process
www-data   4594   4592   0   13:29   ?     00:00:00   nginx: worker process
www-data   4595   4592   0   13:29   ?     00:00:00   nginx: worker process
www-data   4596   4592   0   13:29   ?     00:00:00   nginx: worker process
We need to know the ID of our container, which we get from docker ps
. We then pass that to docker top
and get a nice listing of what’s running in our container, ordered by PID just as we’d expect from Linux ps
output.
Some oddities exist here, though. The primary one is the namespacing of user IDs and filesystems.
For example, a user might exist in a container’s /etc/passwd that does not exist on the host machine. In the case where that user is running a process in a container, the ps
output on the host machine will show a numeric ID rather than a user name. In some cases, two containers might have users squatting on the same numeric ID, or mapping to an ID that is a completely different user on the host system.
For example, if you had a production Docker server using CentOS 7 and ran the following commands, you would see that UID 7 is named halt
:
$ id 7
uid=7(halt) gid=0(root) groups=0(root)
Don’t read too much into the UID number we are using here. It was chosen simply because it is used by default on both platforms but represents a different username.
If you then enter the standard Ubuntu container on that Docker host, you will see that UID 7 is set to lp
in /etc/passwd. By running the following commands, you can see that the container has a completely different perspective of who UID 7 is:
$ docker run -ti ubuntu:latest /bin/bash
root@f86f8e528b92:/# grep x:7: /etc/passwd
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
root@f86f8e528b92:/# id lp
uid=7(lp) gid=7(lp) groups=7(lp)
root@409c2a8216b1:/# exit
If we then run ps ua on the Docker host while that container is running as UID 7 (-u 7), we will see that the host shows the container's process as being run by halt instead of lp:
$ docker run -d -u 7 ubuntu:latest sleep 1000
5525...06c6

$ ps ua | grep sleep
1185 halt     sleep 1000
1192 root     grep sleep
This could be particularly confusing if a well-known user like nagios
or postgres
were configured on the host system but not in the container, yet the container ran its process with the same ID. This namespacing can make the ps
output look quite strange. It might, for example, look like the nagios
user on your Docker host is running the postgresql
daemon that was launched inside a container, if you don’t pay close attention.
One solution to this is to dedicate a nonzero UID to your containers. On your Docker hosts, you can create a container
user as UID 5000 and then create the same user in your base container images. If you then run all your containers as UID 5000 (-u 5000
), you will not only improve the security of your system by not running container processes as UID 0, but you will also make the ps
output on the Docker host easier to decipher by displaying the container
user for all of your running container processes.
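A minimal sketch of that approach might look like the following. The user name container and the useradd flags are illustrative, and you would bake the same user into your base images rather than into every Dockerfile:

$ sudo useradd -u 5000 -M -s /usr/sbin/nologin container    # on each Docker host

# In your base image's Dockerfile:
RUN useradd -u 5000 -M -s /usr/sbin/nologin container

# Then launch containers as that UID:
$ docker run -d -u 5000 ubuntu:latest sleep 1000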
Likewise, because the process has a different view of the filesystem, paths that are shown in the ps
output are relative to the container and not the host. In these cases, knowing it is in a container is a big win.
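If you need to see a file the way the container sees it without entering the container, one trick that works on most Linux hosts is to go through /proc, since the kernel exposes each process's root and current working directory there. The PID below is the host-side PID from ps or docker top, and the paths are just examples:

$ docker top 106ead0d55af            # note the host PID, e.g., 4592
$ sudo ls /proc/4592/root/etc/nginx/
$ sudo readlink /proc/4592/cwd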
So that’s how you use the Docker tooling to look at what’s running in a container. But that’s not the only way, and in a debugging situation, it might not be the best way. If you hop onto a Docker server and run a normal Linux ps
to look at what’s running, you get a full list of everything containerized and not containerized just as if they were all equivalent processes. There are some ways to look at the process output to make things a lot clearer. Debugging can be facilitated by looking at the Linux ps
output in tree form so that you can see all of the processes descended from Docker. Here’s what that can look like using the BSD command-line flags. We’ll chop the output to just the part we care about:
$ ps axlfww
... /usr/bin/docker -d
...  \_ docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 6379 ...
...  \_ redis-server *:6379
...  \_ docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 27017 ...
...  \_ mongod
Many of the ps
commands in the preceding example work only on true Linux distributions. Boot2Docker is based on Tiny Core Linux
, which uses busybox
and provides a stripped-down ps
command.
Here you can see that we’re running one Docker daemon and two instances of the docker-proxy
, which we will discuss in more detail in “Network Inspection”. Everything else under those processes represents Docker containers. In this example, we have two containers. They show up as top-level processes under docker
. In this case, we are running one Redis server in a container, and one MongoDB server in another container. Each container has a related docker-proxy
process that is used to map the required network ports between the container and the host Docker server. It’s pretty clear how they are related to each other, and we know they’re running in a container because they are in docker
’s process tree. If you’re a bigger fan of Unix SysV command-line flags, you can get a similar, but not as nice looking, tree output with ps -ejH
:
$ ps -ejH
40643 ...   docker
43689 ...     docker
43697 ...       docker
43702 ...         start
43716 ...           java
46970 ...     docker
46976 ...       supervisord
46990 ...         supervisor_remo
46991 ...         supervisor_stdo
46992 ...         nginx
47030 ...           nginx
47031 ...           nginx
47032 ...           nginx
47033 ...           nginx
46993 ...         ruby
47041 ...           ruby
47044 ...           ruby
You can get a more concise view of the docker
process tree by using the pstree
command. Here, we’ll use pidof
to scope it to the tree belonging to docker
:
$ pstree `pidof docker`
docker─┬─2*[docker───6*[{docker}]]
       ├─mongod───10*[{mongod}]
       ├─redis-server───2*[{redis-server}]
       └─18*[{docker}]
This doesn’t show us PIDs and therefore is only useful for getting a sense of how things hang together in our containers. But this is pretty nice output when there are a lot of containers running on a host. It’s far more concise and provides a nice high-level map of how things connect. Here we can see the same containers that were shown in the ps
output above, but the tree is collapsed so we get multipliers like 10*
when there are 10 duplicate processes.
We can actually get a full tree with PIDs if we run pstree
, as shown here:
$ pstree -p `pidof docker`
docker(4086)─┬─docker(6529)─┬─{docker}(6530)
             │              ├─...
             │              └─{docker}(6535)
             ├─...
             ├─mongod(6675)─┬─{mongod}(6737)
             │              ├─...
             │              └─{mongod}(6756)
             ├─redis-server(6537)─┬─{redis-server}(6576)
             │                    └─{redis-server}(6577)
             ├─{docker}(4089)
             ├─...
             └─{docker}(6738)
This output provides us with a very good look at all the processes attached to Docker and what they are running. It is, however, difficult to see the docker-proxy
in this output, since it is really just another forked docker
process.
If you’re logged in to the Docker server, you can inspect running processes in many of the same ways that you would on the host. Common debugging tools like strace
work as expected. In the following code, we'll inspect a unicorn
process running inside a Ruby webapp container:
$ strace -p 31292
Process 31292 attached - interrupt to quit
select(11, [10], NULL, [7 8], {30, 103848}) = 1 (in [10], left {29, 176592})
fcntl(10, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
accept4(10, 0x7fff25c17b40, [128], SOCK_CLOEXEC) = -1 EAGAIN (...)
getppid() = 17
select(11, [10], NULL, [7 8], {45, 0}) = 1 (in [10], left {44, 762499})
fcntl(10, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
accept4(10, 0x7fff25c17b40, [128], SOCK_CLOEXEC) = -1 EAGAIN (...)
getppid() = 17
You can see that we get the same output that we would from noncontainerized processes on the host. Likewise, an lsof
shows us that the files and sockets that a process has open work as expected:
$ lsof -p 31292
COMMAND ... NAME
ruby    ... /data/app
ruby    ... /
ruby    ... /usr/local/rbenv/versions/2.1.1/bin/ruby
ruby    ... /usr/.../iso_8859_1.so (stat: No such file or directory)
ruby    ... /usr/.../fiber.so (stat: No such file or directory)
ruby    ... /usr/.../cparse.so (stat: No such file or directory)
ruby    ... /usr/.../libsasl2.so.2.0.23 (path dev=253,0, inode=1443531)
ruby    ... /lib64/libnspr4.so (path dev=253,0, inode=655717)
ruby    ... /lib64/libplc4.so (path dev=253,0, inode=655718)
ruby    ... /lib64/libplds4.so (path dev=253,0, inode=655719)
ruby    ... /usr/lib64/libnssutil3.so (path dev=253,0, inode=1443529)
ruby    ... /usr/lib64/libnss3.so (path dev=253,0, inode=1444999)
ruby    ... /usr/lib64/libsmime3.so (path dev=253,0, inode=1445001)
ruby    ... /usr/lib64/libssl3.so (path dev=253,0, inode=1445002)
ruby    ... /lib64/liblber-2.4.so.2.5.6 (path dev=253,0, inode=655816)
ruby    ... /lib64/libldap_r-2.4.so.2.5.6 (path dev=253,0, inode=655820)
Note that the paths to the files are all relative to the container’s view of the backing filesystem, which is not the same as the host view. Therefore, inspecting the version of the file on the host will not match the one the container sees. In this case, it’s probably best to enter the container to look at the files with the same view that the processes inside it have.
It’s possible to run the GNU debugger (gdb
) and other process inspection tools in the same manner as long as you’re root and have proper permissions to do so.
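As a quick sketch, attaching gdb to the same host PID looks just like it would for any other process on the server; the PID here is the one from the strace example above:

$ sudo gdb -p 31292
(gdb) info proc
(gdb) bt
(gdb) detach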
When you have a shell directly on the Docker server, you can treat containerized processes just like any other process running on the system. If you’re remote, you might send signals with docker kill
because it’s expedient. But if you’re already logged in to a Docker server for a debugging session or because the Docker daemon is not responding, you can just kill
away like you would normally. Note, however, that unless you kill the top-level process in the container, killing a process will not terminate the container itself. That might be desirable if you are killing a runaway process, but it might leave the container in an unexpected state if developers on remote systems expect all of the processes to be running whenever they can see their container in docker ps
.
These are just normal processes in many respects, and can be passed the whole array of Unix signals listed in the man page for the Linux kill
command. Many Unix programs will perform special actions when they receive certain predefined signals. For example, nginx will reopen its logs when receiving a SIGUSR1
signal. Using the Linux kill
command, it is possible to send any Unix signal to a container process on the local Docker server.
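For example, to make the nginx master process from our earlier docker top output reopen its logs, you could signal it directly on the host, or have Docker deliver the signal to the container's main process. The PID and container ID here are the illustrative ones from above:

$ sudo kill -USR1 4592                       # host PID of the nginx master
$ docker kill --signal=USR1 106ead0d55af     # signals the container's top-level process

Keep in mind that docker kill only signals the top-level process in the container, so if that process is a shell wrapper it may not forward the signal to the process you actually care about.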
We consider it to be a best practice to run some kind of process control in your production containers. Whether it be systemd
, upstart
, runit
, supervisor
, or your own homegrown tools, this allows you to treat containers atomically even when they contain more than one process. You want docker ps
to reflect the presence of the whole container and don’t want to worry if one of the processes inside it has died. If you can assume that the presence of a container and absence of error logs means that things are working, it allows you to treat docker ps
output as the truth about what’s happening on your Docker systems. Because containers ship as a single artifact, this tends to be how people think of them. But you should only run things that are logically the same application in a single container.
It is also a good idea to ensure that you understand the complete behavior of your preferred process control service, including memory or disk utilization, since this can impact your container’s performance.
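As a hypothetical sketch, a minimal supervisor configuration for a container that runs nginx in the foreground might look something like this; the paths and program name are assumptions rather than a recommendation of any particular layout:

; /etc/supervisor/supervisord.conf
[supervisord]
nodaemon=true            ; keep supervisord in the foreground so the container stays up

[program:nginx]
command=/usr/sbin/nginx -g "daemon off;"
autorestart=true

The container's CMD would then launch supervisord itself, so that docker ps continues to reflect the whole container rather than any single child process.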
Unlike process inspection, debugging containerized applications at the network level can be more complicated. Unless you are running Docker containers with the host networking option, which we will discuss in “Networking”, your containers will have their own IP addresses and therefore won’t show up in all netstat
output. Running netstat -an
on the Docker server, for example, works as expected, as shown here:
$ sudo netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address            Foreign Address  State
tcp        0      0 10.0.3.1:53              0.0.0.0:*        LISTEN
tcp        0      0 0.0.0.0:22               0.0.0.0:*        LISTEN
tcp6       0      0 :::23235                 :::*             LISTEN
tcp6       0      0 :::2375                  :::*             LISTEN
tcp6       0      0 :::4243                  :::*             LISTEN
tcp6       0      0 fe80::389a:46ff:fe92:53  :::*             LISTEN
tcp6       0      0 :::22                    :::*             LISTEN
udp        0      0 10.0.3.1:53              0.0.0.0:*
udp        0      0 0.0.0.0:67               0.0.0.0:*
udp        0      0 0.0.0.0:68               0.0.0.0:*
udp6       0      0 fe80::389a:46ff:fe92:53  :::*
Here we can see all of the interfaces that we’re listening on. Our container is bound to port 23235 on IP address 0.0.0.0. That shows up. But what happens when we ask netstat
to show us the process name that’s bound to the port?
$ netstat -anp
Active Internet connections (servers and established)
Proto ... Local Address            Foreign Address  State   PID/Program name
tcp   ... 10.0.3.1:53              0.0.0.0:*        LISTEN  23861/dnsmasq
tcp   ... 0.0.0.0:22               0.0.0.0:*        LISTEN  902/sshd
tcp6  ... :::23235                 :::*             LISTEN  24053/docker-proxy
tcp6  ... :::2375                  :::*             LISTEN  954/docker
tcp6  ... :::4243                  :::*             LISTEN  954/docker
tcp6  ... fe80::389a:46ff:fe92:53  :::*             LISTEN  23861/dnsmasq
tcp6  ... :::22                    :::*             LISTEN  902/sshd
udp   ... 10.0.3.1:53              0.0.0.0:*                23861/dnsmasq
udp   ... 0.0.0.0:67               0.0.0.0:*                23861/dnsmasq
udp   ... 0.0.0.0:68               0.0.0.0:*                880/dhclient3
udp6  ... fe80::389a:46ff:fe92:53  :::*                     23861/dnsmasq
We see the same output, but notice what is bound to the port: docker-proxy
. That’s because Docker actually has a proxy, written in Go, that sits between all of the containers and the outside world. That means that when we look at the netstat output, we see only docker-proxy
and that masks which container this is bound to. Luckily, docker ps
shows us which containers are bound to which ports, so this isn’t a big deal. But it’s not necessarily expected, and you probably want to be aware of it before you’re debugging a production failure.
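If you would rather not scan the docker ps output by eye, docker port will tell you exactly what a given container has published; the container ID and the mapping shown here are illustrative:

$ docker port 106ead0d55af
80/tcp -> 0.0.0.0:23235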
If you’re using host networking in your container, then this layer is skipped. There is no docker-proxy
, and the process in the container can bind to the port directly.
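A brief example of that difference: with host networking there is no proxy in the path, so netstat -anp on the host shows the real program name bound to the port. The image and PID here are illustrative:

$ docker run -d --net=host redis:latest
$ sudo netstat -anp | grep 6379
tcp   ... 0.0.0.0:6379   0.0.0.0:*   LISTEN   24103/redis-server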
Other network inspection commands work as expected, including tcpdump
, but it’s important to remember that docker-proxy
is there, in between the host’s network interface and the container.
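For example, to watch a bridged container's Redis traffic from the host, you could point tcpdump at the docker0 bridge rather than at the public interface; the port is just the one from our earlier examples, and this assumes the default bridged networking setup:

$ sudo tcpdump -i docker0 -n port 6379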
When you’re building and deploying a single container, it’s easy to keep track of where it came from and what images it’s sitting on top of. But this rapidly becomes unmanageable when you’re shipping many containers with images that are built and maintained by different teams. How can you tell what images are actually underneath the one your container is running on? docker history
does just that. You can see the image IDs that are layered into the image and the sizes and commands that were used to build them:
$ docker history centurion-test:latest
IMAGE         CREATED        CREATED BY                              SIZE
ec64a324e9cc  7 months ago   /bin/sh -c #(nop) CMD [/bin/sh -c ngi   0 B
f38017917da1  7 months ago   /bin/sh -c #(nop) EXPOSE map[80/tcp:{   0 B
df0d88d6811a  7 months ago   /bin/sh -c #(nop) ADD dir:617ceac0be1   20.52 kB
b00af4e7a358  11 months ago  /bin/sh -c #(nop) ADD file:76c644211a   518 B
2d4b732ca5cf  11 months ago  /bin/sh -c #(nop) ADD file:7b7ef6cc04   239 B
b6f49406bcf0  11 months ago  /bin/sh -c echo "HTML is working" > /   16 B
f384626619d9  11 months ago  /bin/sh -c mkdir /srv/www               0 B
5c29c073d362  11 months ago  /bin/sh -c apt-get -y install nginx     16.7 MB
d08d285012c8  11 months ago  /bin/sh -c apt-get -y install python-   42.54 MB
340b0525d10f  11 months ago  /bin/sh -c apt-get update               74.51 MB
8e2b3cf3ca53  12 months ago  /bin/bash                               1.384 kB
24ba2ee5d982  13 months ago  /bin/sh -c #(nop) ADD saucy.tar.xz in   144.6 MB
cc7385a89304  13 months ago  /bin/sh -c #(nop) MAINTAINER Tianon G   0 B
511136ea3c5a  19 months ago                                          0 B
This can be useful, for example, when determining whether a container that is having a problem was actually built on top of the right base image. Perhaps a bug fix was applied and the particular container in question didn’t get it because it was still based on the previous base image. Unfortunately, the ADD
commands show a hash rather than the actual files, but they do show whether it was a directory or a file that was added, which can help you determine which statement in the Dockerfile
is being referred to.
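One way to tie a running container back to its image layers is to ask docker inspect for the image it was created from and then feed that ID to docker history; the container ID is the one from our earlier examples, and the image ID shown is illustrative:

$ docker inspect --format '{{.Image}}' 106ead0d55af
ec64a324e9cc...
$ docker history ec64a324e9cc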
In Chapter 4, we showed you how to read the docker inspect
output to see how a container is configured. But underneath that is a directory on the host’s disk that is dedicated to the container. Usually this is in /var/lib/docker/containers. If you look at that directory, it contains very long SHA hashes, as shown here:
$ ls /var/lib/docker/containers
106ead0d55af55bd803334090664e4bc821c76dadf231e1aab7798d1baa19121
28970c706db0f69716af43527ed926acbd82581e1cef5e4e6ff152fce1b79972
3c4f916619a5dfc420396d823b42e8bd30a2f94ab5b0f42f052357a68a67309b
589f2ad301381b7704c9cade7da6b34046ef69ebe3d6929b9bc24785d7488287
959db1611d632dc27a86efcb66f1c6268d948d6f22e81e2a22a57610b5070b4d
a1e15f197ea0996d31f69c332f2b14e18b727e53735133a230d54657ac6aa5dd
bad35aac3f503121abf0e543e697fcade78f0d30124778915764d85fb10303a7
bc8c72c965ebca7db9a2b816188773a5864aa381b81c3073b9d3e52e977c55ba
daa75fb108a33793a3f8fcef7ba65589e124af66bc52c4a070f645fffbbc498e
e2ac800b58c4c72e240b90068402b7d4734a7dd03402ee2bce3248cc6f44d676
e8085ebc102b5f51c13cc5c257acb2274e7f8d1645af7baad0cb6fe8eef36e24
f8e46faa3303d93fc424e289d09b4ffba1fc7782b9878456e0fe11f1f6814e4b
That’s a bit daunting. But those are just the container IDs in long form. If you want to look at the configuration for a particular container, you just need to use docker ps
to find its short ID, and then find the directory that matches:
$ docker ps
CONTAINER ID   IMAGE                             COMMAND              ...
106ead0d55af   kmatthias/centurion-test:latest   "/bin/sh -c nginx"   ...
You can look at the short ID from docker ps
, then match it to the ls /var/lib/docker
output to see that you want the directory beginning with 106ead0d55af
. If you need exact matching, you can do a docker inspect 106ead0d55af
and grab the long ID from the output. As we discussed in Chapter 5, this directory contains some files that are bind-mounted directly into your container, like hosts
:
$ cd /var/lib/docker/containers/106ead0d55af55bd803334090664e4bc821c76dadf231e1aab7798d1baa19121
$ ls -la
total 32
drwx------  2 root root  4096 Jun 23  2014 .
drwx------ 14 root root 12288 Jan  9 11:33 ..
-rw-------  1 root root     0 Jun 23  2014 106ead0d55a...baa19121-json.log
-rw-r--r--  1 root root  1642 Jan 23 14:36 config.json
-rw-r--r--  1 root root   350 Jan 23 14:36 hostconfig.json
-rw-r--r--  1 root root     8 Jan 23 14:36 hostname
-rw-r--r--  1 root root   169 Jan 23 14:36 hosts
This directory is also where Docker stores the JSON file containing the log that is shown with the docker logs
command, the JSON configuration that backs the docker inspect
output (config.json), and the networking configuration for the container (hostconfig.json).
Even if we’re not able to enter the container, or if docker
is not responding, we can look at how the container was configured. It’s also pretty useful to understand what’s backing that mechanism inside the container. Keep in mind that it’s not a good idea to modify these files. Docker expects them to contain reality, and if you alter that reality, you’re asking for trouble. But it’s another avenue for information on what’s happening in your container.
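For instance, even with the daemon wedged, you can pretty-print the stored configuration straight from disk. Piping the file through Python's json.tool is just one convenient way to make the single-line JSON readable:

$ sudo cat /var/lib/docker/containers/106ead0d55af55bd803334090664e4bc821c76dadf231e1aab7798d1baa19121/config.json | python -m json.tool | less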
Docker, regardless of the backend actually in use, has a layered filesystem that allows it to track the changes in any given container. This is how the images are actually assembled when you do a build, but it is also useful when trying to figure out if a Docker container has changed anything, and if so, what. As with most of the core tools, this is built into the docker
command-line tooling and is also exposed via the API. Let’s take a look at what this shows us in Example 8-1. We’ll assume that we already have the ID of the container we’re concerned with.
$ sudo docker diff 89b8e19707df
C /var/log/redis
A /var/log/redis/redis.log
C /var/run
A /var/run/cron.reboot
A /var/run/crond.pid
C /var/lib/logrotate.status
C /var/lib/redis
A /var/lib/redis/dump.rdb
C /var/spool/cron
A /var/spool/cron/root
Each line begins with either A or C, which are just shorthand for added or changed. We can see that this container is running redis
, that the redis log is being written to, and that someone or something has been changing the crontab
for root. Logging to the local filesystem is not a good idea, especially for anything with high-volume logs. Being able to find out what is writing to your Docker filesystem can really help you understand where things are filling up, or give you a preview of what would be added if you were to build an image from it.
Further detailed inspection requires jumping into the container with docker exec
or nsenter
and the like in order to see what is exactly in the filesystem. But docker diff
gives you a good place to start.
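For example, once docker diff points you at /var/log/redis, a quick follow-up from the host might look like this, using the same container ID as above:

$ docker exec 89b8e19707df ls -la /var/log/redis
$ docker exec 89b8e19707df tail -n 5 /var/log/redis/redis.log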