© Senthil Kumaran S. 2017

Senthil Kumaran S., Practical LXC and LXD, https://doi.org/10.1007/978-1-4842-3024-4_1

1. Introduction to Linux Containers

Senthil Kumaran S.

(1) Chennai, Tamil Nadu, India

Computer science is a field that evolves at a very fast pace, requiring everyone in the industry to keep up to date with the latest technological advancements. In recent history, the industry has welcomed new technologies such as distributed computing, parallel processing, virtualization, cloud computing, and, most recently, the Internet of Things (IoT). Each technology paves the way for the next and helps build a strong foundation for the others; virtualization, for example, laid the groundwork for cloud computing. Ever since computers were invented, it has been common practice to maximize resource utilization, whether through time sharing, multitasking, or the more recent virtualization trends.

Since the early 1990s when the Linux kernel came into existence, many operating system distributions have evolved around the Linux kernel. However, until recently, GNU/Linux was used extensively only by advanced users with the skills to configure and maintain it. That has changed with the introduction of user-friendly interfaces by several GNU/Linux vendors, and now GNU/Linux is more widely adopted by consumer users on desktops, laptops, and so forth. With the advent of Linux kernel–powered Android phones, use of Linux has become ubiquitous among a much larger audience.

Containerization is the next logical step in virtualization, and there is a huge buzz around this technology. Containers can provide virtualization at both the operating system level and the application level. This book focuses on the use of containerization technology with the Linux kernel.

Some of the possibilities with containers are as follows:

  • Provide a complete operating system environment that is sandboxed (isolated)

  • Allow packaging and isolation of applications with their entire runtime environment

  • Provide a portable and lightweight environment

  • Help to maximize resource utilization in data centers

  • Aid different development, test, and production deployment workflows

Container Definition

A container can be defined as a single operating system image, bundling a set of isolated applications and their dependent resources so that they run isolated from the host machine. Multiple such containers may run within the same host machine.

Containers can be classified into two types:

  • Operating system level: An entire operating system runs in an isolated space within the host machine, sharing the same kernel as the host machine.

  • Application level: An application or service, and the minimal processes required by that application, runs in an isolated space within the host machine.

Containerization differs from traditional virtualization technologies and offers many advantages over them:

  • Containers are lightweight compared to traditional virtual machines.

  • Unlike containers, virtual machines require emulation layers (either software or hardware), which consume more resources and add overhead.

  • Containers share resources with the underlying host machine, with user-space and process isolation.

  • Due to the lightweight nature of containers, more containers can be run per host than virtual machines per host.

  • Starting a container happens nearly instantly compared to the slower boot process of virtual machines.

  • Containers are portable and can reliably regenerate a system environment with required software packages, irrespective of the underlying host operating system.

Figure 1-1 illustrates the differences in how virtual machines, Linux Containers (LXC) or operating system–level containers, and application-level containers are organized within a host operating system.

Figure 1-1. Comparing virtual machines, LXC or OS-level containers, and application-level containers

Container History

Virtualization was developed as an effort to fully utilize available computing resources. Virtualization enables multiple virtual machines to run on a single host for different purposes, each with its own isolated space. It achieves such isolated operating system environments using hypervisors: computer software that sits between the host operating system and the guest (virtual machine) operating system. As mentioned in the introduction, containerization is the next logical step in virtualization, providing virtualization at both the operating system level and the application level.

Container technology has existed for a long time in different forms, but it has significantly gained popularity recently in the Linux world with the introduction of native containerization support in the Linux kernel. Table 1-1 lists some of the earlier related techniques that have led us to the current state of the art.

Table 1-1. Container Technology Timeline

Year   Technology                   First Introduced in OS
1982   Chroot                       Unix-like operating systems
2000   Jail                         FreeBSD
2000   Virtuozzo containers         Linux, Windows (Parallels Inc. version)
2001   Linux VServer                Linux, Windows
2004   Solaris containers (zones)   Sun Solaris, OpenSolaris
2005   OpenVZ                       Linux (open source version of Virtuozzo)
2008   LXC                          Linux
2013   Docker                       Linux, FreeBSD, Windows

Note

Some technologies covered in Table 1-1 may be supported on more operating systems than those listed. Most of them are available on various Unix-like operating systems, including Linux.

Some container technologies listed in Table 1-1 have a very specific purpose, such as chroot, which provides filesystem isolation by switching the root directory for running processes and their children. Other technologies listed provide complete operating system–level virtualization, such as Solaris containers (zones) and LXC.
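As a quick illustration of the oldest technique in the table, chroot alone can build a primitive isolated filesystem environment. The following is a minimal sketch, assuming a statically linked busybox binary is available on the host; the /tmp/newroot path is an arbitrary choice:

# mkdir -p /tmp/newroot/bin
# cp /bin/busybox /tmp/newroot/bin/
# ln -s busybox /tmp/newroot/bin/sh
# ln -s busybox /tmp/newroot/bin/ls
# chroot /tmp/newroot /bin/sh
# ls /
bin

Processes inside the chroot see /tmp/newroot as their root directory, but they still share the process table, network, and users with the host, which is why chroot provides only filesystem isolation.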

Common modern-day containers descend from LXC, which was first introduced in 2008. LXC was made possible by key features added to the Linux kernel starting with the 2.6.24 release, as described in the next section.

Features to Enable Containers

Containers rely on the following Linux kernel features to provide a contained or isolated area within the host machine, one that closely resembles a virtual machine but requires no hypervisor:

  • Control groups (cgroups)

  • Namespaces

  • Filesystem or rootfs

Control Groups (Cgroups)

To understand the importance of cgroups, consider a common scenario: a process running on a system requests certain resources, but they are currently unavailable, so the system defers the process until they are freed by other processes. This delays process execution, which may not be acceptable in many applications. Resource starvation of this kind can also occur when a malicious process consumes all or most of the resources available on a system, preventing other processes from executing.

Google presented a new generic method to solve the resource control problem with the cgroups project in 2007. Control groups allow resources to be controlled and accounted for based on process groups. The mainline Linux kernel first included a cgroups implementation in 2008, and this paved the way for LXC.

Cgroups provide a mechanism to aggregate sets of tasks or processes and their future children into hierarchical groups. These groups may be configured to have specialized behavior as desired.

Listing Cgroups

Cgroups are exposed through a pseudo filesystem mounted at /sys/fs/cgroup, which gives an overview of all the cgroup subsystems available or mounted on the currently running system:

stylesen@harshu:~$ ls -alh /sys/fs/cgroup
total 0
drwxr-xr-x 12 root root 320 Mar 24 20:40 .
drwxr-xr-x  8 root root   0 Mar 24 20:40 ..
dr-xr-xr-x  6 root root   0 Mar 24 20:40 blkio
lrwxrwxrwx  1 root root  11 Mar 24 20:40 cpu -> cpu,cpuacct
lrwxrwxrwx  1 root root  11 Mar 24 20:40 cpuacct -> cpu,cpuacct
dr-xr-xr-x  6 root root   0 Mar 24 20:40 cpu,cpuacct
dr-xr-xr-x  3 root root   0 Mar 24 20:40 cpuset
dr-xr-xr-x  6 root root   0 Mar 24 20:40 devices
dr-xr-xr-x  4 root root   0 Mar 24 20:40 freezer
dr-xr-xr-x  7 root root   0 Mar 24 20:40 memory
lrwxrwxrwx  1 root root  16 Mar 24 20:40 net_cls -> net_cls,net_prio
dr-xr-xr-x  3 root root   0 Mar 24 20:40 net_cls,net_prio
lrwxrwxrwx  1 root root  16 Mar 24 20:40 net_prio -> net_cls,net_prio
dr-xr-xr-x  3 root root   0 Mar 24 20:40 perf_event
dr-xr-xr-x  6 root root   0 Mar 24 20:40 pids
dr-xr-xr-x  7 root root   0 Mar 24 20:40 systemd

Memory Subsystem Hierarchy

Let’s take a look at an example of the memory subsystem hierarchy of cgroups. It is available in the following location:

/sys/fs/cgroup/memory

The memory subsystem hierarchy consists of the following files:

root@harshu:/sys/fs/cgroup/memory# ls
cgroup.clone_children                    memory.memsw.failcnt
cgroup.event_control                     memory.memsw.limit_in_bytes
cgroup.procs                             memory.memsw.max_usage_in_bytes
cgroup.sane_behavior                     memory.memsw.usage_in_bytes
init.scope                               memory.move_charge_at_immigrate
lxc                                      memory.numa_stat
memory.failcnt                           memory.oom_control
memory.force_empty                       memory.pressure_level
memory.kmem.failcnt                      memory.soft_limit_in_bytes
memory.kmem.limit_in_bytes               memory.stat
memory.kmem.max_usage_in_bytes           memory.swappiness
memory.kmem.slabinfo                     memory.usage_in_bytes
memory.kmem.tcp.failcnt                  memory.use_hierarchy
memory.kmem.tcp.limit_in_bytes           notify_on_release
memory.kmem.tcp.max_usage_in_bytes       release_agent
memory.kmem.tcp.usage_in_bytes           system.slice
memory.kmem.usage_in_bytes               tasks
memory.limit_in_bytes                    user
memory.max_usage_in_bytes                user.slice
root@harshu:/sys/fs/cgroup/memory#

Each file listed contains information about the control group in which it was created. For example, the maximum memory usage recorded (in bytes) is given by the following command (since this is the top-level hierarchy, it shows the value for the host system as a whole):

root@harshu:/sys/fs/cgroup/memory# cat memory.max_usage_in_bytes
15973715968

The preceding value is in bytes; it corresponds to approximately 14.8GB, the maximum memory usage recorded for the currently running system. You can create your own cgroups within /sys/fs/cgroup and control each of the subsystems, as the following sketch shows.
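The commands below create a new group named demo under the memory hierarchy shown earlier and place the current shell in it; the group name and the 256MB limit are arbitrary choices for illustration, and the cgroup v1 layout from the preceding listings is assumed:

# cd /sys/fs/cgroup/memory
# mkdir demo
# echo 268435456 > demo/memory.limit_in_bytes
# echo $$ > demo/tasks
# cat demo/memory.limit_in_bytes
268435456

The kernel automatically populates the new demo directory with the same control files shown earlier. From this point on, the shell and every process it spawns is limited to 256MB of memory in this group; once all tasks have left the group, it can be removed with rmdir demo.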

Namespaces

At the Ottawa Linux Symposium held in 2006, Eric W. Biederman presented his paper "Multiple Instances of the Global Linux Namespaces" (available at https://www.kernel.org/doc/ols/2006/ols2006v1-pages-101-112.pdf). This paper proposed the addition of ten namespaces to the Linux kernel, inspired by the existing filesystem namespace for mounts, which was introduced in 2002. The proposed namespaces are as follows:

  • The Filesystem Mount Namespace (mnt)

  • The UTS Namespace

  • The IPC Namespace (ipc)

  • The Network Namespace (net)

  • The Process ID Namespace (pid)

  • The User and Group ID Namespace

  • Security Modules and Namespaces

  • The Security Keys Namespace

  • The Device Namespace

  • The Time Namespace

A namespace provides an abstraction to a global system resource that will appear to the processes within the defined namespace as its own isolated instance of a specific global resource. Namespaces are used to implement containers; they provide the required isolation between a container and the host system.

Over time, different namespaces have been implemented in the Linux kernel. As of this writing, seven namespaces are implemented, as listed in Table 1-2. A quick demonstration follows the table.

Table 1-2. Existing Linux Namespaces

Namespace   Constant          Isolates
Cgroup      CLONE_NEWCGROUP   Cgroup root directory
IPC         CLONE_NEWIPC      System V IPC, POSIX message queues
Network     CLONE_NEWNET      Network devices, stacks, ports, etc.
Mount       CLONE_NEWNS       Mount points
PID         CLONE_NEWPID      Process IDs
User        CLONE_NEWUSER     User and group IDs
UTS         CLONE_NEWUTS      Hostname and NIS domain name
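As a quick sketch of namespace isolation, the unshare(1) utility from util-linux creates a new namespace and runs a program inside it. Here it places a shell in a new UTS namespace; the hostname container-demo is an arbitrary example, and root privileges are assumed:

# hostname
harshu
# unshare --uts /bin/bash
# hostname container-demo
# hostname
container-demo
# exit
# hostname
harshu

The hostname change is visible only within the new UTS namespace; the host retains its original hostname.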

Let’s examine how namespaces work with the help of a simple example using the network namespace.

Simple Network Namespace

Namespaces are created by passing the appropriate clone flags to the clone() system call. The network namespace has a command-line interface (ip netns) that can be used to illustrate a simple network namespace, as follows:

Note

Root privileges are required to create a network namespace.

  1. Create a network namespace called stylesen-net:

    # ip netns add stylesen-net
  2. To list all devices present in the newly created network namespace, issue the following command. This example shows the default loopback device.

    # ip netns exec stylesen-net ip link list
    1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  3. Try to ping the loopback device:

    # ip netns exec stylesen-net ping 127.0.0.1
    connect: Network is unreachable
  4. Though the loopback device is available, it is not up yet. Bring up the loopback device and try pinging it again:

    # ip netns exec stylesen-net ip link set dev lo up
    # ip netns exec stylesen-net ping 127.0.0.1
    PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
    64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.045 ms
    64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.059 ms
    64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.097 ms
    64 bytes from 127.0.0.1: icmp_seq=4 ttl=64 time=0.084 ms
    64 bytes from 127.0.0.1: icmp_seq=5 ttl=64 time=0.095 ms
    ^C
    --- 127.0.0.1 ping statistics ---
    5 packets transmitted, 5 received, 0% packet loss, time 4082ms
    rtt min/avg/max/mdev = 0.045/0.076/0.097/0.020 ms

Thus, we can create network namespaces and add devices to them. Any number of network namespaces can be created, and different network configurations can be set up between the devices in these individual namespaces, as the sketch below shows.
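For example, a virtual Ethernet (veth) pair can connect the stylesen-net namespace created earlier to the host: one end of the pair stays in the host namespace and the other end is moved into the new namespace. This is a minimal sketch; the interface names veth0/veth1 and the 10.0.0.0/24 addresses are arbitrary choices:

# ip link add veth0 type veth peer name veth1
# ip link set veth1 netns stylesen-net
# ip addr add 10.0.0.1/24 dev veth0
# ip link set veth0 up
# ip netns exec stylesen-net ip addr add 10.0.0.2/24 dev veth1
# ip netns exec stylesen-net ip link set veth1 up
# ping -c 2 10.0.0.2

After these steps, the host can reach 10.0.0.2 inside the namespace and vice versa, which is essentially how container network interfaces are wired up.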

Filesystem or rootfs

The next component needed for a container is the disk image, which provides the root filesystem (rootfs) for the container. The rootfs consists of a set of files, similar in structure to the filesystem mounted at root on any GNU/Linux-based machine. A rootfs is smaller than a typical OS disk image since it does not contain a kernel: the container shares the kernel of the host machine.

A rootfs can be reduced further in size by making it contain just the application and configuring it to share the rest of the rootfs with the host machine. Using copy-on-write (COW) techniques, a single reduced read-only disk image may be shared between multiple containers.
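As one example of obtaining such a rootfs on a Debian-based host, the debootstrap tool can populate a directory with a minimal Debian system. This is a sketch assuming debootstrap is installed; the /tmp/rootfs target path and the stable suite are arbitrary choices:

# debootstrap --variant=minbase stable /tmp/rootfs
# chroot /tmp/rootfs /bin/ls /

Note that no kernel is installed into /tmp/rootfs; a container built from it runs on the host's kernel, which is what keeps a rootfs smaller than a full virtual machine image.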

Summary

This chapter introduced you to the world of container technology with a comparison of containers to traditional virtualization technologies that use virtual machines. You also saw a brief history of container technology and the important Linux kernel features that were introduced to underpin modern container technologies. The chapter wrapped up with an overview of the three basic features (cgroups, namespaces, and rootfs) that enable containerization.
