Chapter 3. Fundamentals of Server Virtualization and Containers

Virtual servers and containers are two different options for running services and software. Virtualization technology set the stage for cloud computing, and containers further transformed the application infrastructure landscape. In this chapter, I’ll share the fundamentals that underpin architectural decisions, helping you build and deploy tested infrastructure in a repeatable and consistent manner.

Server Virtualization

Server virtualization is the process of creating an environment where multiple operating system instances can run on a single physical server. This environment is commonly called a virtual machine (VM). With VM technology, individuals can install and simultaneously run completely different operating systems on a single computer.

The key to server virtualization is the hypervisor that coordinates the low-level interactions between the VMs and host hardware.

Note

Hypervisors can be specialized hardware, firmware, or software. I’m not digging into the specifics here as there are a variety of specialized and comprehensive resources, depending on your area of focus.

For example, check out Brendan Gregg’s blog post “AWS EC2 Virtualization 2017: Introducing Nitro” for a deep dive into Nitro, the virtualization technology in use on Amazon Elastic Compute Cloud (Amazon EC2) as of 2019.

Each VM has a configuration file that specifies the virtual hardware resources allocated to it, including memory, CPU, and storage.

Popular desktop VM software includes:

Years ago, you would purchase hardware to host the services that were needed to power your organization’s requirements. In practical terms, virtualization gave you the ability to use more of the hardware resources that were idling on dedicated servers.

You might have decided to deploy different hardware for CPU- and memory-intensive database servers than for a popular but static web server. Having different hardware increases complexity in server deployments as well as in software configuration on the system. If you decided to standardize the hardware to simplify deployments and configuration, there would be underutilized resources on the web servers.

One way companies leveraged larger standardized hardware was to host different services on the same system, for example installing the web server on the same system as the database server. While this minimized network latency between services located on the same physical hardware, it created resource contention under load. Scaling multi-service systems became more complicated, with the potential for hot spots around overused services and cold spots with too little activity.

Note

Virtualization also gave sysadmins the ability to support legacy applications running on different operating systems, but I’m not focusing on that benefit in this book. Instead, I’m going to focus on how virtualization enabled cloud platforms.

With virtualization, you didn’t have to buy dedicated hardware for each service. The challenge was in implementing virtualization, which could be very complex depending on the software. This configuration complexity, along with companies lacking the resources for dedicated data centers or hardware, created demand for third-party hosted servers.

Hosted servers, or virtual private servers (VPS), were provided by hosting companies as a way to share large systems among several clients. VPS customers would request a specific server hardware configuration, operating system, and applications; the hosting service spun up the requested system.

VPS paved the way for Infrastructure as a Service (IaaS). Platform companies improved on the VPS model by making virtual systems programmatically available and adding networking and security functionality. Amazon initiated this revolution with Amazon EC2. Instead of just being allocated an individual VPS with limited operating system support, individuals specified the compute resources needed, chose a specific OS through an Amazon Machine Image (AMI), and networked resources together in public or private subnets.

Infrastructure as a Service (IaaS) offerings from providers leverage VM software including:

Each provider defines its own VM instance offerings, and new instance types are offered as the technology evolves. In addition to staying abreast of operating system features, it’s important to understand these instance offerings for any cloud provider in use.

Tip

HashiCorp Packer is an open source tool for creating identical machine images from a single configuration file for a variety of platforms, including Amazon EC2, Microsoft Azure Virtual Machines, Docker, and VirtualBox. This can help you build similar images in a repeatable fashion across providers as well as for local use.

Learn more about using Packer in the Infrastructure in Practice chapter.

Containers

Separate from virtualization, other technologies evolved to isolate applications and processes running on a system. Early Unix systems had the chroot system call, which changed the root directory of a process and all of its children, providing some process isolation. While originally created to separate testing from live instances, chroot seeded the technology for FreeBSD jails, which allow for more clear-cut separation of services.
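
As a rough illustration, here is a minimal Go sketch of chroot-based isolation. It assumes root privileges and a made-up directory, /srv/testroot, that already contains a minimal filesystem with /bin/sh; everything the confined shell touches resolves relative to that new root.

    package main

    // Minimal chroot sketch (Unix-like systems, requires root).
    // The directory /srv/testroot is a hypothetical example and must already
    // contain a minimal filesystem that includes /bin/sh.
    import (
        "os"
        "os/exec"
        "syscall"
    )

    func main() {
        if err := syscall.Chroot("/srv/testroot"); err != nil {
            panic(err)
        }
        // Change into the new root so relative paths resolve inside it.
        if err := os.Chdir("/"); err != nil {
            panic(err)
        }
        // The shell started here only sees files under /srv/testroot.
        cmd := exec.Command("/bin/sh")
        cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }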

Starting in 2000, sysadmins could partition a FreeBSD system into multiple instances called jails and assign a hostname and IP address per instance. FreeBSD jails allowed hosting companies to provide VPS for FreeBSD systems.

Resource-isolating containers became an active area of operating system research. In 2002, Linux kernel 2.4.19 introduced namespaces, which allow resources to be partitioned at the process level. Linux currently supports several namespace types, including user, network, process ID, and cgroup.

The user namespace isolates user identifiers, group identifiers, the root directory, and keys. This means that, effectively, a process can have elevated privileges within a namespace while remaining a normal unprivileged user outside of it.
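
To see this in action, here is a minimal Go sketch (Linux only, assuming the host allows unprivileged user namespaces) that maps the current unprivileged user to UID 0 inside a new user namespace; the id command run inside it reports root, even though nothing changes outside the namespace.

    package main

    // Sketch: run `id` inside a new user namespace where the current
    // unprivileged user is mapped to UID/GID 0. Linux only; assumes
    // unprivileged user namespaces are enabled on the host.
    import (
        "os"
        "os/exec"
        "syscall"
    )

    func main() {
        cmd := exec.Command("id") // reports uid=0 (root) inside the namespace
        cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUSER,
            UidMappings: []syscall.SysProcIDMap{
                {ContainerID: 0, HostID: os.Getuid(), Size: 1},
            },
            GidMappings: []syscall.SysProcIDMap{
                {ContainerID: 0, HostID: os.Getgid(), Size: 1},
            },
        }
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }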

The network namespace isolates resources associated with networking, including network devices, IP routing tables, and firewall rules. This means that processes in different network namespaces see different network interfaces, including separate loopback interfaces!

The PID namespace isolates process IDs, allowing processes in different PID namespaces to be assigned the same identifiers without conflict.
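
The following Go sketch exercises both the network and PID namespaces described above. It re-executes itself inside new PID, network, and user namespaces (the user namespace lets it run unprivileged, assuming the host allows that); inside, the child reports itself as PID 1 and sees only a loopback interface.

    package main

    // Sketch: re-exec this program inside new PID, network, and user
    // namespaces. Linux only; assumes unprivileged user namespaces are
    // enabled. The child sees itself as PID 1 and only a loopback interface.
    import (
        "fmt"
        "net"
        "os"
        "os/exec"
        "syscall"
    )

    func main() {
        if len(os.Args) > 1 && os.Args[1] == "child" {
            fmt.Println("pid inside namespaces:", os.Getpid()) // prints 1
            ifaces, _ := net.Interfaces()
            for _, iface := range ifaces {
                fmt.Println("interface inside namespaces:", iface.Name) // typically just lo
            }
            return
        }
        cmd := exec.Command("/proc/self/exe", "child")
        cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNET | syscall.CLONE_NEWUSER,
            UidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: os.Getuid(), Size: 1}},
            GidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: os.Getgid(), Size: 1}},
        }
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }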

The cgroup namespace isolates the cgroup root directories, which helps confine containerized processes. Process Containers were released in 2006 and subsequently renamed control groups (cgroups) due to the overuse of the term container. A control group is a group of processes whose CPU time, memory utilization, and network utilization can be limited, monitored, and modified at runtime.
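
Control groups are exposed as a filesystem, so limiting a group of processes comes down to writing to the right files. The Go sketch below assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup, root privileges, and the memory controller enabled for child groups; the group name demo is made up for illustration.

    package main

    // Sketch: create a cgroup v2 group, cap its memory at 256 MiB, and move
    // the current process into it. Assumes cgroup v2 at /sys/fs/cgroup, root
    // privileges, and the memory controller enabled; "demo" is a made-up name.
    import (
        "fmt"
        "os"
        "path/filepath"
    )

    func main() {
        cg := "/sys/fs/cgroup/demo"
        if err := os.MkdirAll(cg, 0o755); err != nil {
            panic(err)
        }
        // Limit the group to 256 MiB of memory.
        if err := os.WriteFile(filepath.Join(cg, "memory.max"), []byte("268435456"), 0o644); err != nil {
            panic(err)
        }
        // Move this process (and any children it spawns) into the group.
        pid := []byte(fmt.Sprintf("%d", os.Getpid()))
        if err := os.WriteFile(filepath.Join(cg, "cgroup.procs"), pid, 0o644); err != nil {
            panic(err)
        }
        fmt.Println("now running inside", cg)
    }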

Tip

Read Tejun Heo’s authoritative documentation on Control Group v2 for a deeper dive into cgroups.

In 2008, Linux Containers (LXC) was released as a way to manage containers by combining cgroups and namespaces. In 2013, Docker was released as an open source project that built on LXC with additional container management tools. In 2014, Docker replaced LXC with libcontainer to add functionality and increase stability when interfacing with namespaces and cgroups.

Defining Container

I’ve shared the evolution towards containers and standardization, but what is a container? A container is an isolated process with a portable runtime environment.

Tip

Read Julia Evans’s “What even is a container: namespaces and cgroups” for another way to look at containers.

Popular container technology includes:

Hosted container services that simplify some of the technical overhead of running containers in production include:

  • Amazon Elastic Container Service (ECS)

  • Amazon Elastic Kubernetes Service (EKS)

  • Azure Container Instances

  • Azure Kubernetes Service (AKS)

  • Google Kubernetes Engine

  • Red Hat OpenShift

Comparing Virtualization to Containerization

Where server virtualization generally abstracts at the hardware driver level, containers work at the operating system level. As with server virtualization, containers solve the problem of running an application reliably and consistently in an isolated environment with identical dependencies, libraries, binaries, and configurations.

The key difference between server virtualization and containerization is the isolation level. With virtualization, a single hypervisor may host multiple virtualized operating systems, each with its own independent kernel. With containers, a single Linux kernel may host multiple containerized applications.

Virtual machines are typically several gigabytes because they need to include a complete runnable operating system image rather than delegating this to the host platform. Containers, on the other hand, encapsulate only the application and its dependencies, which often amount to no more than a few megabytes.

Similarly, a VM’s boot time is comparable to a physical machine’s: the virtual hardware initializes and the kernel loads, which can take several minutes. Containers don’t require hardware initialization or kernel loading, so they can be launched and spun down in seconds as needed.

Containers are thus ideal for running many application instances on the same host operating system, while virtual machines are better suited for running a more heterogeneous mix of operating systems on shared host server hardware.

Wrapping Up

In this chapter, I examined virtualization and container fundamentals and their impact on cloud computing and the general application infrastructure landscape. You should now have a solid definition of virtualization and containers, as well as an understanding of the technology that makes up containers.

In the next chapter, Containers in Practice, I’ll revisit Docker as a tool, going beyond the basics introduced in the local development environment, and introduce Docker objects and architecture.
