© Anthony E. Nocentino, Ben Weissman 2021
A. E. Nocentino, B. Weissman, SQL Server on Kubernetes, https://doi.org/10.1007/978-1-4842-7192-6_2

2. Container Fundamentals

Anthony E. Nocentino (Oxford, MS, USA) and Ben Weissman (Nürnberg, Bayern, Germany)
 

Containers are changing the way applications are deployed. In this chapter, we will begin with the benefits of container-based application deployment and lay a solid technical foundation of container fundamentals, introducing operations such as creating and running containers and persisting data. The chapter will close with the need for container orchestrators and introduce Kubernetes and its benefits. The goal of this chapter is that, even if you have never seen a container before, you will become proficient in container basics before moving on to container orchestration with Kubernetes.

Container-Based Application Deployment

A container is a form of operating system virtualization. For years now, database professionals have been familiar with the concepts of machine virtualization, where the hardware resources of our physical servers, the CPU, memory, and disk, are multiplexed across multiple operating systems. In containers, it is the underlying operating system, its kernel, and its resources that are multiplexed, or shared, by the applications running on that system. Each container thinks it is the only process running on the operating system. The operating system, in turn, controls access to the underlying hardware as normal. We will explore this isolation concept in more detail shortly. The software that has the responsibility of coordinating this work with the underlying operating system is called a container runtime.

A container is a running container image. A container image contains the binaries, libraries, and file system components to run our application. So, when the container starts up, it begins executing the executable defined inside it, which then has access to the resources of the operating system for creating additional processes, performing disk or network I/O, and so on. Figure 2-1 shows the relationship between a container and its application.
Figure 2-1

A containerized application

Conventionally, there is only one application inside a container, because the application is the unit of work and also our unit of scaling. When a container starts, it will begin executing a defined application inside it. There are scenarios where you can put multiple applications inside containers if there is a very tight relationship between those applications, for example, an application server and a metrics data collector.

Containers provide isolation. A process running inside a container cannot see any other processes running on the operating system or even processes running inside other containers. This concept is key to the portability, the usability, and frankly the success of containers.

Containers also can tie specific libraries to an application, helping you solve application library conflicts. Have you ever had an application that needed to be installed on a dedicated server because it required a specific version of a DLL or library and that version conflicted with another version perhaps supporting a different application? Containers can save you from having to do that. If a container has the required libraries available inside the container image, when loaded, they are isolated to that running container. Additional containers can be started up with potentially conflicting libraries, and those container-based applications will happily run in isolation of each other.

The isolation containers provide also makes upgrades safer. You can upgrade a library inside a container without impacting other applications running in containers on your system. In Figure 2-2, you can see multiple application containers running on a physical or virtual machine sharing the base operating system. These containers’ executions are completely isolated from each other. If they need to communicate, they must do so over the network.
Figure 2-2

Container-based application deployments

Containers are ephemeral, and this ephemerality is one of the superpowers of containers. When a container is up and running, the container has state in terms of the actual program state and any file data changed inside it. A container can also be deleted, and when deleted, any program state and file data inside the container is deleted with it.

The ephemerality of containers is key to the concept of how container-based applications are deployed and maintained. Decoupling configuration and state from the container lifecycle itself is a core foundation of containers and also container orchestration. Techniques for decoupling configuration and state for containers are introduced later in this chapter with environment variables and Volumes and later in the book with Kubernetes constructs to help us achieve the same goals.

What’s So Hard About Virtual Machines?

Virtual machines have been firmly entrenched in enterprise IT as the platform of choice for about the last 20 years. We challenge you, the reader, to think about what virtualizing hardware actually gained you in your data center. You got better utilization of your hardware… That’s great. But what’s the cheapest thing in your data center? Your hardware. What’s the most expensive thing in your data center? You! Your time is the most expensive resource. When using virtual machines as our platform, little to no operational efficiency is added to our organization, because virtual machines do nothing to optimize an organization’s most expensive resource, its people.

Figure 2-3 shows the traditional implementation of virtual machines in a data center. Operations teams build the infrastructure, install the guest operating systems, and install all the applications on top of those OSs, and that’s the production environment. Operations teams put an enormous amount of effort into keeping the systems and the applications properly functioning in this architecture.
Figure 2-3

A traditional implementation of VMs and their applications in an enterprise data center

Here are some of the challenges of deploying applications with virtual machine–based platforms:
  • Operating system resource overhead: Running VMs has an inherent CPU and memory overhead. This CPU time and memory could be better spent supporting applications rather than operating systems.

  • Operating system patching: Updating operating systems adds very little business value to your organization. It certainly is required to maintain a proper security posture but does not move your business forward.

  • Troubleshooting: For years and years, systems were built, rolled out into production, and were left alone. If something broke, IT operations had to put on their capes and fix the system.

  • Operating system upgrades: We think the hardest thing in IT to do is upgrading an operating system, because if you upgrade your OS, what do you have to test? EVERYTHING! This tight coupling of application to OS means each time a change is made to the base OS, it injects risk into our system.

  • Deployments: End-to-end automated deployments of VMs and their applications are rare in enterprise IT. Moreover, these solutions are often custom-built, point solutions, which can be hard to maintain.

Do any of the challenges of running virtual machine–based platforms mentioned move your business forward? Is anything gained by using virtual machines? We don’t think so…and perhaps there’s a better way.

Containers

When using container-based application deployments, containers directly attack some of the challenges identified when deploying applications on virtual machine–based platforms. Let’s look at what containers bring to the table:

  • Speed : When compared with VMs, containers are significantly smaller. For example, a virtual machine with SQL Server installed on Windows at a minimum will be 60+GB before you include any user databases. The current container image for SQL Server is about 1.5GB in size. Moving a 1.5GB container image around a modern data center is relatively trivial. Deploying a 60+GB VM can take some time.

  • Patching: When it comes to patching an application, patching is a separate process from deployment. You’ll likely need additional tooling to do so. Leveraging containers, you can very quickly update your application by simply pulling a new container image for your application and starting a new container up on that newer version of your application. If configuration and state are properly decoupled from the container, our app can pick up and start functioning again on the newer version with little to no impact on the application users.

  • Troubleshooting: Due to the ephemerality of containers, a primary troubleshooting technique for container-based applications is to kill and redeploy the container. Since the container image is a known good starting point for our program’s state, one can simply restart a container and get back into a good state.

  • Operating system upgrades: When moving between versions of an operating system, a container can be deleted and recreated on the newer version of the OS. Since an application’s required libraries are contained within the container, risk is reduced when moving between versions of the operating system.

  • Fast and consistent deployments: When using container-based application deployment, deployments are written in code, and efficiencies are gained in how applications are deployed and maintained. Speed improves because there is no longer a reliance on humans to do the work, and consistency improves because the code representing the state of the system can be used repeatedly in deployment processes. This code is placed into source control and is the configuration artifact for the desired state of the system.

Deployment automation is no longer going to be an afterthought or something to strive for in enterprise IT; it will be the primary way applications are deployed – using source-controlled code defining the desired state of the system. Container-based deployment techniques provide IT organizations the ability to provide services to the business more quickly and consistently and enable IT to maintain infrastructure and applications more easily, adding to organizations’ operational efficiency. Application deployment and maintenance can get done faster and more confidently.

Both Docker and Kubernetes enable IT organizations to write code representing a system’s desired state. This code can then be updated effecting the desired changes to applications, platforms, and systems. Code can be written for initial deployments, applying updates, and patching container-based applications. These techniques can also be used to enable troubleshooting with greater efficiency and if needed build self-healing applications. Each of these concepts will be further explored in much detail later in the book.

The Container Universe

OK, so now that you are familiar with the definition of a container and how it fits into modern application deployment processes, let’s look at the container universe. There are a lot of emerging technologies and techniques, and we want to spend some time here familiarizing you with the names and players in this space.

The following list shows some of the names and players in the container universe:
  • Docker: In today’s container space, Docker is a technology more than anything. It is a container runtime and collection of tools that enables you to create and run container images and containers on an operating system, sharing the resources of that OS.

  • Docker Inc.: This is the company that built the tooling and drove the technology to enable containers. Docker Inc. has open sourced the core technologies behind its container runtime and has spun off several open source projects such as containerd (https://containerd.io/), the Open Container Initiative (www.opencontainers.org/), and more.

  • containerd: A container runtime that coordinates the lifecycle functions of containers, such as pulling container images and creating, starting, and stopping containers. containerd is used by Docker and Kubernetes, among others, to coordinate container lifecycle functions. In Kubernetes, the container runtime is a pluggable component; containerd is the de facto standard.

  • Other container runtimes: The world of containers isn’t all Docker on Linux. There are some other players in the game and other container runtimes available as well.
Note

In this chapter, we will use Docker as the container runtime for our single-container deployment scenarios. In later chapters, we will use containerd as the container runtime in our Kubernetes Clusters.

Getting and Running Containers

Let’s talk about what a container image is, how a container image is defined, and where container images live.

The following list highlights the key elements of container images:
  • Container image: Contains the code, application binaries, libraries, and environment variables to run our application. In the most basic terms, these are the things needed to run our application. A running container image is called a container.

  • Dockerfile: Defines the elements of a container image. It tells the container runtime which binary to start up when the container starts, which network ports to expose, and other critical information about the container image to be built.

  • Container registry: This is where images are stored. Docker Hub is one of many container registries and is a primary place to store and exchange container images. Repositories are ways to organize container images within a container registry.

The Container Lifecycle

Following along in Figure 2-4, you’ll see a container-based application’s lifecycle. When a developer is ready, they will build their application in their normal application development platform. Then they will write a Dockerfile for that application. This Dockerfile contains the information needed to build a container image for that application, such as which binary to start up when the container starts and which network port the application listens on, among many other possible configuration attributes and instructions to build the image. Once the Dockerfile is ready, the developer will tell Docker to build an image. This takes the information defined in the Dockerfile and creates a container image locally on the developer’s workstation. That container image is then pushed (uploaded) into a container registry, where it will sit until someone is ready to use it. When a user wants to start a container from that container image, they will pull the container image down to their operating system, the container runtime on that OS will create (run) a running container from the container image, and the application is then up and running in a container on that OS.
Figure 2-4

A container lifecycle
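To make this lifecycle concrete, here is a purely illustrative sketch of the build-and-publish workflow a developer might run. You will not be building any images in this book, and the registry name (registry.example.com), application name (myapp), and content directory (./site) are all hypothetical:

# A minimal, hypothetical Dockerfile for a static web application
cat > Dockerfile <<'EOF'
# Start from a known base image
FROM nginx:1.21
# Copy the application's content into the image
COPY ./site /usr/share/nginx/html
# Document the port the application listens on inside the container
EXPOSE 80
EOF
# Build the container image locally and tag it
docker build -t registry.example.com/myapp:1.0 .
# Push (upload) the image to a container registry
docker push registry.example.com/myapp:1.0
# On any machine with access to that registry, pull and run the image
docker pull registry.example.com/myapp:1.0
docker run --detach --publish 8080:80 registry.example.com/myapp:1.0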

OK, so enough talk. Let’s see how you can deploy SQL Server in a container on Docker. In this book, you will not be building container images. You will be using images available in public container registries. In this chapter, you will be working with SQL Server containers, and those images are available from the Microsoft Container Registry (mcr.microsoft.com).

Working with Container Images

To pull a container image, execute the docker pull command and specify the container image you want to pull. In the following example, the container image is coming from the container registry mcr.microsoft.com and the repository mssql/server, and to ask for a specific image, you specify the image tag, which here is 2019-latest. Listing 2-1 shows this command.
docker pull mcr.microsoft.com/mssql/server:2019-latest
Listing 2-1

docker pull command for latest SQL Server 2019 image

Figure 2-5 shows the resulting output of the command in Listing 2-1.
Figure 2-5

Output of docker pull

In the preceding example, the container image with the tag 2019-latest is being pulled. By convention, the maintainer of a container image repository defines a latest tag that points to the most recent version of their application. If you want to pull a specific version of a container image, you will want to get a list of available tags from the repository. For SQL Server, you can do that with the commands shown in Listing 2-2 (Bash) and Listing 2-3 (Windows).
curl -sL https://mcr.microsoft.com/v2/mssql/server/tags/list
Listing 2-2

Command on Bash

(Invoke-WebRequest https://mcr.microsoft.com/v2/mssql/server/tags/list).Content
Listing 2-3

Command on Windows

Figure 2-6 shows you a part of these commands’ outputs. There are many more container images available, but we’ve omitted some for brevity.
Figure 2-6

Abbreviated list of container images and their tags

If you want to pull a specific container image, you need to specify its tag. The command in Listing 2-4 will pull the container image associated with the tag 2019-CU9-ubuntu-18.04.
docker pull mcr.microsoft.com/mssql/server:2019-CU9-ubuntu-18.04
Listing 2-4

docker pull command to pull container image associated with specific tag

To get a list of images available on a local system, execute the docker image ls command shown in Listing 2-5. The command’s output shows the images that have been pulled to the local system.
docker image ls
Listing 2-5

docker image ls command

For each image, the output (Figure 2-7) shows the image’s repository, the tags, the image identifier (the IMAGE ID), the creation date, and the image size.
Figure 2-7

Output of docker image ls command

It is a common misconception that the creation date shown by the docker image ls command is the date on which the image was pulled. That’s not the case. The creation date really is the date on which the image was created.

A container image can have multiple tags. In the preceding output, if you look closely at the container IMAGE ID, you will notice that both container images have the same value for IMAGE ID. The tags 2019-latest and 2019-CU9-ubuntu-18.04 point to the same container image because the latest image at the time of this writing for SQL Server 2019 is CU9. When a new container image is published into the repository, it will have a new, unique container image ID. The repository administrator will update the latest tag to point to this newest image in that repository.
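If you want to confirm for yourself that two tags reference the same image, you can compare the image IDs directly. Here is a small sketch; the sha256 digests printed will differ on your system:

docker image inspect --format '{{.Id}}' mcr.microsoft.com/mssql/server:2019-latest
docker image inspect --format '{{.Id}}' mcr.microsoft.com/mssql/server:2019-CU9-ubuntu-18.04
# If both commands print the same sha256 value, the two tags point to the same image.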

Starting a Container

To start a container, execute the docker run command as shown in Listing 2-6; let’s walk through the example.
docker run \
    --env 'ACCEPT_EULA=Y' \
    --env 'MSSQL_SA_PASSWORD=S0methingS@Str0ng!' \
    --name 'sql1' \
    --publish 1433:1433 \
    --detach \
    mcr.microsoft.com/mssql/server:2019-CU9-ubuntu-18.04
Listing 2-6

docker run command

To run SQL Server in a container, a couple of things are required to configure SQL Server for its initial startup. As discussed earlier, decoupling configuration and state is key to running applications in containers, and here is an example of decoupling configuration. SQL Server exposes configuration points as environment variables, and you can inject configuration at runtime by specifying values for those environment variables. In the preceding command, you see --env 'ACCEPT_EULA=Y'. This specifies the value 'Y' for the environment variable ACCEPT_EULA. At startup, SQL Server will look for this value and start up accordingly. Similarly, the environment variable 'MSSQL_SA_PASSWORD=S0methingS@Str0ng!' is defined. This sets the sa password at container startup, which in this case is S0methingS@Str0ng!. While not required, a container name is specified with the --name 'sql1' parameter, which is useful when working with containers at the command line and gives us the ability to address the container by its name.

Tip

For more information on configurations available as environment variables, check out https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-environment-variables.
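If you want to double-check which environment variables a running container was started with, Docker can show you. A quick sketch, assuming the sql1 container from Listing 2-6 is up and running (note that this will print the sa password in clear text):

# Ask Docker for the environment defined on the container
docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' sql1
# Or list the environment from inside the container itself
docker exec sql1 printenv | grep MSSQL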

In addition to application configuration and name, to access the container-based application over the network, a port must be exposed. The parameter --publish 1433:1433 exposes a port from inside the container to one outside the container on the base operating system. Let’s unpack this a bit, as this is one place we often tripped up when we got started with containers. The first 1433 is the port on which the application is listening on the base operating system. By default, it will listen on the IP address of the host OS, so this is how users and other applications will access the container-based application, either locally on the same host or remotely from other hosts. The second 1433 is the port listening “inside” the container. More on this later in the discussion of container internals. Next is --detach, which tells the container runtime to detach the running process from standard out. This gives us control of our terminal back and runs SQL Server as a background process.
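If you ever need to check which host port a container’s application was published on, docker port will show the mapping. A quick sketch using the sql1 container:

docker port sql1
# Example output (yours may differ): 1433/tcp -> 0.0.0.0:1433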

Note

If you are having trouble starting a container, remove the --detach parameter so you can see the container’s log streamed to standard out on the screen. In SQL Server containers, this is the SQL Server Error Log. The most common issue we see when creating a container is that the sa password is not complex enough; this will surface quickly when looking at the Error Log. docker logs is also helpful in this scenario.

And finally, the specific container image to start this container from, and in this example, it is mcr.microsoft.com/mssql/server:2019-CU9-ubuntu-18.04.

If the docker run command is successful, it will print the container ID to standard out.

Execute docker ps in Listing 2-7 to list the containers running on a local system.
docker ps
Listing 2-7

docker ps command

Figure 2-8 shows the command’s output. The sql1 container is up and running. It also shows the container ID, which container image it started from, the command started when the container started up, the container name, when the container was created, and the container’s current status, which in this case has been up for 10 minutes.
Figure 2-8

List of containers running on a local system

What If Something Goes Wrong?

Use the docker logs command combined with the container name (Listing 2-8), in this case sql1, to get the output from the container. In SQL Server, the output you’ll find here is from the SQL Server Error Log, which will likely hold valuable information on why your container isn’t starting up.
docker logs sql1 | more
Listing 2-8

docker logs command

Accessing a Container-Based Application

In this case, the application is SQL Server, so let’s use the command line utility sqlcmd to access SQL Server. The code in Listing 2-9 shows a query to get the @@VERSION output.
sqlcmd -S localhost,1433 -U sa -Q 'SELECT @@VERSION' -P 'S0methingS@Str0ng!'
Listing 2-9

Command line utility sqlcmd to access SQL Server

In Figure 2-9, you can see a container running SQL Server 2019 CU9, which matches the container image specified at container startup.
Figure 2-9

Container image specified at container startup

Starting a Second Instance of SQL Server

To start a second instance of SQL Server 2019 CU9 as a container, you execute docker run again. The key differences will be a unique container name, in this case sql2, and also a unique port to publish on. In this case, the second SQL Server instance is available on port 1434 as displayed in Listing 2-10. To access this instance of SQL Server, applications will point to that port. In the following command, rather than use the full parameter names as we did in the previous docker run command, we are using abbreviated parameter names.
docker run \
     --name 'sql2' \
     -e 'ACCEPT_EULA=Y' \
     -e 'MSSQL_SA_PASSWORD=S0methingS@Str0ng!' \
     -p 1434:1433 \
     -d mcr.microsoft.com/mssql/server:2019-CU9-ubuntu-18.04
Listing 2-10

docker run command with unique container name

docker ps will again yield a list of the running containers.

The command’s output in Figure 2-10 shows both containers up and running, sql1 and sql2.
Figure 2-10

Output of docker ps command

Now that there are two containers up and running, let’s restore a database into one of those containers. In the book downloads, you’ll find a SQL Server database TestDB1.bak and a restore script restore_testdb1.sql.

The contents of restore_testdb1.sql can also be seen in Listing 2-11.
USE [master]
RESTORE DATABASE [TestDB1]
FROM DISK = N'/var/opt/mssql/data/TestDB1.bak'
WITH REPLACE
Listing 2-11

restore_testdb1.sql

Let’s walk through the process of restoring a database, looking inside the container to see the file layout, and then go through the lifecycle of running a container.

Restoring a Database to SQL Server Running in a Container

The command in Listing 2-12 copies an existing database backup into a container at the directory /var/opt/mssql/data inside the sql2 container, and the command in Listing 2-13 then sets the appropriate permissions on that copied backup file.

Due to the nature of the non-root SQL Server container in SQL Server 2019 (https://techcommunity.microsoft.com/t5/sql-server/non-root-sql-server-2019-containers/ba-p/859644), the permissions of the file copied into the container need to be adjusted so the sqlservr process inside can read it. On Linux systems, docker cp copies the file with the UID of the user on the base operating system executing the docker cp command. The sqlservr process inside the container, however, runs as the user mssql. The chown command in Listing 2-13 changes the ownership of the backup file to the user mssql so that the process can read the file.
docker cp TestDB1.bak sql2:/var/opt/mssql/data
Listing 2-12

docker cp command

docker exec -u root sql2 chown mssql /var/opt/mssql/data/TestDB1.bak
Listing 2-13

chown command inside the container

With that file in the correct location, execute the restore_testdb1.sql script, which contains the required T-SQL to restore this database. Notice we’re running this restore (Listing 2-14) from outside the container, using sqlcmd on a client workstation and pointing it at the correct server name (localhost) and port 1434.
sqlcmd -S localhost,1434 -U sa -i restore_testdb1.sql -P 'S0methingS@Str0ng!'
Listing 2-14

Execute restore script through sqlcmd

sqlcmd will confirm the successful restore as shown in Figure 2-11.
Figure 2-11

Successful restore of a database

With the database restored, let’s use the docker exec command in Listing 2-15 to get access to the inside of the container. This will allow us to explore the internals of our running container. The parameter -it gives you an interactive terminal for the process being executed, which in this case is /bin/bash, a bash prompt. In the following example, the prompt presented shows the username of the logged-in user, mssql, and a hostname that matches the container ID of the container.
docker exec -it sql2 /bin/bash
Listing 2-15

docker exec command

With this interactive bash shell that is running inside the container, let’s look around a bit. Execute a ps -aux command as shown in Listing 2-16 to list all processes running.
ps -aux
Listing 2-16

ps -aux command

In the output shown in Figure 2-12, you will see there is only a small set of processes running inside the container: two sqlservr processes, a bash shell, and the ps command. This example highlights the isolation a container has at runtime. This container and its running processes cannot see any other processes running on the base operating system.
Figure 2-12

Output of ps -aux command

Now, execute the directory listing (Listing 2-17) with ls -la /var/opt/mssql/data. This is the default database directory for SQL Server on Linux.
ls -la /var/opt/mssql/data
Listing 2-17

Directory listing

As you can see in Figure 2-13, you will find the system and user databases in this directory. You will also find the database backup file you copied in the preceding demo, the TestDB1.bak file. Each container has a separate file system. So the files in this directory are only available to this running container. If this container is deleted, these files will be deleted with the container. We will introduce data persistency for containers shortly.
Figure 2-13

Default database directory for SQL Server on Linux

To exit this container, use the exit command and return to our base OS’s shell.

Stopping a Container

A container running a daemon process like SQL Server will continue to run until it is told to stop. To stop a running container, execute the docker stop command in Listing 2-18 and specify the container name or the container ID. In the following example, the container sql2 is being stopped. This will send a SIGTERM signal to the processes running inside the container to gracefully shut down.
docker stop sql2
Listing 2-18

docker stop command

Finding Containers on a Local System

At this point, there are two containers on the local system. One container is currently stopped, sql2, and one is still running, sql1. Now execute a docker ps command.

In the output (Figure 2-14), there is only one running container, sql1.
Figure 2-14

Output of docker ps command. List of running containers

To see all of the containers on a system regardless of their current state, stopped or running, execute the docker ps -a command in Listing 2-19.
docker ps -a
Listing 2-19

docker ps -a command

In the output displayed in Figure 2-15, both containers are listed, sql1 and sql2. The key piece of information is the STATUS column. sql1 is still up and running, as indicated by a status of Up 16 minutes. For the other container, sql2, the status is Exited (0) 33 seconds ago; it is currently stopped. The 0 is the exit code from the application. A non-zero exit code indicates an error occurred inside the program; a zero (0) indicates a graceful shutdown.

Note

If you find a non-zero exit code, something went wrong, and you will want to use docker logs to investigate the issue for that container.

Figure 2-15

Output of docker ps -a command. List of all containers on the respective system
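The exit code is also available programmatically, which can be handy in scripts. A small sketch, assuming the sql2 container still exists on the system:

# Print just the exit code of the container's main process
docker inspect --format '{{.State.ExitCode}}' sql2
# List only the containers that have exited
docker ps -a --filter "status=exited"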

Starting an Existing Container

Since the container is still on the system, it can be restarted with docker start (Listing 2-20) and then specifying the container name. All of the state and configuration will still be there for this container, so our system and user databases will be there when the container is started up again.
docker start sql2
Listing 2-20

docker start command

We can then use sqlcmd to list the databases in our instance as shown in Listing 2-21.
sqlcmd -S localhost,1434 -U sa -Q 'SELECT name from sys.databases' -P 'S0methingS@Str0ng!'
Listing 2-21

List databases using sqlcmd

In the output in Figure 2-16, sql2 is started up again, showing the current databases on the system including the restored user database TestDB1.
Figure 2-16

List of databases

Now, let’s clean up and stop these containers using Listing 2-22.
docker stop sql1
docker stop sql2
Listing 2-22

docker stop command

Removing a Container

When the containers are stopped, docker rm (Listing 2-23) will remove a container from a system. This is when the data inside the containers will be destroyed. So the restored TestDB1 is gone once these containers are removed. Data persistency independent of the lifecycle of the container is covered in the next section. In the following example, both sql1 and sql2 are deleted.
docker rm sql1
docker rm sql2
Listing 2-23

docker rm command

The preceding example deleted the containers but not the container images. The container images are still on the local system and can be used as the source for new containers. Executing docker image ls (Listing 2-24) shows the container images that remain on the system.
docker image ls
Listing 2-24

docker image ls command

In Figure 2-17, you can see the container images are still on the local system.
Figure 2-17

Output of docker image ls command. Container images on the local system

Container Internals

Now, we want to take some time to look at container internals so that you can understand how the operating system implements and provides the isolation to processes and their resources when running inside containers.

A container is a running process with an isolated view of the underlying operating system and its resources. When a container is started from a container image, the container runtime is instructed to start a specific process that is defined in the container image. Also defined is which port the application is listening on, among other configuration information.

As shown earlier, a process listing executed inside the container shows only the processes running inside the container; no other processes on the system are visible. And even though the application is listening on port 1433 inside the container, to access the application, a unique port on the base operating system must be published… How can the operating system provide this isolation for our container-based applications on a single system? This is where Linux namespaces come in.

Namespaces

Linux kernel namespaces (http://man7.org/linux/man-pages/man7/namespaces.7.html) are a kernel construct that provides isolation for processes running on Linux. There are six core namespaces available in Linux, five for resource isolation and one for resource governing. Looking at the following list of namespaces, you can get a feel for what namespaces provide. They provide isolation for programs and the resources the programs are using from the base operating system – things like processes, files, networking, and more.

The five resource isolation namespaces provide processes running on the system access to the services of the underlying operating system:
  • PID: Process isolation

  • MNT: File system and mountpoint isolation

  • NET: Network device and stack isolation

  • IPC: Interprocess communication

  • UTS: Unique hostname and domain naming

The sixth namespace, cgroups, provides resource governance for processes running on the base operating system. This enables multiple processes to run on the operating system and gives the administrator control over resource sharing:
  • cgroups: Control Groups enable allocating and controlling access to system resources, like CPU, I/O, and memory.

For more information about how cgroups work, check out this link to the Linux man page: http://man7.org/linux/man-pages/man7/cgroups.7.html.
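On a native Linux host, you can see these constructs for yourself. The following is a small sketch, not something you need to run for this book; the container name sql3, the port 1435, and the resource limits are arbitrary examples, and <PID> is a placeholder for the process ID returned by the first command:

# Find the host process ID of the container's main process
docker inspect --format '{{.State.Pid}}' sql1
# List the namespaces that process belongs to (substitute the PID from above)
sudo ls -la /proc/<PID>/ns
# cgroups in action: cap a container's memory and CPU at creation time
docker run --name 'sql3' --memory 4g --cpus 2 \
    -e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=S0methingS@Str0ng!' \
    -p 1435:1433 -d mcr.microsoft.com/mssql/server:2019-CU9-ubuntu-18.04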

Union File System

A container image is read-only. When a container is running, any changes to files inside the container are written to a writable layer using a copy-on-write technique. The Union File System takes the container image’s base layer and the writable layer and presents both back to the application as a single unified file system. This technique enables us to start many containers from a single image and gain the efficiencies of reusing that container image’s layer as the starting point for many containers. Each container has a unique writable layer whose lifecycle is tied to the container: when the container is deleted, the writable layer is deleted too, which, if you are running a stateful application like SQL Server, does not sound too appealing. Techniques to provide data persistency to our container-based applications are coming up in the next section. The implementation details of Docker’s Union File System have changed over the years across AUFS, UnionFS, and OverlayFS, but those details are out of scope for this conversation.
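You can observe the writable layer at work with docker diff, which lists files that have been added (A), changed (C), or deleted (D) in a container relative to its image. A quick sketch against a running SQL Server container named sql1; the exact entries depend on what the instance has written so far:

docker diff sql1
# Output might include entries such as:
# C /var/opt/mssql/log
# A /var/opt/mssql/log/errorlog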

Note

If you want to dig further into how container images work, we encourage you to check out our colleague Elton Stoneman’s (@EltonStoneman) Pluralsight course “Handling Data and Stateful Applications in Docker” (https://app.pluralsight.com/library/courses/handling-data-stateful-applications-docker/table-of-contents).

Data Persistency in Containers

Containers are ephemeral, meaning when a container is deleted, it goes away…for good. In the preceding section, we introduced the idea that as data changes inside a running container, it is written into a writable layer, that the Union File System has the responsibility of joining the layers together to present a single unified file system to the container-based application, and that when a container is deleted, the writable layer is deleted as well. So can containers have data persistency across their lifecycle, from creation to deletion and creation again? You might also be asking, why would we need to delete a container? Shouldn’t we just be able to keep it up and running? Well, yes, you can keep a container up and running, but if you need to change out the base container image (perhaps you have a new container image for your application due to an upgrade or some sort of patching), you will need to delete the existing container and start a new container using that new container image.
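Here is a sketch of that replace-the-container pattern. The CU10 tag is used only as an illustration; list the repository’s tags, as shown in Listings 2-2 and 2-3, to see what is actually available:

docker stop sql1
docker rm sql1
docker pull mcr.microsoft.com/mssql/server:2019-CU10-ubuntu-18.04
docker run --name 'sql1' -e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=S0methingS@Str0ng!' \
    -p 1433:1433 -d mcr.microsoft.com/mssql/server:2019-CU10-ubuntu-18.04
# Without external persistence, this new container starts with a fresh file
# system - exactly the problem Docker Volumes solve, as discussed next.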

Docker Volumes

A Docker Volume (https://docs.docker.com/storage/) is a Docker managed resource that is independent of the lifecycle of the container. A Docker Volume allocates storage from the underlying operating system or shared storage and presents that storage into the container at a particular location inside the file system of the container.

Tip

Check out the Docker documentation for information on storage drivers by visiting https://docs.docker.com/storage/storagedriver/.
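Volumes can also be created and examined on their own, before any container uses them. A short sketch; note that the -v flag used in the upcoming example will create the named Volume automatically if it does not already exist, so this step is optional:

docker volume create sqldata1
docker volume ls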

With a Volume mounted at a location inside the file system in the container, as the application changes data at that file system location, it is written to the Volume. This Volume is backed by storage that is outside of the container. Now, if this container is deleted, the container and its writable layer are still deleted, but the Volume remains since it has a lifecycle independent of the container. Changes to other parts of the file system not backed by a Volume will not be persisted, but files written to the file location backed by the Volume will be. If a new container is created and the Volume is mounted in the container, the data stored in the Volume is accessible inside the new container. Figure 2-18 shows a container-based application, sql1, accessing a Volume named sqldata1.
Figure 2-18

A container with a Volume attached

Let’s look at some code to define a Docker Volume for a SQL Server container.

Creating a Container with a Volume

The code in Listing 2-25 shows a container starting up similar to our previous examples. The key difference is a data Volume is specified with the -v or --volume parameter.
docker run \
    --name 'sql1' \
    -e 'ACCEPT_EULA=Y' \
    -e 'MSSQL_SA_PASSWORD=S0methingS@Str0ng!' \
    -p 1433:1433 \
    -v sqldata1:/var/opt/mssql \
    -d mcr.microsoft.com/mssql/server:2019-CU9-ubuntu-18.04
Listing 2-25

docker run command – data Volume specification

Let’s unpack that line of code there… -v specifies the configuration of a Volume. This creates a named Volume sqldata1, which will allocate a Volume from the underlying operating system’s file system. The exact location is specific to the container runtime’s platform, Windows, Linux, or MacOS. After the colon, you’ll define where you want the Volume mounted inside the container, so this Volume is mounted at /var/opt/mssql, which is SQL Server’s default instance directory in SQL Server on Linux containers. Inside this directory, you’ll find the data files needed by SQL Server, such as the SQL Server Error Log, Trace files, Extended Event files, and system and user databases. Any data that’s written into /var/opt/mssql is going to be written into the Volume, which is a resource independent of the container.

Note

SQL Server’s binaries live in another part of the file system at /opt/mssql/bin. So, when a container image is replaced with a newer version of SQL Server, the new binaries will be used to start up the container, and our data will be read from /var/opt/mssql, which will persist between container instantiations.

So let’s see this in action and run through a series of demos using SQL Server and Docker Volumes where the following key points will be highlighted. First, starting up a container with a Volume mounted at /var/opt/mssql inside the container and restoring a database. Next, deleting that container. Then, creating a new container that uses that same Volume and finally observing that our data persists independent of the lifecycle of this container. Let’s get started.

In Listing 2-25, a container is defined with a Volume, sqldata1. This Volume is mounted in the file system of the container at /var/opt/mssql, so let’s run that command.

Now, with the container up and running, copy a database backup into the container and set the appropriate permissions on the backup file. Then restore the database using the same process as the previous section. The code example (Listing 2-26) highlights these three steps.
docker cp TestDB1.bak sql1:/var/opt/mssql/data
docker exec -u root sql1 chown mssql /var/opt/mssql/data/TestDB1.bak
sqlcmd -S localhost,1433 -U sa -i restore_testdb1.sql -P 'S0methingS@Str0ng!'
Listing 2-26

docker cp command

sqlcmd will again confirm the execution.

With the container up and running and the user database restored, check out the list of current databases on this instance of our SQL Server container to confirm the database restore was successful (Listing 2-27).
sqlcmd -S localhost,1433 -U sa -Q 'SELECT name from sys.databases' -P 'S0methingS@Str0ng!'
Listing 2-27

List all databases through sqlcmd

In Figure 2-19, the output is shown. TestDB1 is listed in the set of databases on this SQL Server instance.
Figure 2-19

List of databases on SQL Server instance

This container has a Volume attached and mounted at /var/opt/mssql in its file system. When SQL Server starts up for the first time, it places critical instance files and system databases into this directory. The user database restored in the previous example was placed into the subdirectory /var/opt/mssql/data, based on the code in the restore script (Listing 2-11). You can confirm the file locations with the query in Listing 2-28.
sqlcmd -S localhost,1433 -U sa -Q 'SELECT name, physical_name from sys.master_files' -P 'S0methingS@Str0ng!' -W
Listing 2-28

List all files and their physical names through sqlcmd

By querying sys.master_files, you can see (Figure 2-20) that all of the file locations for our databases are in /var/opt/mssql/data, which is contained within our Volume.
Figure 2-20

List of files and their locations

Note

The default user database and log file locations are configurable as environment variables.

Check out https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-environment-variables?view=sql-server-ver15 for more details on that. This topic will be examined further in Chapter 7.

The following commands will stop the container sql1 and remove it (Listing 2-29). This normally would destroy the data associated with this container…but it is now using a Volume.
docker stop sql1
docker rm sql1
Listing 2-29

Stop and remove the container sql1

Listing 2-30 creates a new container, and again it defines a Volume sqldata1. This is the same Volume used in the previous example and is a resource independent of the container, which you can see using the command docker volume ls. SQL Server’s instance directory and its system and user databases are inside this Volume. So, when SQL Server starts up, it will find the master database in /var/opt/mssql/data and then read configuration and state of the instance. Any defined user databases that are available inside the file system will also be brought online.
docker run \
    --name 'sql2' \
    -e 'ACCEPT_EULA=Y' \
    -e 'MSSQL_SA_PASSWORD=S0methingS@Str0ng!' \
    -p 1433:1433 \
    -v sqldata1:/var/opt/mssql \
    -d mcr.microsoft.com/mssql/server:2019-CU9-ubuntu-18.04
Listing 2-30

docker run command – creation of new container

With the container up and running, query the current set of databases using the command in Listing 2-31.
sqlcmd -S localhost,1433 -U sa -Q 'SELECT name from sys.databases' -P 'S0methingS@Str0ng!'
Listing 2-31

List all databases through sqlcmd

And in the output, you can see TestDB1 (Figure 2-21). Now, we do want to point out that it is not only the user databases that persist but also the system databases and the other files associated with the instance. So any configuration changes made to the instance will persist as well, for example, instance-level settings such as Max Server Memory; a quick example follows Figure 2-21.
Figure 2-21

List of databases on SQL Server instance
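As a quick illustration of that point, an instance-level setting changed with sp_configure is stored in the master database, which lives inside the Volume, so it survives the container being deleted and recreated. The 4096MB value here is just an example:

sqlcmd -S localhost,1433 -U sa -P 'S0methingS@Str0ng!' -Q "EXEC sp_configure 'show advanced options', 1; RECONFIGURE; EXEC sp_configure 'max server memory (MB)', 4096; RECONFIGURE;"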

Our Volume is a resource independent of the container, and you can see that by using the docker volume ls command (Listing 2-32).
docker volume ls
Listing 2-32

docker volume ls command

Figure 2-22 shows all of the currently defined Volumes on our system.
Figure 2-22

List of Docker Volumes

Looking Deeper into Volumes

Now, when working with Docker, one of our favorite commands is docker inspect. As shown in Listing 2-33, this command is used to get more detailed information about a resource; in the following example, executing docker volume inspect shows more detailed information about our Volume.
docker volume inspect sqldata1
Listing 2-33

docker inspect command

Let’s walk through some of this output, which you can see in Figure 2-23. First is CreatedAt, which is the date and time the Volume was created. Also available is the Driver, which is local. This means it is using the underlying operating system’s file system. Next is Mountpoint; this is the actual path on the base operating system that’s being exposed into the container. So, if you browse to this directory on the underlying operating system where the container is running, you will see the container’s files inside of the Volume, and in our example here, you will find the SQL Server instance’s files and its databases at this location.
Figure 2-23

Output of docker inspect – detailed information about Volume

Note

If you’re using Mac or Windows, these files are going to be abstracted away from you. Both Mac and Windows use virtualization technologies, so you can run Linux containers on those platforms. The actual file locations for these will be “inside” the VMs used to provide Linux kernel services to your container runtime. On a native Linux system running Linux containers, you will find these files at the actual file system location defined by Mountpoint.

Stopping and Removing Containers and Volumes

Now that we have highlighted the lifecycle of a container and also how to persist data externally from the container using a Volume, it is time to clean up our resources. We will now show you how to stop containers, delete containers, and also delete Volumes.

Executing docker stop sql1 (Listing 2-34) tells the SQL Server process to stop and then stop the container.
docker stop sql1
Listing 2-34

docker stop command

To delete a container, use docker rm (Listing 2-35) and then the container name, sql1. Since its data is stored in a Volume, a new container can be created later if you want to continue working with that data.
docker rm sql1
Listing 2-35

docker rm command

But since this demonstration is complete, delete the Volume, with docker volume rm sqldata1 (Listing 2-36). When the Volume is deleted, THIS is when the data will be destroyed. So use this with caution!
docker volume rm sqldata1
Listing 2-36

docker volume rm command

Modern Application Deployment

Now that we have discussed core container fundamentals like how to start containers, access those applications, and persist data independent of the container’s lifecycle, let’s shift the conversation to how containers are used in modern application deployment scenarios and introduce the need for container orchestrators.

So far in this chapter, we have showed the configuration to start up a container, expose that application on the network, and also attach persistent storage to a container. But how is this done at scale in production systems? Do you want to be logging into servers and typing docker run each time you need to start up a container? Do you want to be tracking which ports your applications are listening/published on? No, implementing that configuration and also tracking what resources are where and how to access those is not a trivial task. This is where container orchestrators come into play.

Let’s start off with an example application stack like the one in Figure 2-24.
Figure 2-24

Example application architecture

There are some basic questions on how this is deployed using containers:
  1. How are these container-based applications deployed in our data center, and how are they started up?

  2. Where do these container-based applications run in our data center and on which servers?

  3. How do these container-based applications scale, and what if we wanted to scale from 2 to 20 web servers to support a burst in workload?

  4. How do we consistently deploy this application stack?

  5. How do we deploy this in another environment for testing or perhaps in another data center or cloud?

  6. How do we or any of our applications access the services?

  7. What IPs or DNS names are associated with these applications?

Container orchestrators help answer these questions.

The Need for Container Orchestrators

A container orchestrator is software that helps manage the deployment of your container-based applications. Container orchestrators are based on the core concepts of desired state and controllers. Container orchestrators will figure out where to run your workload in a collection of compute resources in your data center or cloud, start those containers up, and keep those containers up and running and in the defined state.

Let’s introduce some of the key functionality of container orchestrators:
  • Workload placement: Given a collection of servers in a data center, selecting which servers to run containers on.

  • Managing state: Starting containers and also keeping them online. If something causes a container-based application to stop or become unavailable, a container orchestrator can react and restart the containers.

  • Speed and consistency of deployment: Code is used to define application deployments. A container orchestrator will deploy what is defined in that code. This code is used to quickly and consistently deploy our applications.

  • Hide complexity in Clusters: A container orchestrator exposes a programmatic API to interact with so users can be less concerned about the physical infrastructure for our applications and more focused on how applications are deployed.

  • Persistent application access endpoints: A container orchestrator will track which services are available and provide persistent access to the services provided by our container-based applications.

There are several different container orchestrators available, and in this book, the focus is on Kubernetes (https://kubernetes.io/), as it has become the standard for open source container orchestrators. Therefore, the remainder of this book is focused on how to build a Kubernetes cluster and deploy SQL Server into that environment.

More Resources

Check out our technical reviewer and container expert Andrew Pruski’s (@dbafromthecold) container blog series; if it has to do with containers, Andrew has likely blogged about it.
Also, make sure to take a look at the resources available from our good friend and all-around SQL Server expert Bob Ward (@bobwardms). SQL Server in containers is SQL Server on Linux. If you want to dive into how SQL Server on Linux works, be sure to check out Bob’s book Pro SQL Server on Linux, for which Anthony had the absolute pleasure of being the technical reviewer.

Summary

Kubernetes is a container orchestrator, and in this chapter, we have laid the foundation of how containers work. We showed what a container is and how containers provide application isolation. Containers are used to quickly deploy applications, and in our examples, we ran SQL Server in a container. One of the key concepts in this chapter is the need to decouple configuration and state from a container’s lifecycle, and the core tools for that are environment variables to inject configuration and Volumes to persist state (data) independently of a container’s lifecycle. These are core concepts that will be revisited and leveraged throughout the remainder of the book as you learn how to deploy SQL Server on Kubernetes.
