© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
B. WardSQL Server 2022 Revealedhttps://doi.org/10.1007/978-1-4842-8894-8_9

9. SQL Server 2022 on Linux, Containers, and Kubernetes

Bob Ward1  
(1)
North Richland Hills, TX, USA
 

In 2016 we shocked the database industry by announcing we would be releasing SQL Server on Linux operating systems (you can see the original blog post by Scott Guthrie at https://blogs.microsoft.com/blog/2016/03/07/announcing-sql-server-on-linux). This announcement led to the release of SQL Server on Linux with SQL Server 2017. We built SQL Server on Linux using an innovative approach with software called the Platform Abstraction Layer (PAL). This allowed us to bring SQL Server on Linux to the marketplace quickly without rewriting the entire engine. The PAL also provided compatibility so that the core engine code remained the same, allowing you to back up and restore databases across operating systems. This means that core engine capabilities like the query processor and columnstore indexes are exactly the same across SQL Server on Windows and Linux. In SQL Server 2019, we shored up most of the features that didn’t exist in SQL Server 2017 including replication, CDC, DTC transactions, machine learning, and data virtualization.

We made sure that our approach using the PAL would allow SQL Server to perform as fast as or faster than SQL Server on Windows (you can read the original PAL approach in a blog post by our engineering ream at https://cloudblogs.microsoft.com/sqlserver/2016/12/16/sql-server-on-linux-how-introduction). TPC-H results submitted by our partners proved the point (you can see examples of these at www.tpc.org/tpch/results/tpch_price_perf_results5.asp?resulttype=noncluster&version=3).

By supporting SQL Server on Linux, we opened up new possibilities with containers and support for Kubernetes. This allowed us to reach new customers and applications, which was never possible before.

SQL Server 2022 continues the support for Linux, containers, and Kubernetes. This chapter is not a deep dive into how SQL Server on Linux works. I actually wrote an entire book on that topic called Pro SQL Server on Linux. In this chapter I’ll cover differences for SQL Server 2022 but also review with you some of the fundamentals on how you can use and optimize SQL Server on Linux, containers, and Kubernetes.

SQL Server 2022 on Linux

SQL Server 2022 on Linux is very similar to SQL Server 2019 with a few minor differences. This includes the Linux distribution versions we support, enhanced support for DTC transactions, and some features we won’t support. For all the latest information on SQL Server Linux, stay connected to https://aka.ms/sqllinux.

What’s New for SQL Server 2022

SQL Server 2022 is officially supported on the following Linux distributions:
  • Red Hat Enterprise Linux 8.0–8.5 Server

  • Ubuntu 20.04 LTS

  • SUSE Linux Enterprise Server (SLES) 15

I often get questions about support for other Linux distributions like CentOS. SQL Server on Linux can run just fine on other Linux distributions, but we only officially support RHEL, Ubuntu, and SLES. These are distributions we have tested, and we have agreements with these companies to provide official production support.

Note

We do not support running SQL Server on Linux on Windows Subsystem for Linux (WSL). However, do we support running SQL Server containers with Docker on Windows including WSL integration.

New Capabilities Specific to Linux

One thing that is new for SQL Server 2022 for Linux is managing DTC transactions. We added DTC support for Linux in SQL Server 2019. However, there is limited support for monitoring and managing transactions, especially when there is an issue. In SQL Server 2022 we were able to add WMI support within the PAL to support this new capability. You can read more at https://docs.microsoft.com/sql/linux/sql-server-linux-configure-msdtc.

Also, as I described in Chapter 6 of the book, we now support T-SQL snapshot backups, so you can perform snapshot backups on Linux without writing a VDI program or relying on programs like Windows VSS.

There are some new enhancements for SQL Server 2022 that we do not support:
  • Azure extension for SQL Server during setup (you can use a different method to install the extension)

  • Multiple TCP connections for Distributed Availability Groups

  • TLS 1.3

  • Intel QAT backup compression

  • Microsoft Purview policies

Note

Synapse Link requires the self-hosted integration runtime (SHIR) as you learned in Chapter 3 of the book. SHIR only works on the Windows OS, but you can direct it to connect to a SQL Server on Linux instance.

This was the list at the time of the writing of this book. It is possible that we have added support for these features by the time we ship SQL Server 2022. Keep up with all the latest details at https://docs.microsoft.com/sql/linux/sql-server-linux-editions-and-components-2022#unsupported-features-and-services.

I received a question when presenting early on about SQL Server 2022 that went like this: “Bob, it doesn’t seem like you are investing much these days on Linux.” My response was that we are. We may not be investing in new features that are specific to Linux, but look at all the rich capabilities in the engine that work on both Windows and Linux including cloud-connected features, built-in query intelligence, Ledger for SQL Server, tempdb, Contained AGs, REST-based Polybase, and new T-SQL enhancements. That is the power of the compatible story we built for SQL Server on Linux.

Deploying SQL Server 2022 on Linux

Deploying SQL Server 2022 on Linux is the same as it has been with SQL Server 2017 and 2019. You choose your supported Linux distribution and use the package managers that come with the Linux distribution to install SQL Server. This is because we post Linux packages for you to use with RHEL, Ubuntu, and SLES. We provide instructions on how to download a repo file, which contains the source on the Web where our packages can be found.

Tip

Did you know you can browse all of our packages at https://packages.microsoft.com?

For example, on RHEL, there are two simple commands to install the core SQL Server engine package:
sudo curl -o /etc/yum.repos.d/mssql-server.repo https://packages.microsoft.com/config/rhel/8/mssql-server-preview.repo
sudo yum install -y mssql-server
Note

The package name will change from mssql-server-preview.repo to a different name like mssql-server-2022.repo when SQL Server becomes generally available. You can stay up to date at https://docs.microsoft.com/sql/linux/quickstart-install-connect-red-hat.

It is a very simple install process, and the core engine package has just about everything you need for the fundamentals all built into these steps.

There are a few unique aspects to SQL Server on Linux deployment:
  • In order to complete the setup, you must run the following command no matter what distribution you are using:

sudo /opt/mssql/bin/mssql-conf setup
mssql-conf is a script that can perform configuration operations outside of the SQL instance. The setup option is used to establish the admin password, choose an edition, and other options to complete the setup process.
  • Unlike SQL Server on Windows, other features may require you to install additional packages such as tools, full-text search, SSIS, Java extension and ML services, Polybase, and HA (for availability groups). You can get a list of all of these packages on the following site under the Package details section: https://docs.microsoft.com/sql/linux/sql-server-linux-release-notes-2022#supported-platforms.

  • One interesting aspect of Linux is that each cumulative update (CU) and General Distribution Release (GDR) is a separate package so you could install a CU directly without installing RTM and then have to apply a CU package (which is required for Windows).

What Else Should I Know?

  • SQL Server on Linux is installed as a service on Linux. Therefore, you can see the status of SQL Server (running, stopped, etc.) at any time with the following command:

systemctl status mssql-server
  • The default directory for data, log, log files, and backups is based on /var/opt/mssql. This directory must always exist, but you can change the default location for your files such as databases and backups.

  • SQL Server on Linux allows you to perform uninstalls, upgrades, offline installs, and automatic installs (using environment variables).

  • Licensing for SQL Server on Linux is the same as on Windows. You can use your existing SQL Server licenses on either operating system.

Ansible with RHEL

One of the nice capabilities unique to Red Hat Enterprise Linux is Ansible. Ansible provides a great platform for automation. Therefore, it can be a solution to automate the installation of many computers or virtual machines for SQL Server on Linux through a concept called a playbook.

We have worked with Red Hat to include a special role for using playbooks with SQL Server on Linux.

How to Connect and Use SQL Server 2022 on Linux

Once you have deployed SQL Server on Linux, you are going to want to connect and start using the instance with T-SQL statements. I’ve included some brief tips in this section of the chapter. If you are looking for exercises for SQL Server on Linux, you can easily use the scripts from Pro SQL Server on Linux (you will need to retrofit package names for SQL Server 2022) from https://github.com/microsoft/bobsql/tree/master/sqllinuxbook.

ssh vs. rdp

For those in the Windows world used to using Remote Desktop (rdp), Linux has a standard remote shell called secure shell (ssh), which is a command line interface into a bash shell on the computer or VM. There are many ways to use ssh to connect to your Linux computer or VM. Here are some ways I use ssh:
  • ssh just comes with PowerShell, so you can use it right from a PowerShell command prompt.

  • If you have deployed in Azure, the Azure Cloud Shell comes with a bash shell interface, and ssh is included.

  • I personally love using the free tool MobaXterm, which you can download from https://mobaxterm.mobatek.net. One thing that is nice about this tool is that there is a graphical interface to upload and download files from your local computer into Linux.

Connecting with Our Tools

Since SQL Server on Linux is a compatible SQL Server engine, you can use your favorite tool that normally connects to SQL Server on Windows such as SSMS.

Azure Data Studio (ADS) has become very popular and works well with Linux. One of the nice advantages of ADS is that it works cross-platform, so it can run on Windows, Linux, or macOS. ADS also comes with the concept of notebooks, which you have seen already in some exercises in this book.

The command line tool sqlcmd works well with SQL Server on Linux. There is a Linux version of the tool, which comes as a separate package to install depending on your Linux distribution.

We also have a cross-platform alternative command line tool called mssql-cli, which you can read more about at https://docs.microsoft.com/sql/tools/mssql-cli.

Configuring with mssql-conf

For SQL Server on Windows, you may be used to using the SQL Server Configuration Manager tool. For SQL Server on Linux, the equivalent tool is called mssql-conf. mssql-conf can be used to configure a variety of settings like enabling SQL Server Agent or setting trace flags. You can read more about all the options supported by mssql-conf at https://docs.microsoft.com/sql/linux/sql-server-linux-configure-mssql-conf.

Note

Configuration inside SQL Server is still supported by using ALTER SERVER CONFIGURATION or sp_configure.

Active Directory Authentication with adutil

SQL Server on Linux supports both SQL and Active Directory (AD) authentication for logins. When we first shipped SQL Server on Linux for SQL Server 2017 and even 2019, setting up AD authentication was fairly complex. We now have a tool called adutil that can simplify this process. The adutil tool works for SQL Server 2019 and 2022 and is also supported for containers. You can read more about how to use adutil at https://docs.microsoft.com/sql/linux/sql-server-linux-ad-auth-adutil-introduction.

Azure Extension for SQL Server

The Azure extension for SQL Server is not currently included in the setup process for SQL Server on Linux. But you can use a method that has already existed to support SQL Server on Azure Arc–enabled servers for Linux as documented at https://docs.microsoft.com/sql/sql-server/azure-arc/connect.

As I described in Chapter 3 of the book, the Azure extension for SQL Server is a Windows service that is part of the Azure Arc Agent framework, which also runs as a Windows service. On Linux, these programs run as daemon programs called himds (Arc agent) and SqlServerExtension.Service (SQL extension).

Optimizing SQL Server 2022 on Linux

Besides the typical SQL Server optimization techniques you use such as proper indexes, there are some optimizations to consider for running SQL Server on Linux:
  • Since we have shipped SQL Server 2017 with Linux, we have learned across customer experiences and benchmarks how to tune the Linux kernel and SQL Server in very specific ways to achieve optimal performance. There is a very detailed list of these recommendations at https://docs.microsoft.com/sql/linux/sql-server-linux-performance-best-practices. If you are just trying out SQL Server on Linux, you may not need to consider any or all of these. However, for the optimal production experience, I highly recommend you read through all of these.

  • There are some settings outside SQL Server on Windows like Instant File Initialization (IFI) and Lock Pages in Memory. These settings do not apply to SQL Server on Linux.

  • You may notice these two messages appear right after you install SQL Server on Linux:

ForceFlush is enabled for this instance
ForceFlush feature is enabled for log durability
My longtime colleague Robert Dorr spent a considerable amount of time researching SQL Server I/O consistency and performance on Linux. This work led to a series of configuration options we recommend, which you can read at https://docs.microsoft.com/sql/linux/sql-server-linux-performance-best-practices#linux-os-configuration under the section titled “SQL Server and Forced Unit Access (FUA) I/O subsystem capability.” And if you love reading the inside story, check out Bob’s blog post at https://techcommunity.microsoft.com/t5/sql-server-blog/sql-server-on-linux-forced-unit-access-fua-internals/ba-p/3199102.
  • Avoid the OOM killer.

Sounds like a horror movie, right? Well, if you encounter an out-of-memory (OOM) situation that kills your program, it can be horrific. SQL Server does a great job of dynamic memory management, but if it consumes too much memory within Linux, it could be subject to this problem. This is why we created a configuration option called memorylimitmb. By default we actually limit the overall memory SQL Server can use on Linux, so this may not be an issue for you. But I encourage you to read more details at https://docs.microsoft.com/sql/linux/sql-server-linux-performance-best-practices#sql-server-configuration under the section titled Set a memory limit with mssql-conf to ensure your instance will be stable.

HADR for SQL Server 2022 on Linux

SQL Server comes with powerful high-availability and disaster recovery (HADR) features including failover clustering, Always On Availability Groups (AGs), and a rich set of BACKUP/RESTORE options.

Let’s look at how each of these can be used with SQL Server on Linux.

Failover Clustering

On SQL Server on Windows, SQL Server is integrated with Windows Server Failover Cluster (WSFC) to support a shared storage automatic high-availability feature. Since WSFC is not available on Linux, you can use a software package called Pacemaker to support a SQL Server failover cluster. You can read more about how to set this up at https://docs.microsoft.com/sql/linux/sql-server-linux-shared-disk-cluster-configure.

Always On Availability Groups (AGs)

The core AG software is all built into the SQL Server engine. This is why for SQL Server on Linux (or Windows), you can set up a clusterless availability group (if you remember I showed you how to do this in Chapter 6 of the book for a Contained Availability Group). This option doesn’t provide automatic failover, but it does provide a replica scheme. We also call this a read-scale availability group because it can be a perfect solution to separate your read-only workloads. You can read more about setting up a read-scale availability group at https://docs.microsoft.com/sql/linux/sql-server-linux-availability-group-configure-rs.

We also support a fully automatic failover solution with AGs using Pacemaker supported on RHEL, Ubuntu, and SLES. You can read more about how to set this up for RHEL at https://docs.microsoft.com/sql/linux/sql-server-linux-availability-group-cluster-rhel. You can also set up this configuration with Azure Virtual Machines, which you can read about at https://docs.microsoft.com/azure/azure-sql/virtual-machines/linux/rhel-high-availability-stonith-tutorial.

HPE Serviceguard

Turns out one of the things we did is take our code to support how SQL Server controls failover for an Always On Availability Group and put it into an open source project, which you can find at https://github.com/Microsoft/mssql-server-ha. By doing this we allowed other partners to integrate their failover solution with SQL Server. And one of those solutions is from HPE called Serviceguard. You can read more about how to use HPE Serviceguard with SQL Server at https://docs.microsoft.com/sql/linux/sql-server-availability-group-ha-hpe.

BACKUP/RESTORE

All the BACKUP/RESTORE features for SQL Server are available on SQL Server on Linux including backup to Azure storage with the URL syntax and the new S3 object storage backup capabilities with SQL Server 2022.

You also read in Chapter 6 of the book that snapshot backups are supported using T-SQL and don’t require you to write a VDI program or rely on Windows VSS. In fact, we also call this feature cross-platform snapshot backups because we wanted to support snapshot backups on Linux.

SQL Server 2022 Containers

I remember back around 2010 when it became very popular to run SQL Server on a virtual machine. The power of abstraction from a bare-metal machine plus being able to consolidate multiple machines on one host revolutionized computing in many ways.

In recent years on Linux systems, the concept of a container has become extremely popular for applications, especially for developers. There is nothing fundamentally new for SQL Server containers in SQL Server 2022. But it is worth reviewing why containers can be used, how to use them, and some interesting scenarios where they can be quite powerful.

Keep up to date with all the latest on SQL Server containers at https://aka.ms/sqlcontainers.

Why Containers?

One myth for containers I always have to bust is that they are used to replace virtual machines. That is actually possible, but I’ve found they typically complement virtual machines. One powerful use of containers is to consolidate applications into multiple containers running in a virtual machine.

Containers are actually an instance of an immutable image that contains a program(s) that is run in an isolated manner with a complete private file system. A container only includes the files necessary to run the programs in it from the image.

For example, SQL Server does not require every file and process that runs on a Linux system. So now you can deploy multiple SQL Server containers in a virtual machine instead of deploying multiple virtual machines, saving space and resource usage. In fact, running multiple SQL containers is exactly how you can run multiple instances of SQL Server in the same virtual machine or computer for Linux since we don’t support named instances.

Another myth about containers is that they don’t perform as well as a normal process because they are abstracted from the underlying Linux OS for threads, memory, I/O, etc. This is not true. Containers are just run in the OS in an isolated manner (containers only know about processes inside them) but have direct access to OS resources. So a SQL container running on Linux should have the same performance as SQL Server on Linux.

Note

Make sure you are comparing the right environment with containers. SQL containers on Windows run on a Linux virtual machine hosted by Windows (or WSL). That is a layer of abstraction that is different than running a container directly in a Linux VM hosted by Linux.

Containers are run by a container runtime like Docker. Effectively a container runtime like Docker understands how to take a container image and execute the program(s) associated with the container in an isolated manner using native Linux APIs (such as namespaces and cgroups). Container runtimes also understand how to manage containers such as stopping and starting a container.

Containers have other advantages including:
  • Portability – Containers can run on any operating system that supports a container runtime like Docker, and you can be assured it is the same container image. So SQL Server containers can run on Windows, Linux, or macOS, and you can be assured it is the same SQL Server engine. Linux containers on Windows and macOS have some type of virtualization that allows them to run Linux programs. This portability is huge for development. Instead of trying to maintain a development server for SQL Server that all developers share, just provide them with SQL Server containers to run on their own environment for any OS.

  • Consistency – Let’s say you want all developers to use a specific cumulative update of SQL Server 2019 (or 2022 when we start shipping CU builds) along with a specific database schema and scripts. Containers provide that capability across any OS. Microsoft produces container images for each CU build of a major version of SQL Server: 2017, 2019, and 2022.

  • Efficiency – Patching SQL Server with a cumulative update on Windows requires you to install a separate program that installs the update on top of the existing RTM build or update. SQL Server on Linux has an upgrade option to upgrade to a newer CU build. Containers do not require any patching. In fact, you can’t patch a container. Instead, you switch containers. Read more in the section titled “The Container Switch Method” to see how it works.

Here is a talk I gave at SQLBits on SQL Server containers to learn more: https://sqlbits.com/Sessions/Event18/Inside_SQL_Server_Containers.

Using SQL Server 2022 Containers

I forgot to mention one of the most powerful capabilities of SQL Server containers. SQL Server containers are a preinstalled SQL Server (and by default Developer Edition, so it is free!). We have built our container images for SQL Server so that when you run them, SQL Server just starts up and executes. Part of the magic is that SQLSERVR.EXE was built to run as a standalone program (in effect, a daemon program) and our container image is set up to configure and start the SQL Server engine.

Running a SQL Server Container

I dusted off my exercises from SQL Server 2019 Revealed where I showed readers the basics of running a SQL Server container. You can find the SQL Server 2022 version of these scripts at https://github.com/microsoft/bobsql/tree/master/demos/sqlserver2022/containers.

How easy is it to run a SQL Server container? If you look at these scripts, you will find the first step is this command:
docker run `
 -e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=Sql2022isfast' `
 --hostname sql2022 `
 -p 1401:1433 `
 -v sql2022volume:/var/opt/mssql `
 --name sql2022 `
 -d `
 mcr.microsoft.com/mssql/server:2022-latest

This is an example using the Docker container runtime on Windows with PowerShell (Docker Desktop on Windows can now use the Windows Subsystem for Linux, which means it is run as a virtualized process and not a full Linux VM). By default images are cached locally on your machine. If you run a container and the image does not exist locally, it is first pulled into your local cache. You can pull the image yourself with the docker pull command.

Let’s look at the parameters of this command to run a container:

The -e parameters are to set environment variables (you can use any valid SQL Server Linux environment variable such as setting the edition).

The --hostname parameter is nice because this becomes @@SERVERNAME for the container. This will help scenarios like DTC, replication, and linked servers.

The -p parameter is a port mapping so that you can run multiple instances (multiple SQL instances using port 1433 would be a conflict).

The -v parameter is important. It maps the local directory in the container private file system to persisted storage. In this case Docker takes the name sql2022volume and associates this with a directory on the storage of my local computer that will survive any container restart or delete operation. Otherwise, when a container is deleted, the private file system would be deleted, which contains databases.

The --name parameter is a convenient name to use to manage the container with Docker (e.g., stop the container).

The -d parameter runs the program in the background so any stdout would be suppressed. (Without this the stdout of SQLSERVR.EXE would be displayed, which is the ERRORLOG file entries. Removing this parameter can be helpful to debug problems.)

The last parameter is the location of the container image. Microsoft hosts official product container images in the Microsoft Container Repository or mcr.microsoft.com. In this case the image name
mcr.microsoft.com/mssql/server:2022-latest

is this latest version of SQL Server 2022 images based on Ubuntu. We produce an image for each cumulative update. Using the “latest” tag just gives you the latest version. This was very convenient for me as I just changed 2019-latest to 2022-latest and at the time of the writing of this book, I got the latest preview build of SQL Server 2022.

Note

You can browse a list of container images based on RHEL at https://mcr.microsoft.com/v2/mssql/rhel/server/tags/list and Ubuntu at https://mcr.microsoft.com/v2/mssql/server/tags/list. To be supported you need to make sure and run a container image that matches the host OS. For example, Docker on Windows with WSL is based on Ubuntu, so only the SQL Server images based on Ubuntu are supported. You can learn more at https://docs.microsoft.com/troubleshoot/sql/general/support-policy-sql-server#guidelines.

This base SQL Server container image comes with the core SQL Server engine and tools like sqlcmd.

Get a complete reference for Docker at https://docs.docker.com/reference. I’ve been using Docker for some time, and it is important that you should know that in 2021 Docker announced some changes to their licensing, which could affect your ability to use Docker Desktop for free. Read more at www.docker.com/blog/updating-product-subscriptions. SQL Server containers are OCI compliant, so they are supported by other container runtime engines like podman (https://podman.io).

Since SQL Server containers are based on Linux, all the features supported and not supported by SQL Server on Linux are the same. One exception in SQL Server 2022 is that the Azure extension for SQL Server is not supported with containers.

Connecting to a SQL Server Container

Since this is the SQL Server engine, just connect to the container like you would any SQL Server with tools like SSMS or Azure Data Studio. Remember that if you used port mapping using the -p parameter, you need to use the new port number in the connection string. A simple example to connect to a container with sqlcmd outside the container can be found at https://github.com/microsoft/bobsql/blob/master/demos/sqlserver2022/containers/deploy/step4_querysql.ps1.

Since SQL Server containers include sqlcmd in the container image, you can also connect inside the container. Here is a script that uses sqlcmd inside the container to restore a database backup: https://github.com/microsoft/bobsql/blob/master/demos/sqlserver2022/containers/deploy/step3_restoredb.ps1.

Speaking of backups, you can restore a backup for a SQL Server container from the container file system. You first need to copy the backup file into the container file system. You can see an example with the script at https://github.com/microsoft/bobsql/blob/master/demos/sqlserver2022/containers/deploy/step2_copyintocontainer.ps1.

Building a Customized Container Image

You can see the power of containers, but what about all the other packages for SQL Server on Linux such as Polybase? How do they run in a SQL Server container? Docker includes the ability to build a new container image based on other images and a set of files like packages you install into the image. In fact, the core SQL Server container image is an image we build based on a base Linux OS image combined with our files for SQL Server.

The docker build command is used to build container images. You create a text file called a Dockerfile, which contains a set of commands on how to build the new image. You can see an example of how to build new packages combining the core SQL Server container image with other SQL Linux packages at https://github.com/microsoft/mssql-docker/tree/master/linux/preview/examples. Docker build doesn’t run a container. It is used to build an image that is run as a container with docker run.

This customization capability is very powerful. You can build your own container images that combine the SQL Server core image with your own set of scripts and even database backup files.

One tool to help build a complex set of container images and run them including applications is called docker-compose (https://github.com/docker/compose). Check out this example I built (thanks to my former colleague Vin Yu) to install SQL Server replication with containers using docker-compose at https://github.com/microsoft/bobsql/tree/master/demos/sqlserver2022/containers/replication.

The Container Switch Method

I mentioned earlier in this chapter that you don’t patch SQL Server containers. You switch them to apply a cumulative update. Here is how the magic works. When you run a container and specify a persistent volume with the -v command, any files in this volume survive a container restart or delete. If you put your SQL Server database files (including system databases) into a directory that is mapped to persisted storage, you can point any SQL Server container to this volume, one at time.

In addition, SQL Server cumulative updates are compatible with each other in a major version. Therefore, I could do the following:
  1. 1.

    Start a SQL Server 2022 RTM container image, pointing it to a persisted volume for /var/opt/mssql.

     
  2. 2.

    Let’s say the first cumulative update for SQL Server 2022 comes out and I want to use it.

     
  3. 3.

    Stop the SQL Server 2022 RTM container.

     
  4. 4.

    Then run a new SQL Server container based on SQL Server 2022 CU1, pointing it to the same persisted volume. The new SQL Server will start up and recognize the new master database and user databases. In a manner of minutes, I’ve updated SQL Server.

     
  5. 5.

    Let’s now say a problem occurs and you want to roll back to the SQL Server 2022 RTM container.

     
  6. 6.

    Stop the new SQL Server 2022 CU1 container and start the SQL Server 2022 RTM container. Within a few minutes, you have rolled back.

     

You can see examples of how to do this with SQL Server 2019 from SQL Server 2019 Revealed at https://github.com/microsoft/bobsql/tree/master/sql2019book/ch7_inside_sql_containers/update/dockerpowershell.

SQL Server 2022 on Kubernetes

Supporting Linux for SQL Server opened up a whole new world for us, including containers. Containers opened up the world to us for Kubernetes. Do you want to learn Kubernetes in a fun way? Nothing compares to The Illustrated Children’s Guide to Kubernetes at https://youtu.be/4ht22ReBjno.

Why k8s

Kubernetes originated at Google. The founders of Kubernetes like Brendan Burns (who now works for Microsoft) wanted a way to run containers at scale. Linux on its own didn’t provide for this, so they built a set of software services on top of Linux called Kubernetes. Brendan has built a really nice playlist of videos explaining Kubernetes at www.youtube.com/watch?v=daVUONZqn88&list=PLLasX02E8BPCrIhFrc_ZiINhbRkYMKdPT.

Since SQL Server supports containers, why not also support Kubernetes? Okay, enough of that long word. Let’s use k8s (k<eight letters>s) like the cool kids do. Admittedly the support for SQL Server on k8s is very basic. You can deploy SQL Server, connect to the instance, and store data in persisted storage. But k8s does provide some nice built-in capabilities like load balancers and basic high availability. If you are looking for a complete managed experience using k8s and SQL Server, you should take a look at Azure Arc–enabled SQL Managed Instance (Arc SQL MI) (https://aka.ms/azurearcsqlmi).

Keep up to date on the latest with SQL Server on k8s at https://aka.ms/sqlk8s. You can also watch a presentation I did for SQLBits to dive deeper at https://youtu.be/USfCJDoCMr8

It is worth understanding the basics so you can see how SQL Server on k8s works and whether it could be a platform for you to scale many SQL Server containers.

Deploying SQL Server on k8s

Let’s look at the basics of deploying a single SQL Server container on k8s.

What Are the Major Components of k8s?

There a few terms for k8s you need to understand when you see the deployment experience. This is a very basic set of definitions. For a more complete reference to these components and architecture, see the k8s documentation at https://kubernetes.io/docs/concepts/overview/components.

Let’s review these components through Figure 9-1 (not all components are shown here).

An illustration of the major K 8 components. It has a box marked cluster with 2 node boxes that have 2 and 1, pod private I P address boxes on the left and right.

Figure 9-1

k8s components

  • Cluster

  • Think of a cluster like a host that contains all the k8s internal software and a series of virtual machines or nodes (can be physical servers) to run containers.

  • Node

  • Think of a node like a virtual machine (it can be a physical machine) that can help orchestrate other nodes or run user programs as containers.

  • Control plane

  • All the internal k8s software that runs the system is called the control plane and runs on every node. For example, k8s is all API based, so the API server can take requests and orchestrate across nodes.

  • Pod

  • A pod is defined as one or more containers to execute on a node. It is the smallest component of execution in the k8s system.

  • Container runtime

  • Since containers are at the heart of execution of pods, a container runtime like Docker must exist on all nodes to pull container images and run containers.

  • Namespace

  • A namespace is a logical group of k8s objects and makes it very convenient to organize projects. Every k8s system has a default namespace, but I recommend you always create other namespaces for your pods and other k8s objects.

  • Load balancer service

  • You have seen connection abstraction with SQL Server in the form of a listener. Think of a load balancer service (another type is called a node port) as a listener. A load balancer is a way to abstract a connection to a fixed IP address and port so that if the underlying IP address changes, it will always be directed to the right location.

  • Storage class and PersistentVolumeClaim

  • Storage class is a mapping of the storage types on your system. For example, Azure Kubernetes Service has a storage class defined by the system called managed-premium that gives you access to Azure Premium managed disks. Think of a PersistedVolumeClaim (PVC) as a volume of a fixed size from the storage class. The key to using PVC is that your data is stored in the cluster independent of a node. If you store your database files on a PVC, they will be available no matter what node the pod for SQL Server is running.

  • Replicaset and Statefulset

  • One of the coolest features of k8s is the ability to declare a form of high availability for our pods. For example, you can tell k8s, “I always want one copy of my pod running at any point in time.” So if the pod fails (SQL Server crashes) or even the node fails (VM crashes), k8s will automatically try to start the pod again (could be on the same node or another node if one is available).

How Do I Get k8s?

K8s was built as an open source project, and you can deploy your own k8s cluster based on open source. For example, you can use the open source tool kubeadm to install k8s as documented at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/.

Most k8s customers use a licensed product for k8s production such as Azure Stack HCI, Red Hat OpenShift, or Rancher. You can use cloud-based k8s systems such as Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), and Amazon Elastic Kubernetes Service (EKS). The big advantage of licensed products is some of the k8s system is managed and you can receive commercial production support.

What Are the Deployment Steps?

We built a simple tutorial for you to try out SQL Server on AKS at https://aka.ms/sqlk8s.

I also built a series of scripts to explain the deployment steps in more detail at https://github.com/microsoft/bobsql/tree/master/demos/sqlserver2022/sqlk8s/deploy.

The concepts are for you to declare to k8s the definition of a pod with a single SQL Server container image. The command line tool kubectl is often used to state your declaration to k8s (kubectl uses the k8s API). You also often use a text file called a YAML file to provide details of your declaration.

In the sample scripts, you do the following steps:
  1. 1.

    Create a new namespace for k8s objects.

     
  2. 2.

    Create a secret for the admin password.

     
  3. 3.

    Create a new load balancer service mapped to port 1433.

     
  4. 4.

    Create a PVC from the managed-premium storage class.

     
  5. 5.

    Use a YAML file to deploy a pod using the SQL Server container image referencing the secret, load balancer, and PVC. In the YAML file, declare a Replicaset of 1 so a SQL Server pod is always running.

     

Deploying the pod is an asynchronous operation. Container images are pulled if they do not exist in a node and executed. The overall orchestration of pods is far more complex than this. K8s has sophisticated algorithms to decide which nodes to schedule and run pods. It also must have event notifications to look for pod failures or resource limits and take action.

Connecting and Using SQL Server on k8s

On a system like AKS, the load balancer IP address and port are available externally, so a tool like SSMS can be used to connect through the load balancer to the SQL instance running in the pod. In my example script at https://github.com/microsoft/bobsql/blob/master/demos/sqlserver2022/sqlk8s/deploy/step12_querysql.ps1, you can see a way to dynamically determine the IP address of the load balancer to connect to SQL Server.

K8s includes built-in services to monitor resource usage to ensure the system remains healthy. One issue we have seen with k8s and SQL Server is that SQL Server can use a lot of memory within the node. It could cause a trigger for k8s to forcefully fail the pod and move it to another node. This is because k8s contains algorithms to detect the amount of memory that should be available within a node to keep it healthy. SQL Server though doesn’t know about these algorithms. Famed SQL and Linux community expert Anthony Nocentino has a nice writeup on this scenario at www.centinosystems.com/posts/2019-09-28-memory-settings-for-running-sql-server-in-kubernetes including how you can address it.

High Availability with k8s

High availability is critical to a production SQL Server, so it is very nice that k8s provides built-in basic high availability within the system.

Basic HA with SQL and k8s

When you deploy a SQL Server pod on k8s and use a load balancer, PVC, and Replicaset, you have already declared to k8s you would like to use basic high availability.

Consider these two scenarios:
  • Pod failure

  • If for any reason SQL Server crashes, this is recognized as a pod failure, and k8s will automatically start a new pod, which will start a new SQL Server container typically on the same node if resources are available. Since all databases are on a PVC, this is like persisted storage for a SQL Server container. SQL Server will just run recovery on all databases.

  • A pod failure can occur within k8s for other reasons, and k8s may decide to start a new pod on a different node. Nodes have private IP addresses, but since you are using a load balancer, your connection is automatically redirected to the new node. And since you are using a PVC, all databases are available even on the new node. This is like a failover cluster with shared storage without you having to set up or configure any failover cluster software.

  • Node failure

  • The same situation can occur if a node fails assuming you have deployed multiple nodes when you built your k8s cluster. In fact, I recommend you have at least three nodes on any k8s cluster for SQL Server. One node will serve to orchestrate other worker nodes, and you will have at least two worker nodes for redundancy.

  • Here are my scripts to see HA in action with SQL Server on AKS at https://github.com/microsoft/bobsql/tree/master/demos/sqlserver2022/sqlk8s/ha.

Always On Availability Groups and k8s

In the preview of SQL Server 2019, we introduced a concept to deploy an Always On Availably Group integrated with k8s. Unfortunately, that capability did not make the final release of SQL Server 2019. The work we did there did evolve into the built-in AG feature (through the Business Critical service tier) that comes with Azure Arc–enabled SQL Managed Instance.

There is another solution that comes with our partner DH2i (https://dh2i.com). You can see how to deploy an AG on k8s with DH2i’s solution at https://docs.microsoft.com/sql/linux/tutorial-sql-server-containers-kubernetes-dh2i.

Helm Charts

Some find the k8s deployment experience a bit complicated. Along comes the concept of Helm charts (https://helm.sh). Think of Helm as a package manager for k8s. You can see how to deploy SQL Server using Helm at https://docs.microsoft.com/sql/linux/sql-server-linux-containers-deploy-helm-charts-kubernetes.

The World of Linux, Containers, and k8s

For many SQL Server on Linux has opened up new doors and possibilities. I’ve spoken to customers who have different parts of their organization use Windows, while others use Linux but need a consistent database platform like SQL Server across them.

But Linux support opens up so many other possibilities. SQL Server containers could change the way your organization builds applications. Imagine developers across your company using the OS of their choice to build applications against the same consistent SQL Server engine and database. In fact, the new Azure SQL Database local development experience (https://docs.microsoft.com/azure/azure-sql/database/local-dev-experience-overview) uses SQL Server containers. Or consider the fact that you can consolidate multiple SQL Server instances in a single VM with containers.

With containers comes k8s, and our Azure Arc–enabled SQL Managed Instance proves you can bring the power of Azure managed services to your infrastructure using k8s.

All of these tell the story of choice for SQL Server. Choice with compatibility. Your application doesn’t have to change, and databases are interchangeable no matter where SQL Server is deployed.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.143.17.127