© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
B. Ward, SQL Server 2022 Revealed, https://doi.org/10.1007/978-1-4842-8894-8_6

6. The Meat and Potatoes of SQL Server

Bob Ward, North Richland Hills, TX, USA
 

Several years ago, I was preparing for one of the famous “Conor and Bob” shows for the PASS Summit keynote. I created this elaborate idea involving machine learning, analytics, and other types of new features. I showed this to Conor, and he simply said, “Bob, I think we should stick to the meat and potatoes for our show.” Being from Texas, I knew what he meant: highlight the core engine capabilities of SQL Server.

Of course, SQL Server has capabilities for things like machine learning and analytics, but what Conor really meant was security, scalability (performance), and availability. For every major release, we have to invest in these three areas, or we really don’t have an engine.

The Core Engine Is the Meat and Potatoes of SQL Server

This chapter is all about the core engine areas of security, scalability (performance), and availability – plus some other “stuff” that doesn’t quite fit those categories. Built-in query intelligence is certainly a part of the core engine, but it deserved its own chapters. Here I’ll cover all the new features to help:
  • Secure your application and data.

  • Ensure your application can scale and perform.

  • Ensure your application is highly available.

And then a list of features that are just “pure goodness” for the engine. You may start reading this chapter and feel that this is “just a list of features.” But every new enhancement in this chapter has a story and a reason behind it. I’ll try to give you those details for many of these new features. These categories are not in a particular order, but I usually start with security.

Security

I’ve always felt our engineering team deeply cares about security, not just because SQL Server has been the least vulnerable database in the industry for well over a decade (this isn’t our data – you can see for yourself in the National Institute of Standards and Technology vulnerability database) but because we keep pouring new features into the product to help keep applications and data more secure.

For SQL Server 2022, this includes enhancements for data integrity, encryption, and authorization. The story all starts with a revolutionary innovation called Ledger for SQL Server.

Note

I consider Azure Active Directory (AAD) authentication, Microsoft Purview policy management, and Microsoft Defender all “security” features. But these are covered in more detail in Chapter 3 of the book.

Ledger for SQL Server

Ledger for SQL Server is a tamper-evident record of data stored in a SQL Server database. Blockchain technology was invented to provide a digital ledger that uses crypto hashes to ensure transactions are valid and can be trusted. The original blockchain concept was brought into the public eye in the late 2000s to support the cryptocurrency Bitcoin. In fact, when I’ve talked to some people about the term blockchain, they seem to think blockchain = bitcoin. The concept of a digital ledger has many applications other than supporting bitcoin trading. One of the major benefits of a blockchain is the ability for multiple parties to participate in a transaction system where no one party can be trusted to manage the authenticity of the data. One of the challenges for many blockchain implementations is that they are decentralized, requiring distributed algorithms that often cause performance issues and hinder data management capabilities. Performance and data management are two things at which SQL Server excels.

Background of Ledger for SQL Server

In 2018, Microsoft began to explore whether the worlds of blockchain and relational databases could converge into a digital ledger solution. According to Panagiotis Antonopoulos, Principal Software Engineer at Microsoft:

In 2018, together with Microsoft Research, we started the effort to identify an architecture that would bridge the gap between blockchains and relational databases to provide the benefits of both technologies. We evaluated different design options where SQL Server would run in a decentralized configuration, using the advanced consensus algorithms of blockchains, or in a centralized environment, more aligned with the traditional SQL Server setup. We analyzed the technical feasibility of these options but also talked to a large number of blockchain customers to understand the challenges they were facing when adopting the technology. This was a very educational exercise and made it clear for us that decentralization, despite its security benefits, was making blockchains expensive and complex to develop and manage, causing significant friction to their adoption by enterprise customers.

Note

If you recognize the name Panagiotis Antonopoulos, he is one of the lead developers for the incredibly cool technology Accelerated Database Recovery (ADR) that we introduced in SQL Server 2019. He just builds great software!

This research effort led to the development of Ledger for SQL Server. The work first started in Azure SQL Database. You can read about the original project at https://aka.ms/sqlledgerpaper.

Ledger for SQL Server uses proven cryptographic data structures from the blockchain world, such as hash chains and Merkle trees. (The whitepaper goes into the details of what a Merkle tree is for those of you who are interested.) One of the keys to Ledger for SQL Server is to leverage…well, SQL Server. Ledger for SQL Server is built into the product itself, requiring no application changes, special providers, or extra code. Just use T-SQL.

Panagiotis summed up the power of Ledger for SQL Server: “Although the data is stored centrally, individual organizations can use these digests to verify the data integrity. The simplicity and transparency of this solution allows us to commoditize blockchains and enable all SQL users to leverage their benefits at a very low cost and complexity.”

Because of this capability, I like to think of Ledger as a form of data integrity. Let’s learn how it works to see why.

How Does It Work?

Ledger for SQL Server is made up of the following components:
  • Ledger tables

  • Users use an extension of the CREATE TABLE syntax to create a ledger table. Ledger tables can either be updateable or append-only. Updateable ledger tables are perfect for auditing scenarios. Append-only tables only allow INSERT statements, which can be a great solution for multi-party trust scenarios.

  • Ledger history tables

  • Updateable ledger tables have a built-in history like a temporal table but with additional transaction information that ties into which SQL principal initiated the change.

  • Ledger views

  • When you create a ledger table, we automatically build a ledger view, which joins the ledger table with its history table to provide a consolidated picture of all changes.

  • Database ledger

  • The database ledger is a set of system tables that contains transaction details, crypto hashes, and Merkle trees for all transactions against all ledger tables in the database. This is one of the key components that provides a tamper-evident record of data.

  • Digests

  • The hash of the latest block in the database ledger is called the database digest. It represents the state of all ledger tables in the database at the time the block was generated. I like to think of a digest as a checksum of the blockchain in the database. A digest is the other key component for a tamper-evident record because it is generated and stored outside of SQL Server.

Figure 6-1 shows how these components work together for an updateable ledger table.

The flow starts with a bank regulator, goes through trusted storage of digests, and ends at the database ledger inside SQL Server.

Figure 6-1

Ledger for SQL Server with updateable ledger tables

In this example, a company such as a bank could create an updateable ledger table simply by using this extension to CREATE TABLE:
WITH
(
  SYSTEM_VERSIONING = ON,
  LEDGER = ON
);

SQL Server automatically creates a history table and a ledger view corresponding to the ledger table.

When any modifications are made to the table (INSERT, UPDATE, or DELETE), audits and crypto hashes of each transaction are recorded in the database ledger as well as blocks for the blockchain.

Digests (which are effectively JSON data) can be generated manually or automatically in a separate storage location. This allows someone like a bank regulator to use the digests to verify the integrity of the ledger at any point in time.

Figure 6-2 shows the components for an append-only ledger table.

The flow starts with a bank regulator, goes through trusted storage of digests, and ends at the database ledger inside SQL Server.

Figure 6-2

Ledger for SQL Server for append-only ledger tables

Append-only ledger tables don’t need a history table since you can only perform INSERT statements. Append-only ledger tables are great for multi-party scenarios because no one party can “update” the ledger table. Only new records can be added (even a DBA with sysadmin rights can’t make an update to fake the ledger).

With these components in mind, here is how Ledger provides data integrity, trust, and tamper-evident records of data:
  • Ledger tables have built-in auditing, so you can see the exact datetime and SQL principal for every modification to the table.

  • Append-only ledger tables allow only an “addition to the record” because they permit nothing but INSERT statements.

  • At any time you can verify that no one has tampered with ledger tables (say under the covers of the engine) through crypto hashes in the database ledger.

  • Digests, stored outside the ledger, can be used to verify no one has tampered with the database ledger.

To understand in more detail how these components work, let’s try out an example.

Exercise to Use Ledger for SQL Server

Let’s walk through an exercise to see how Ledger works. There are four parts to this exercise: using an updateable ledger table, using an append-only ledger table, protection from dropping a ledger table, and an interesting example when I tried to “hack” the ledger.

Prerequisites
  • SQL Server 2022 Evaluation Edition. You must configure SQL Server for mixed-mode authentication.

  • Virtual machine or computer with a minimum of two CPUs and 8GB of RAM.

  • SQL Server Management Studio (SSMS). The latest SSMS 18.x will work, but SSMS 19.x has a new visualization for ledger tables, so the examples in this exercise were done with the latest SSMS 19.x build.

  • Get the scripts for this exercise from the book samples at ch06_meatandpotatoessecuritysqlledger.

Note

Any names or identifiable information in this exercise are purely fictional and do not represent any real person.

Setting Up the Exercise

We will create two new SQL logins and a database for the exercise:
  1.
    Execute the script addsysadminlogin.sql to add a sysadmin SQL login. This script executes the following T-SQL statements:
    USE master;
    GO
    -- Create a login for bob and make him a sysadmin
    IF EXISTS (SELECT * FROM sys.server_principals WHERE NAME = 'bob')
    BEGIN
    DROP LOGIN bob;
    END
    CREATE LOGIN bob WITH PASSWORD = N'StrongPassw0rd!';
    EXEC sp_addsrvrolemember 'bob', 'sysadmin';
    GO
     
  2.
    Log in with the SQL login bob you created in step 1. Create the database and add a login for an “app” by executing the script createdb.sql. This script executes the following T-SQL statements:
    USE master;
    GO
    -- Create the ContosoHR database
    --
    DROP DATABASE IF EXISTS ContosoHR;
    GO
    CREATE DATABASE ContosoHR;
    GO
    USE ContosoHR;
    GO
    -- Create a login for the app
    IF EXISTS (SELECT * FROM sys.server_principals WHERE NAME = 'app')
    BEGIN
    DROP LOGIN app;
    END
    CREATE LOGIN app WITH PASSWORD = N'StrongPassw0rd!', DEFAULT_DATABASE = ContosoHR;
    GO
    -- Enable snapshot isolation to allow ledger to be verified
    ALTER DATABASE ContosoHR SET ALLOW_SNAPSHOT_ISOLATION ON;
    GO
    -- Create an app user for the app login
    CREATE USER app FROM LOGIN app;
    GO
    EXEC sp_addrolemember 'db_owner', 'app';
    GO
     

Exercise 1: Using an Updateable Ledger Table

Connect with the SQL login bob for all steps in this first exercise.
  1.
    Create an updateable ledger table for Employees by executing the script createemployeeledger.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    -- Create the Employees table and make it an updateable ledger table
    DROP TABLE IF EXISTS [dbo].[Employees];
    GO
    CREATE TABLE [dbo].[Employees](
          [EmployeeID] [int] IDENTITY(1,1) NOT NULL,
          [SSN] [char](11) NOT NULL,
          [FirstName] [nvarchar](50) NOT NULL,
          [LastName] [nvarchar](50) NOT NULL,
          [Salary] [money] NOT NULL
          )
    WITH
    (
      SYSTEM_VERSIONING = ON,
      LEDGER = ON
    );
    GO
    If you are using SSMS 19.X, then you can see the visual properties of ledger tables and the corresponding history table name as in Figure 6-3.

    Object Explorer shows dbo.Employees as an updateable ledger table, along with dbo.MSSQL_LedgerHistoryFor_901578250 (the history table) and dbo.Employees_Ledger (the ledger view) under the ContosoHR database.

    Figure 6-3

    Ledger tables and views in SSMS Object Explorer

     
  2.
    Populate initial employee data using the script populateemployees.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    -- Clear Employees table
    DELETE FROM [dbo].[Employees];
    GO
    -- Insert 10 employees. The names and SSN are completely fictional and not associated with any person
    DECLARE @SSN1 char(11) = '795-73-9833'; DECLARE @Salary1 Money = 61692.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN1, 'Catherine', 'Abel', @Salary1);
    DECLARE @SSN2 char(11) = '990-00-6818'; DECLARE @Salary2 Money = 990.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN2, 'Kim', 'Abercrombie', @Salary2);
    DECLARE @SSN3 char(11) = '009-37-3952'; DECLARE @Salary3 Money = 5684.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN3, 'Frances', 'Adams', @Salary3);
    DECLARE @SSN4 char(11) = '708-44-3627'; DECLARE @Salary4 Money = 55415.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN4, 'Jay', 'Adams', @Salary4);
    DECLARE @SSN5 char(11) = '447-62-6279'; DECLARE @Salary5 Money = 49744.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN5, 'Robert', 'Ahlering', @Salary5);
    DECLARE @SSN6 char(11) = '872-78-4732'; DECLARE @Salary6 Money = 38584.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN6, 'Stanley', 'Alan', @Salary6);
    DECLARE @SSN7 char(11) = '898-79-8701'; DECLARE @Salary7 Money = 11918.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN7, 'Paul', 'Alcorn', @Salary7);
    DECLARE @SSN8 char(11) = '561-88-3757'; DECLARE @Salary8 Money = 17349.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN8, 'Mary', 'Alexander', @Salary8);
    DECLARE @SSN9 char(11) = '904-55-0991'; DECLARE @Salary9 Money = 70796.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN9, 'Michelle', 'Alexander', @Salary9);
    DECLARE @SSN10 char(11) = '293-95-6617'; DECLARE @Salary10 Money = 96956.00; INSERT INTO [dbo].[Employees] ([SSN], [FirstName], [LastName], [Salary]) VALUES (@SSN10, 'Marvin', 'Allen', @Salary10);
    GO
     
  3.
    Examine the data in the Employees table using the script getallemployees.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    -- Use * for all columns
    SELECT * FROM dbo.Employees;
    GO
    -- List out all the columns
    SELECT EmployeeID, SSN, FirstName, LastName, Salary,
    ledger_start_transaction_id, ledger_end_transaction_id, ledger_start_sequence_number,
    ledger_end_sequence_number
    FROM dbo.Employees;
    GO

    Notice there are “hidden” columns that are not shown if you execute a SELECT *. Some of these columns are NULL or 0 because no updates have been made to the data. You normally will not use these columns, but you can use the ledger view to see information about changes to the Employees table.

     
  4.
    Look at the Employees ledger view by executing the script getemployeesledger.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    SELECT * FROM dbo.Employees_Ledger;
    GO

    This is a view that uses the Employees table and the ledger history table. Notice the ledger has the transaction information from hidden columns in the table plus an indication of what type of operation was performed on the ledger for a specific row.

     
  5.
    Let’s look at the definition of the ledger view by executing the script getemployeesledgerview.sql. This script executes the following T-SQL statements:
    USE [ContosoHR];
    GO
    sp_helptext 'Employees_ledger';
    GO
    The results of this query should look like the following T-SQL code:
    CREATE VIEW [dbo].[Employees_Ledger] AS
    SELECT [EmployeeID], [SSN], [FirstName], [LastName], [Salary], [ledger_start_transaction_id] AS [ledger_transaction_id], [ledger_start_sequence_number] AS [ledger_sequence_number], 1 AS [ledger_operation_type], N'INSERT' AS [ledger_operation_type_desc]
    FROM [dbo].[Employees]
    UNION ALL
     SELECT [EmployeeID], [SSN], [FirstName], [LastName], [Salary], [ledger_start_transaction_id] AS [ledger_transaction_id], [ledger_start_sequence_number] AS [ledger_sequence_number], 1 AS [ledger_operation_type], N'INSERT' AS [ledger_operation_type_desc]
    FROM [dbo].[MSSQL_LedgerHistoryFor_901578250]
    UNION ALL
    SELECT [EmployeeID], [SSN], [FirstName], [LastName], [Salary], [ledger_end_transaction_id] AS [ledger_transaction_id], [ledger_end_sequence_number] AS [ledger_sequence_number], 2 AS [ledger_operation_type], N'DELETE' AS [ledger_operation_type_desc] FROM [dbo].[MSSQL_LedgerHistoryFor_901578250]

    Let’s break down how the view constructs its results. Rows in the Employees ledger table itself appear as an INSERT. An UPDATE shows up through the history table as both an INSERT and a DELETE, and a DELETE appears in the history table as a DELETE. Note that the history table will always be named MSSQL_LedgerHistoryFor_<object id>, where <object id> is the object_id of the base ledger table. A UNION ALL is used to bring these together.

     
  6.
    Let’s use the ledger view to get more auditing information about transactions associated with the ledger table by executing the script viewemployeesledgerhistory.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    SELECT e.EmployeeID, e.FirstName, e.LastName, e.Salary,
    dlt.transaction_id, dlt.commit_time, dlt.principal_name, e.ledger_operation_type_desc, dlt.table_hashes
    FROM sys.database_ledger_transactions dlt
    JOIN dbo.Employees_Ledger e
    ON e.ledger_transaction_id = dlt.transaction_id
    ORDER BY dlt.commit_time DESC;
    GO

    I’ve introduced a new system table in this query (which is really a view) called sys.database_ledger_transactions. We can use the transaction ID from the ledger view to join to this table to gain more insights. From this system table, we can see when a transaction was executed, which SQL principal executed the change, and a crypto hash value for the transaction. You can see that “bob” inserted all the rows for the ledger table. You can see a complete definition of this system table at https://docs.microsoft.com/sql/relational-databases/system-catalog-views/sys-database-ledger-transactions-transact-sql.

     
  7.
    Let’s generate a digest to ensure we can verify the ledger by executing the script generatedigest.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    EXEC sp_generate_database_ledger_digest;
    GO

    The result of this procedure is JSON data. You could verify the ledger at this point, but instead we will run this again after an update to verify the ledger, so you don’t need to save the output of this query.

     
  8.
    You can see the generated blockchain in the database ledger by executing the script getledgerblocks.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    SELECT * FROM sys.database_ledger_blocks;
    GO

    The system table database_ledger_blocks (which is really a view) represents the internal blockchain in the database ledger.

     
  9.
    Try to update Jay Adams’s salary to see how Ledger can track changes by executing the script updatejayssalary.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    UPDATE dbo.Employees
    SET Salary = Salary + 50000
    WHERE EmployeeID = 4;
    GO
     
  10.

    Execute the script getallemployees.sql again. If you didn’t know anything about what the original data was in the Employees table, you would not know if any updates had occurred.

     
  11.

    Execute the script viewemployeesledgerhistory.sql again. Now the results show what we need. We can see the original INSERT for Jay Adams’s salary, but we also see that someone named bob updated it directly (the INSERT/DELETE combination).

    Generate another digest by executing the script generatedigest.sql again. Save this output value (including the brackets) to be used for verifying the ledger.

     
  12.
    You can now verify the ledger (1) after the initial population of data and (2) after the update was made to Jay Adams’s salary. Verify the current state of the ledger by first editing the script verifyledger.sql. Put your most recent saved digest JSON inside the N' ' quotes. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    EXECUTE sp_verify_database_ledger
    N'<saved digest JSON>'
    GO

    Execute the script. Your output should list the same block_id as in the digest JSON data.

    You have now verified that the ledger tables match the crypto hashes, the blockchain, and the saved digest. You can now trust that the ledger data is accurate and that bob is the one who updated Jay Adams’s salary.

     

Exercise 2: Using an Append-Only Ledger

While Ledger for SQL Server is great for auditing which SQL principal makes any transaction change, what about scenarios where an application uses an “application login” to execute all SQL queries? How will you know which application user initiated the operation that led to a change of data?

For this scenario, we will build an append-only ledger table, which records information from the application including the user-initiated application operations that lead to SQL changes. With one exception noted in this exercise, execute all queries while connected with the bob login created in the first exercise.
  1.
    Create an append-only ledger table for auditing the application by executing the script createauditledger.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    -- Create an append-only ledger table to track T-SQL commands from the app and the "real" user who initiated the transaction
    DROP TABLE IF EXISTS [dbo].[AuditEvents];
    GO
    CREATE TABLE [dbo].[AuditEvents](
          [Timestamp] [Datetime2] NOT NULL DEFAULT (GETDATE()),
          [UserName] [nvarchar](255) NOT NULL,
          [Query] [nvarchar](4000) NOT NULL
          )
    WITH (LEDGER = ON (APPEND_ONLY = ON));
    GO

    The application has been enhanced to insert a row in this table for any transaction executed against the Employees table. Notice you do not need the SYSTEM_VERSIONING clause.

     
  2.
    Now log into a new connection using the app login created in the first exercise (this login was created in the createdb.sql script). Execute the script appchangemaryssalary.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    UPDATE dbo.Employees
    SET Salary = Salary + 50000
    WHERE EmployeeID = 8;
    GO
    INSERT INTO dbo.AuditEvents VALUES (getdate(), 'bob', 'UPDATE dbo.Employees SET Salary = Salary + 50000 WHERE EmployeeID = 8');
    GO

    This script simulates what the application would do if a user tried to update Mary’s salary.

     
  3.

    Using the connection for the sysadmin login bob, execute the script viewemployeesledgerhistory.sql. You can see from the results that the app login changed Mary’s salary. This is accurate, but who really initiated the change?

     
  4.
    Execute the script getauditledger.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    SELECT * FROM dbo.AuditEvents_Ledger;
    GO

    You can see that the application registered that bob initiated a query (the app login records the query) that resulted in Mary’s salary change. Since the table is append-only, no other user could go in and try to erase any record of what the application did.

     

Exercise 3: Protect Ledger Objects

In this exercise, you will learn how SQL Server protects ledger objects.
  1.
    Execute the script getledgerobjects.sql to see a history of what ledger tables and columns have been created. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    SELECT * FROM sys.ledger_table_history;
    GO
    SELECT * FROM sys.ledger_column_history;
    GO
     
  2.
    Try to alter or drop ledger objects. Execute the script admindropledger.sql. This script executes the following T-SQL statements:
    USE ContosoHR;
    GO
    -- You cannot turn off versioning for a ledger table
    ALTER TABLE Employees SET (SYSTEM_VERSIONING = OFF);
    GO
    -- You cannot drop the ledger history table
    DROP TABLE dbo.MSSQL_LedgerHistoryFor_901578250;
    GO
    -- You can drop a ledger table
    DROP TABLE Employees;
    GO
    -- But we keep a history of the dropped table
    SELECT * FROM sys.objects WHERE name like '%DroppedLedgerTable%';
    GO

    You can see from the output you cannot turn off versioning for a ledger table (there is no syntax to set LEDGER = OFF). You cannot drop the history table either. You can drop the ledger table, but we keep a record of what was dropped. And you can’t drop the “dropped” ledger table. Furthermore, even if you drop a ledger table, the transactions associated with this table remain in the database ledger (system tables). The only way to remove the existence of dropped ledger tables would be to drop the database.

    You can also execute the script getledgerobjects.sql again to see the drop history. SSMS in Object Explorer can also show you what ledger tables were dropped.

     

Exercise 4: What Does a Tampered Ledger Look Like?

When I first heard about Ledger, I thought to myself, “I know internals pretty well. Can I beat the system?” So I went on a journey of trying to tamper with the ledger.

I can’t share the details of what I did, but effectively after a basic T-SQL update, I attempted ways that were outside the boundaries of supported SQL to hack the ledger. I attempted to conceal an update to a ledger table by manipulating the database ledger system tables in an extremely undocumented way. I actually found a way to get this working until Panagiotis said to me, “What about the digest?” Got me! I literally could hack my way into SQL Server to make it look like my update never existed, but the digest is on separate storage, so I couldn’t hack that. You can see the results of my early tries at this in the T-SQL notebook ledger.ipynb in the samples directory. You can also see me do this live with Buck Woody at the 2022 SQLBits keynote at https://youtu.be/_R9FE2ZclVk (start at about 12 minutes in).

What Else Should You Know?

Ledger for SQL Server is amazing technology in so many ways. For me, what I’ve realized is that SQL Server now provides solutions for industry problems like multi-party trust scenarios, which SQL has not traditionally been used to solve.

In fact, very early on we had customers start to embrace this technology to solve real problems. Lenovo saw an opportunity to build trust into their supply chain. Their supply chain spans multiple parties, but who oversees the data? That is the power of Ledger. You can read Lenovo’s story at https://customers.microsoft.com/story/1497685499820529889-lenovo-manufacturing-azure-SQL-database-ledger.

As a team, we are committed to finding new and innovative ways to bring Ledger into the market where SQL was never thought of as a solution.

One very nice feature added to Ledger for SQL Server is automatic digest storage. In the example, you learned how to manually generate and use a digest. SQL Server 2022 can be configured to automatically save a digest to Azure Blob Storage. You can read all the details at https://docs.microsoft.com/sql/relational-databases/security/ledger/ledger-how-to-enable-automatic-digest-storage.

As with any feature, there are some limits, and Ledger is no different. Keep up to date with all the limits at https://docs.microsoft.com/sql/relational-databases/security/ledger/ledger-limits.

As I worked on Ledger, I built up my own set of questions that I worked on with Panagiotis and Pieter Vanhove, our lead program manager over Ledger:
  • How is this different from a temporal table?

    Ledger uses temporal table functionality for updateable ledger tables, but unlike temporal tables it has built-in auditing and supports append-only tables.

  • How is this different from SQL Server Audit?

    Ledger has auditing baked into transaction changes, and all the audits are stored in the database. Furthermore, all changes are tracked with crypto hashes and separate digest storage. Once a ledger table is created, you can’t undo it, whereas a system admin with the right privileges can erase audits. The only way to erase ledger tables once created is to completely drop the database, but even then, separate digests exist.

  • How often do I need to save the digest?

    Digests are the primary mechanism to verify the ledger from tampering. Theoretically, you would need to generate a separate digest for every transaction, but that is not practical. Save digests as often as you need to verify the ledger for your business needs. Automatic digest management saves a digest every 30 seconds.

  • Does Ledger require more space?

    Updateable ledger tables require pretty much the same extra space as a temporal table. Each transaction requires separate system table rows for crypto hashes. There is minimal space required for the internal blockchains, but it all depends on how long you keep the ledger. Digests are small, but again you need to keep their history as long as you keep the database with ledger tables.

  • Is there any performance impact?

    We have not run specific performance benchmarks with Ledger because we believe the impact will be minimal. Consider that append-only ledger tables should not see any impact. Updateable ledger tables could see the same impact as temporal tables.

Keep up with all the latest information on Ledger for SQL Server at https://aka.ms/sqlledger.

Encryption Enhancements

Encryption of data is a key aspect of security, and SQL Server provides several different capabilities to encrypt data end to end. End to end means the connection to SQL Server, data inside the SQL Server engine, and data stored at rest. To read about the complete lineup of encryption capabilities of SQL Server, check our documentation at https://docs.microsoft.com/sql/relational-databases/security/encryption/sql-server-encryption.

SQL Server 2022 includes several encryption enhancements, covering Always Encrypted, cryptography, and connection encryption.

Always Encrypted Enhancements

In SQL Server 2016 we introduced a new concept for end-to-end encryption called Always Encrypted. The concept is that data is encrypted at the client application and always encrypted in the SQL Server engine (hence the name). Furthermore, the keys to decrypt the data are owned by the application. As a result, administrators of SQL Server cannot decrypt the data; only the application can. If you are not familiar with Always Encrypted, you can read about how to configure and use it at https://docs.microsoft.com/sql/relational-databases/security/encryption/always-encrypted-database-engine.

Because data can never appear in plaintext inside the SQL Server engine, there were limitations with the original design. For example, queries using operators such as LIKE would not work because the engine would need to decrypt the data into plaintext to perform the operation.

In SQL Server 2019 we enhanced Always Encrypted with a concept called secure enclaves. A secure enclave is a protected region of memory inside the SQL Server engine. Even code in the database engine cannot see plaintext data inside the enclave. Enclave technologies provide a method for SQL Server to pass keys from the application to the enclave so that operations that require plaintext, such as pattern matching, can be performed there. SQL Server 2019 supports virtualization-based security (VBS) enclaves (Azure supports hardware enclaves through Intel Software Guard Extensions (Intel SGX)). Since decryption happens on the server, I have personally found that using secure enclaves can also boost performance for almost any operation on columns that are encrypted with Always Encrypted.

You can read the complete story of Always Encrypted with secure enclaves at https://docs.microsoft.com/sql/relational-databases/security/encryption/always-encrypted-enclaves.

In SQL Server 2022 we have enhanced Always Encrypted with secure enclaves with the following new capabilities:
  • SQL Server 2022 can use multiple threads and key caching inside the VBS secure enclave to further boost performance.

  • SQL Server 2022 supports new join types as well as ORDER BY and GROUP BY operations on enclave-encrypted columns, in line with Azure SQL Database support (see the sketch following this list).
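Here is a minimal sketch of the kind of query that now works against an enclave-enabled encrypted column. It assumes the Salary column is encrypted with randomized encryption using an enclave-enabled column encryption key and that the connection has column encryption and enclave attestation configured; the table name is borrowed from the Ledger exercise purely for illustration:
-- GROUP BY on an encrypted column can now be processed inside the secure enclave (illustrative)
SELECT Salary, COUNT(*) AS EmployeeCount
FROM dbo.Employees
GROUP BY Salary;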

Crypto Enhancements

We want to make sure SQL Server is up to date with the latest standards for cryptography or crypto. SQL Server uses crypto for many different types of capabilities in the engine including certificate and key support. According to Shoham Dasgupta, Senior Program Manager for SQL security, “We wanted to enhance default cryptography in SQL Server to meet/exceed industry standards and protect customer’s confidential data to meet the evolving threat landscape.”

One of the enhancements we made internally, which is not directly visible to customers, is a strengthening of our algorithms to align with current National Institute of Standards and Technology (NIST) recommendations, including the following:
  • All system-generated certificates have a minimum strength of RSA-3072, a key size recommended by NIST.

  • We enhanced the internal hashing algorithm used for signature generation to use SHA-2 512, a more secure method than SHA-1.

Another investment to strengthen security to today’s standards is support for the Personal Information Exchange (PFX) format. SQL Server has long supported creating a certificate inside the database engine to secure objects, connections, and data. PFX (also known as PKCS #12) is a modern format for certificates, and SQL Server 2022 now supports using a PFX file to create a certificate with the CREATE CERTIFICATE statement. You can read more about how to create a certificate with a PFX file at https://docs.microsoft.com/sql/t-sql/statements/create-certificate-transact-sql.
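As a quick sketch (the certificate name, file path, and password below are placeholders, not values from the documentation), creating a certificate from a PFX file looks something like this:
-- Create a certificate from a PFX file (illustrative names and paths)
CREATE CERTIFICATE AppCert
FROM FILE = 'C:\certs\appcert.pfx'
WITH FORMAT = 'PFX',
PRIVATE KEY (DECRYPTION BY PASSWORD = '<pfx password>');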

The final crypto enhancement is support for backing up and restoring keys to and from Azure Blob Storage. A database master key is often used to secure a credential, such as one used for external data sources, but you often need to back up this key separately from the database. SQL Server 2022 now allows you to back up or restore these keys to or from Azure Blob Storage instead of a local or network file path. You can read how to use the new URL syntax at https://docs.microsoft.com/sql/t-sql/statements/backup-master-key-transact-sql. The same support exists for symmetric keys, which you can read about at https://docs.microsoft.com/sql/t-sql/statements/backup-symmetric-key-transact-sql.
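A minimal sketch of the new syntax follows, assuming a credential for the storage container (for example, a SAS-based credential) already exists; the storage account, container, and password are placeholders:
-- Back up the database master key directly to Azure Blob Storage (illustrative URL)
BACKUP MASTER KEY TO URL = 'https://<storageaccount>.blob.core.windows.net/<container>/masterkey.bak'
    ENCRYPTION BY PASSWORD = '<strong password>';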

Strict Connection Encryption

The Tabular Data Stream (TDS) protocol is the data protocol applications use to connect to and transfer data to and from SQL Server. TDS has been around for as long as SQL Server has been a product. As time has gone by, TDS has been enhanced and changed using a version numbering system. Typically, as new versions of TDS were released, applications would need to use a provider or driver that supports the new version. Microsoft changed the major version of TDS to 7.0 when SQL Server 7.0 was released in 1998. Since then, minor version changes have come with the latest, TDS 7.4, supporting SQL Server 2012 and later. Users, admins, and developers don’t need to know the details of the TDS protocol, just the features a new version supports and whether they need updated drivers. As a support engineer for many years at Microsoft, I often needed the details of TDS to solve complex customer problems. You can read all the details yourself at https://docs.microsoft.com/openspecs/windows_protocols/ms-tds/b46a581a-39de-4745-b076-ec4dbb7d13ec.

SQL Server 2022 introduces a new major version of TDS, TDS 8.0. Drivers and providers using TDS 7.4 are still fully supported. New drivers understand TDS 8.0 and can use this protocol. The main reason behind this new version is connection encryption. SQL Server has supported encryption of the communication between an application and SQL Server for years. Like many applications, SQL Server uses Transport Layer Security (TLS) to implement communication encryption. SQL Server has always allowed applications to decide whether to encrypt communications with SQL Server using connection string options called Encrypt and TrustServerCertificate. You also have the ability on the server side to force encryption for client connections. The combination of these two connection string options is significant, and you can read more details at https://docs.microsoft.com/dotnet/framework/data/adonet/connection-string-syntax#using-trustservercertificate.

TDS 8.0 introduces the concept of strict connection encryption. If an application uses a value of Encrypt = strict, then the TDS 8.0 protocol is enabled. One of the significant changes here is that if strict connection encryption is used, the communication between the client and SQL Server must be encrypted (hence the term strict). The other significant change is how TDS 8.0 handles connection encryption. Prior to TDS 8.0, SQL Server would perform a handshake or prelogin with the client that was not encrypted. Now with TDS 8.0, that handshake is completely encrypted with TLS. This works with all supported TLS versions. This is very significant because it aligns with the standards of protocols like HTTPS and allows network appliances to safely pass through SQL communication. The latest providers are required in order to use TDS 8.0. You can learn more at https://docs.microsoft.com/sql/relational-databases/security/networking/tds-8-and-tls-1-3#strict-connection-encryption.
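As a sketch, a connection string that requests strict encryption might look like the following, assuming a driver that understands TDS 8.0 (for example, Microsoft.Data.SqlClient 5.0 or later); the server name and credentials are placeholders:
Server=tcp:myserver,1433;Database=ContosoHR;User Id=app;Password=<password>;Encrypt=Strict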

In addition, and independent of TDS 8.0, SQL Server 2022 now supports TLS 1.3, the most secure TLS version. You can read more about the differences between TLS 1.2 and 1.3 at https://docs.microsoft.com/sql/relational-databases/security/networking/tds-8-and-tls-1-3?#differences-between-tls-12-and-tls-13.

Note

We are also looking into ways to enforce this option at the server level when the product becomes generally available.

Security Permission Enhancements

In SQL Server 2022 we have also introduced some enhancements for authorization or permission scenarios. These include new fixed server-level roles and new granular permissions for dynamic data masking.

New Fixed Server-Level Roles

Up until SQL Server 2022, the product came with a certain set of fixed server-level roles. They are fixed because you cannot change the access level or definition of the role. They are server-level because they apply to the entire SQL Server instance, not just a single database.

You are probably familiar with some of these like sysadmin, securityadmin, dbcreator, and public. You can view the complete list of these roles at https://docs.microsoft.com/sql/relational-databases/security/authentication-access/server-level-roles.

One request we have seen from customers for some time is for more granularity of roles we provide with the product – in other words, more fixed roles that have fewer privileges for specific tasks. My colleague at Microsoft, Andreas Wolter, wrote an interesting blog on the topic of least privilege at https://techcommunity.microsoft.com/t5/azure-sql-blog/security-the-principle-of-least-privilege-polp/ba-p/2067390.

SQL Server 2022 now provides built-in new fixed server-level roles that follow this principle. You can see the complete list at https://docs.microsoft.com/sql/relational-databases/security/authentication-access/server-level-roles?#fixed-server-level-roles-introduced-in-sql-server-2022.

I picked out a few I found interesting:
  • ##MS_DefinitionReader##

  • A member of this role is granted permissions to view catalog views that require the VIEW ANY DEFINITION permission, or the VIEW DEFINITION permission within a database (assuming they have a user account in the database). It is a great way to give someone the right to “view” important information about a server or database without allowing them to make changes.

  • ##MS_ServerStateReader##

  • A member of this role is granted permissions to view Dynamic Management Views (DMVs) or functions that require the VIEW SERVER STATE permission, or the VIEW DATABASE STATE permission within a database (assuming they have a user account in the database).

  • ##MS_ServerPerformanceStateReader##

  • This is similar to the previous fixed role but is focused on viewing specific performance information from Dynamic Management Views (DMVs). Members of this fixed role are granted access to objects that require the VIEW SERVER PERFORMANCE STATE permission.

Imagine a situation where you want to hire a consultant to come in, view important information about the system and databases, and provide you with an analysis. You don’t want to give them access to any user data for privacy reasons, nor do you want to allow them to make changes. Adding them as members of roles such as ##MS_DefinitionReader## and ##MS_ServerStateReader## could help you achieve this, whereas before you would have to use a fixed server-level role that had more privileges than necessary or set up custom permissions for their account on the instance and in every database for which you need their consultation.
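As a minimal sketch, adding a consultant to these roles is a couple of T-SQL statements; the login name consultant is purely illustrative and assumes the login already exists:
-- Grant read-only, least-privilege access to metadata and server state (illustrative login name)
ALTER SERVER ROLE [##MS_DefinitionReader##] ADD MEMBER [consultant];
ALTER SERVER ROLE [##MS_ServerStateReader##] ADD MEMBER [consultant];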

You will also love that SSMS 19 has been enhanced to support these new fixed server-level roles as you can see in Figure 6-4.

Object Explorer shows the Server Roles folder, including the new fixed server-level roles.

Figure 6-4

SQL Server 2022 fixed server-level roles

I did a quick test to see the difference between membership in the ##MS_ServerPerformanceStateReader## role and just the public role. If you have only the public fixed role, you can see just your own session when you query sys.dm_exec_requests. But if you are a member of the ##MS_ServerPerformanceStateReader## role, you can see all sessions from sys.dm_exec_requests. An example of a DMV that requires ##MS_ServerStateReader## or ##MS_ServerSecurityStateReader## membership is sys.dm_server_audit_status. Anyone in only the ##MS_ServerPerformanceStateReader## role would not have access to this DMV. Effectively, we have separated ##MS_ServerStateReader## into two more granular roles: ##MS_ServerPerformanceStateReader## and ##MS_ServerSecurityStateReader##.

Dynamic Data Masking Enhancements

Dynamic data masking is the concept of providing logic within a table schema to mask data when viewed by a user (think of masking an email address with characters like xxxx@xxxx). We introduced this concept in SQL Server 2016, and it was quite popular. Prior to this feature, applications would have to put masking logic in their code. If the masking logic needed to be changed, the application would have to be changed and deployed. Dynamic data masking provides a way to define mask rules when defining columns in a table. If you are new to dynamic data masking, start with our documentation at https://docs.microsoft.com/sql/relational-databases/security/dynamic-data-masking.

One nice aspect of dynamic data masking is that we provide built-in functions for common mask scenarios. So a column definition like this
Email   varchar(100) MASKED WITH (FUNCTION = 'email()') NOT NULL

would result in any value stored in this column being seen by users in a masked form such as aXXX@XXXX.com.

You can see a complete list of functions for dynamic data masking at https://docs.microsoft.com/sql/relational-databases/security/dynamic-data-masking?#defining-a-dynamic-data-mask.

In SQL Server 2022 we added a new function for datetime column types. You can mask all or part of the datetime values such as year, month, day, hour, minute, or second.

The original implementation of dynamic data masking by default did not allow “non-admin” users to see unmasked data. However, we provided a permission called UNMASK that could be granted so a user could see data unmasked. The problem with this original design is that the UNMASK permission applied to every masked column in the database.

In 2021, we introduced the concept of granular UNMASK to Azure so that this permission could be applied at the schema, table, or column level. You can see the original announcement at https://azure.microsoft.com/updates/general-availability-dynamic-data-masking-granular-permissions-for-azure-sql-and-azure-synapse-analytics/.

SQL Server 2022 simply brings this implementation from the cloud to SQL Server. You can see examples for granular unmasking at https://docs.microsoft.com/sql/relational-databases/security/dynamic-data-masking?#granular.
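As a quick sketch of what granular unmasking looks like, the statements below grant UNMASK at the schema, table, and column level; the user name analyst and the masked table and column are purely illustrative:
-- Grant unmasking at increasingly narrow scopes (illustrative names)
GRANT UNMASK ON SCHEMA::dbo TO analyst;
GRANT UNMASK ON dbo.Customers TO analyst;
GRANT UNMASK ON dbo.Customers(Email) TO analyst;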

Performance and Scalability

I always love to see the innovation and enhancements in any SQL Server release for performance and scalability. One of the reasons is that almost everything in this area comes from tuning exercises, real customer stories, or research we do for benchmarking. SQL Server 2022 comes with a variety of investments in performance and scalability across the entire spectrum of workload and engine components.

Columnstore and Batch Mode Improvements

SQL Server 2022 comes with enhancements to boost performance for columnstore indexes and batch-mode operations (which do not require columnstore in SQL Server 2019 and later).

Ordered Clustered Columnstore Index

Both Azure Synapse Analytics and SQL Server support columnstore indexes. An entire book could be devoted to columnstore indexes and why they can be beneficial to applications. I still find this an underused feature of SQL Server. Almost any customer I encounter who doesn’t know columnstore immediately sees how they can use them once they understand how they work. If you don’t know columnstore, go right to the documentation to get started at https://docs.microsoft.com/sql/relational-databases/indexes/columnstore-indexes-overview.

Azure Synapse Analytics added an option to create a clustered columnstore index as ordered. While building an ordered clustered columnstore index can take longer, the performance benefits to certain queries can be significant. SQL Server 2022 now supports ordered clustered columnstore indexes. Learn more at https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/performance-tuning-ordered-cci.
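Here is a minimal sketch of the syntax; the table and the ordering column are illustrative:
-- Build a clustered columnstore index ordered on a commonly filtered column (illustrative)
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales
ON dbo.FactSales
ORDER (OrderDateKey);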

Columnstore String Improvements

We don’t have specific documentation on this one, but when I chatted with Ryan Stonecipher, a veteran SQL engineer, about this, his quote to me was “Simply rebuild your indexes and we will maintain min/max values for deep data (e.g., *char and *binary + guid). Furthermore, we now support LIKE pushdown operations on column stores and a fast string-equal implementation.”

Vector Extension to Improve Batch Mode

In SQL Server 2016 we added code to take advantage of special processor instructions called vector-based hardware capabilities. Vector registers in the processor allow programs to process data faster and in a more parallel manner. If you would love to read some very deep content on this topic, here is a nice article: www.techspot.com/article/2166-mmx-sse-avx-explained.

SQL Server 2016 could recognize two different types of vector capabilities, namely, SSE and AVX, and use them to speed up columnstore operations, especially when using batch mode. Robert Dorr outlines how we did this in this blog post: https://docs.microsoft.com/archive/blogs/bobsql/sql-2016-it-just-runs-faster-column-store-uses-vector-instructions-sseavx.

The latest vector hardware technology is AVX-512, and SQL Server 2022 can recognize this new hardware capability to gain even more performance, specifically for batch-mode operators over columnstore and rowstore. I was curious what led us to look into AVX-512 and found this interesting quote from Conor Cunningham:

During the pandemic, I found an academic paper that explained how to use AVX-512 instructions to do a fast quicksort. These instructions essentially do N operations at a time instead of one, meaning that it can potentially speed up performance-critical code substantially. I implemented the sort from the paper and started making it better than the paper. Eventually I wanted to start playing with implementing something in SQL’s code. There is logic that we have that uses the prior generation of vector instructions (AVX2) and I looked at that code and figured out how to write an equivalent version that used the newer, wider vectors as a side project. When I had it finished, we got good performance gains on our key analytics benchmark (TPC-H). The fun part is that I did not tell anyone I was doing this until one day I showed up and just gave this present to the team – it wasn’t a formally funded effort that was intended for SQL Server 2022, but we prioritized it once we realized what value it delivered for customers.

At the time of the writing of this book, we were keeping this enhancement under a trace flag, and it is possible we will launch the product with this still requiring a trace flag – mostly because the use of this capability is an advanced concept and not for everyone. Check out the documentation at https://aka.ms/sqlserver2022docs for the latest information.

One aspect of this work is a new ERRORLOG entry you may see that looks like this:
CPU vectorization level(s) detected:  SSE SSE2 SSE3 SSSE3 SSE41 SSE42 AVX AVX2 POPCNT BMI1 BMI2

These monikers are various vector CPU capabilities we have detected in the processor. AVX-512 would appear if you are using a chipset that supports this new vector model.

Columnstore Segment Elimination Enhancements

Columnstore index columns are organized into segments. Columnstore includes a concept called segment elimination so that segments can be eliminated, or not read, depending on the query. Prior to SQL Server 2022, only columns with numeric, date, and time data types, plus datetimeoffset with a scale of two or less, were candidates for segment elimination. Now in SQL Server 2022, we have extended segment elimination to new data types including string, binary, and guid data types, as well as the datetimeoffset data type with a scale greater than two.
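To illustrate the kind of query that can now benefit, here is a small sketch; the table, index, and predicate value are illustrative, and per Ryan’s earlier comment you may need to rebuild the index so the min/max metadata for string columns is maintained:
-- Rebuild so segment metadata (min/max) is captured for string columns (illustrative names)
ALTER INDEX cci_FactSales ON dbo.FactSales REBUILD;
GO
-- An equality predicate on a string column can now skip segments whose min/max range
-- cannot contain the value
SELECT COUNT(*)
FROM dbo.FactSales
WHERE ProductCode = 'BK-M68B-42';
GO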

Scalability Improvements

There are several improvements to help the scalability of your queries, workloads, and overall use of SQL Server. The last thing you want is to invest in a revised application design, new servers, or infrastructure only to have SQL Server prevent you from scaling. All the work in this area is a direct result of customer cases or benchmarking and testing we do internally or with our partners.

Buffer Pool Parallel Scan

If you look at my site for all my talks (I’m an open source presenter), https://aka.ms/bobwardms, you will find a folder called SQL PASS 2013. At the PASS Community Summit in 2013, I gave a half-day talk on memory. If you look at this deck, you will see that I have some internals on the buffer pool of SQL Server. As part of these slides, I talk about a BUF hash table and BUF structure. BUF structures point to pages, and a BUF hash table is a way to organize all the BUF structures to easily find a page. Therefore, finding a specific page is not all that hard using a lookup of the BUF hash table.

Unfortunately, some operations in SQL Server must scan all the BUF structures. Think of this like scanning a clustered index vs. using an index seek. And like a clustered index scan, if you only need a few rows, it still may take a very long time to get the results if the clustered index is large.

The same problem can occur for certain operations with the buffer pool, and we would see this type of scenario with customers. They would have a server with, say, 1TB of RAM and allow the buffer pool to consume most of this memory. Then they would go perform some maintenance operation, and even against a very small database, it would take an unusually long time. This is because we would scan the entire ~1TB of BUF structures to finish the operation. We even added ERRORLOG messages to warn the user this condition was happening. See the article https://docs.microsoft.com/troubleshoot/sql/performance/buffer-pool-scan-runs-slowly-large-memory-machines for more information. This is a classic problem of an architecture that for years didn’t cause problems, but as hardware technology became vastly bigger, this design needed to be updated.

We took a look at this problem and said something to the effect, “If queries can run in parallel to obtain a faster result, why can’t we do this when scanning the buffer pool?” (This is not an exact quote; it’s my paraphrasing when I read the design documents for this.)

The result of this work is buffer pool parallel scan, which is enabled by default in SQL Server 2022 for systems with large amounts of memory. My colleague David Pless at Microsoft worked on this feature and explains how it works at https://cloudblogs.microsoft.com/sqlserver/2022/07/07/improve-scalability-with-buffer-pool-parallel-scan-in-sql-server-2022.

David shows some great numbers we’ve seen for improvements in SQL Server 2022. In addition, check out his Data Exposed episode where he shows even more details with a nifty demonstration at https://youtu.be/4GvU106Xiag. I have included an example of the results David found in a T-SQL notebook you can view with Azure Data Studio in the example ch06_meatandpotatoesscalabilityBuffer_pool_parallel_scan_BPP_notebook_quick.ipynb.

Purvi’s List

I have encountered so many talented, kind, and wonderful people at Microsoft over the years who work behind the scenes with no fanfare. They just do their job, do it well, and are incredibly nice and friendly. One of those people is Purvi Shah. Purvi is a senior software engineer in the SQL team, but her role is larger than that title. Purvi works in our “performance” team and literally spends her time finding ways to make SQL Server and Azure SQL faster – but faster on a level that most of us don’t understand. The work she and the team do can be felt in so many ways you may never see, because it often consists of enhancements deep in the engine that are not a “feature.”

Note

One notable example she worked on for SQL Server 2019 was PFS page concurrency for tempdb, which was huge!

For this release I reached out to Purvi and asked her to share her list of scalability enhancements that I want our readers to know about. Here is her list:
  • Reduced buffer pool I/O promotions – There were cases where we would cause system issues by taking a single page read and promoting it to an eight-page I/O. Turns out this wasn’t really helping performance and could overload an I/O system, so we tuned this algorithm. This is simply tuning our engine to be smarter when performing I/O operations.

  • Enhanced some of our core spinlock algorithms – Spinlocks are a huge part of the consistency inside the engine for multiple threads. We are always looking to tune these algorithms to ensure the engine is running at its best in all types of workloads. We made some internal adjustments to make spinlocks more efficient, and the improvements are so deep into the engine it is hard to quantify its effect (Purvi and the team do this at a very deep level to ensure their work actually makes everything better and not worse).

  • Improved Virtual Log File (VLF) algorithms – Virtual Log File (VLF) is an abstraction of the physical transaction log. Having a high number of small VLFs based on log growth can affect the performance of operations like recovery. We changed the algorithm for how many VLFs we create during certain log grow scenarios to reduce these types of problems. You can read about how we have changed this algorithm in our docs at https://docs.microsoft.com/sql/relational-databases/sql-server-transaction-log-architecture-and-management-guide?#virtual-log-files-vlfs.

  • Instant File Initialization (IFI) for log file growth – Up until SQL Server 2022, we could not use Instant File Initialization (IFI) on the transaction log because we didn’t think we could properly detect the “end of the log.” With SQL Server 2022 we have determined that when we have to grow the transaction log, we can use IFI if the growth size is considered “medium,” which is up to 64MB in size. This one could be a big performance win for scenarios where zeroing the log file during grow operations would cause blocking problems. It turns out 64MB is exactly the default autogrowth size for SQL Server data and log files when creating a new database. (A quick way to check your current VLF counts and log growth settings follows this list.)
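
If you want a quick sense of whether these log improvements matter to your databases, you can inspect your current VLF counts and log growth settings. Here is a minimal sketch using the documented sys.dm_db_log_info function and the sys.master_files catalog view; nothing in it is specific to SQL Server 2022, and you may want to adjust the filters for your environment:

-- How many VLFs does each database have today?
SELECT d.name,
       (SELECT COUNT(*) FROM sys.dm_db_log_info(d.database_id)) AS vlf_count
FROM sys.databases AS d;
GO
-- What is the current log growth setting? (growth is in 8KB pages when not a percentage)
SELECT DB_NAME(database_id) AS database_name, name AS logical_name,
       growth * 8 / 1024 AS growth_mb
FROM sys.master_files
WHERE type_desc = 'LOG' AND is_percent_growth = 0;
GO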

Purvi and her team do such great work to keep the SQL Server engine tuned and fast. I asked Purvi more about why she likes her job at Microsoft, and she said, “Keeping both SQL Server and Azure SQL running at its peak performance is part of my job. I strive to do it to best of my abilities. Of course, I love my job! Hope I am able to make a positive contribution to continue our mission.”

“Hands-Free” tempdb

As great a product as SQL Server is, after working with SQL Server for 29 years, I’ve seen some issues that to this day just keep me pulling my hair out. One of these issues is latch contention with tempdb.

What Is the Challenge?

All database pages in SQL Server need physical protection using latches. Some database pages are called system pages because they have allocation information; these are known as PFS, GAM, and SGAM pages. Any query or operation that requires allocation or deallocation may need to read or write from or to these pages. Reads and writes from and to these pages require a latch protection scheme. Latch protection is needed because two threads could be trying to modify or read a page at the same time (that is a very basic description of how latches work).

These pages are at fixed locations within database files. In addition, every table has pages to store data and indexes. Any queries or operations that affect these pages will also require latch protection. Because system tables have pages like any other table, specific DDL statements in SQL will eventually lead to the latch protection and contention for pages for system tables.

Now consider tempdb. Workloads for SQL Server that involve tempdb typically require a great deal of allocation and system table page operations. Temporary tables, table variables, and spills are just a few that invoke these operations. All of these typically require fast and frequent page allocations and system table modifications. Therefore, when many users run queries concurrently that result in these operations within tempdb, there is naturally contention or pressure on latches for pages for allocation and system tables. The main symptom of this problem is seeing your request waiting on a PAGELATCH_XX wait type.
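
Before the recent improvements, a quick way to confirm you were hitting this problem was to look for active requests waiting on page latches in tempdb. Here is a minimal sketch using only documented DMVs (the pageinfo.sql script used later in this chapter goes further and cracks the page to find the object it belongs to); the wait_resource format is db_id:file_id:page_id, and tempdb is always database_id 2:

-- Show requests currently waiting on tempdb page latches
SELECT session_id, wait_type, wait_time_ms, wait_resource
FROM sys.dm_exec_requests
WHERE wait_type LIKE 'PAGELATCH%'
  AND wait_resource LIKE '2:%';
GO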

Initially, when latches were first introduced in SQL Server 7.0, I would see latch contention for tempdb on PFS, GAM, and SGAM pages because back then the default behavior was a single file for database pages. Typically, what would happen is that an application would use temporary tables across multiple concurrent users, which would result in frequent CREATE and DROP operations, which required allocation and deallocation operations. This naturally put pressure on PFS, GAM, and SGAM pages in a single file.

Solutions Over the Years

When we first started to really see this problem in SQL Server 7.0 and even into SQL Server 2000, many of us in CSS would recommend that customers create multiple files for tempdb data. Why? Because we were naturally partitioning latch concurrency for these allocation pages. This is because SQL Server will allocate pages in a round-robin fashion and leverage a proportional fill algorithm across multiple files for data pages. This solution really helped many of these latch contention problems. But then a new problem surfaced. How many files do you need to create to solve the problem? Unfortunately, many people went overboard and suggested that you would need a number of files to match the number of CPUs for SQL Server. I and others in the community, like Paul Randal, thought differently. For example, we knew that for a 64-CPU machine you would probably not need 64 files. Common sense said that it seemed like it could cause more harm than good. I eventually did some detailed research on this topic, which led me to recommend starting with eight files for eight or more CPUs and then adding files four at a time until you find the right performance balance for your workload. You can see more details about this approach in a presentation I did way back at the PASS Summit in 2011 at www.youtube.com/watch?v=SvseGMobe2w.

Another thing we did is introduce trace flag 1118 so that all allocations for tables used uniform extents to ease pressure on the SGAM pages of files. We also discovered that using multiple files to ease system page latch contention only worked if all the files were the exact same size. As a result, we introduced trace flag 1117 so that any autogrow activities would grow all files to keep them the same size.

Any true bottleneck requires peeling back the layers like an onion. Even with this approach of multiple files, a system could have so many users that PFS, GAM, and SGAM latch contention could still happen, but it paled in comparison to the new main problem: system table page latch contention. We struggled at first to recognize this. We were used to running scripts that would recognize specific pages for latch contention: PFS, GAM, and SGAM.

Once we saw the hot spot was system table pages, we honestly had no recourse but to tell customers to reduce their tempdb workload contention. We did create a solution to help, called tempdb caching. We discovered that we could save metadata about temporary tables, including some pages, for patterns that were repeated. You can see more about this concept at https://docs.microsoft.com/sql/relational-databases/databases/tempdb-database?#performance-improvements-in-tempdb-for-sql-server. Even so, tempdb caching did not eliminate the problem, and not all tempdb scenarios can use caching.

In SQL Server 2016 we helped customers ensure multiple files were created for tempdb by including a default configuration within setup to create the number of files to match the number of CPUs detected up to eight. We also removed the need for trace flags 1117 and 1118 by turning these into ALTER DATABASE options and making them the default for tempdb.
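
If you are curious whether your tempdb already has these behaviors, you can inspect them with catalog views. This is a minimal sketch using documented columns; on SQL Server 2016 and later you should see mixed page allocation off and autogrow-all-files on for tempdb by default:

USE tempdb;
GO
-- 0 = uniform extents only (the old trace flag 1118 behavior)
SELECT name, is_mixed_page_allocation_on
FROM sys.databases
WHERE name = 'tempdb';
GO
-- 1 = all files in the filegroup grow together (the old trace flag 1117 behavior)
SELECT name, is_autogrow_all_files
FROM sys.filegroups;
GO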

Workloads continued to put pressure on tempdb, and we still saw issues. We introduced two big changes in SQL Server 2019 to help the latch problem:
  • PFS pages would no longer have latching problems because we found ways to update PFS pages even under a shared latch.

  • We introduced tempdb metadata optimization as a new server option. When this option is enabled (a server restart is required), we convert critical system tables in tempdb to memory-optimized tables. Memory-optimized tables are “latch-free,” so any reads or writes from or to these tables do not experience latch contention.

The improvements were huge. With these two enhancements, I saw many workloads no longer experience significant issues due to latch contention. But the core problem could occasionally still be seen with latch contention on SGAM and GAM pages. When I first saw SGAM contention continue to occur, I was puzzled because we had changed tempdb to always use uniform extents. However, there is one page that still has to be allocated for temporary tables or indexes that does not come from a uniform extent, and that is an IAM page, which tracks allocations for an object.

Why Now “Hands-Free”?

When I saw the initial set of engine improvements for project Dallas, I was very happy to learn that GAM and SGAM concurrency was enhanced in the same manner that PFS pages had been in SQL Server 2019.

Now tempdb could truly be “hands-free” with regard to latch issues.

Consider this:
  • PFS, GAM, and SGAM latch contention is completely gone.

  • If you execute this statement ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED TEMPDB_METADATA = ON and restart SQL Server, critical system tables are memory optimized and won’t encounter latch contention.

Note

tempdb latching issues were never about slow I/O. However, you should always make sure to eliminate any I/O issues for tempdb by placing tempdb data and log files on a separate drive from your user data and preferably on an SSD. If you have I/O issues for tempdb, they are typically seen in the form of PAGELATCH_IO, IO_COMPLETION, and WRITELOG (tempdb log) wait types.

Try It Yourself

If you want to see how we have hopefully forever removed tempdb latch contention for users, go through the following exercise:

Prerequisites
  • SQL Server 2022 Evaluation Edition.

  • VM or computer with four CPUs and at least 8GB of RAM.

  • SQL Server Management Studio (SSMS). The latest 18.x or 19.x build will work.

  • Download ostress.exe from https://aka.ms/ostress. Install using the RMLSetup.msi file that is downloaded. Use all defaults.

  • Use the scripts from the book samples as found in the ch06_meatandpotatoes\scalability\tempdb directory.

Note: This exercise will require you to restart the instance multiple times.

Set Up the Exercise

Follow these steps to set up the exercise:
  1. 1.

    Configure perfmon to track SQL Server:SQL Statistics/Batch Requests/sec (set Scale to 0.1) and SQL Server:Wait Statistics/Page latch waits/Waits started per second.

     
  2. 2.
    Execute the script findtempdbfiles.sql and save the output. A script is provided at the end of this demo to restore your tempdb file settings. This script executes the following T-SQL statements:
    USE master;
    GO
    SELECT name, physical_name, size*8192/1024 as size_kb, growth*8192/1024 as growth_kb
    FROM sys.master_files
    WHERE database_id = 2;
    GO
     
  3. 3.
    Start SQL Server in minimal mode using the command script startsqlminimal.cmd. Minimal mode (/f startup parameter) allows you to remove tempdb files and only have one data file. This script executes the following commands:
    net stop mssqlserver
    net start mssqlserver /f
     
  4. 4.

    Execute the command script modifytempdbfiles.cmd. This will execute the SQL script modifytempdbfiles.sql to expand the log to 200MB (to avoid any autogrow) and remove all tempdb files other than the first one (tempdev). If you have more than four tempdb files, you need to edit this script to remove all of them except for tempdev.

    modifytempdbfiles.cmd executes the following command:

     
sqlcmd -E -imodifytempdbfiles.sql
modifytempdbfiles.sql executes the following T-SQL statements:
USE master;
GO
ALTER DATABASE tempdb MODIFY FILE (NAME=templog, SIZE = 200MB, FILEGROWTH = 65536KB);
GO
ALTER DATABASE tempdb REMOVE FILE temp2;
GO
ALTER DATABASE tempdb REMOVE FILE temp3;
GO
ALTER DATABASE tempdb REMOVE FILE temp4;
GO

Steps for the Exercise

You are now ready to run the exercise, which comprises three different tests. In each test you will run the same workload and observe perfmon counters and overall workload duration.

Test 1: Disable tempdb metadata optimization and disable GAM/SGAM enhancements.

The first test shows the performance of a workload where no optimizations are enabled. But because we are using SQL Server 2022, PFS concurrency enhancements are active even with one file, since they were built into the code in SQL Server 2019. This is why you won’t see any latch contention on PFS pages.
  1. 1.
    Run the script disableoptimizetempdb.cmd. This script runs the following commands:
    sqlcmd -E -idisableopttempdb.sql
    net stop mssqlserver
    net start mssqlserver
    disableopttempdb.sql executes the following T-SQL statements:
    ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED TEMPDB_METADATA = OFF;
     
GO
Even though tempdb metadata optimization is not on by default, we will disable it to confirm.
  2. 2.
    Run the script disablegamsgam.cmd. This script runs the following commands:
    net stop mssqlserver
    net start mssqlserver /T6950 /T6962

    These are undocumented trace flags that disable new SQL Server 2022 enhancements. These are not supported and only used for the purposes of this exercise.

    Now the server is running with no optimizations except for PFS concurrency carried over from SQL Server 2019.

     
  3. 3.
    Load the script pageinfo.sql into SSMS. When the workload starts, you will run this to see what the latch contention looks like. This script executes the following T-SQL statements:
    USE tempdb;
    GO
    SELECT object_name(page_info.object_id), page_info.*
    FROM sys.dm_exec_requests AS d
      CROSS APPLY sys.fn_PageResCracker(d.page_resource) AS r
      CROSS APPLY sys.dm_db_page_info(r.db_id, r.file_id, r.page_id,'DETAILED')
        AS page_info;
    GO

    This script uses T-SQL built-in functions introduced in SQL Server 2019 to “crack a page” and look at its contents. In addition, it uses a function to get metadata about a page such as the object_id it belongs to and page type.

     
  4. 4.

    Execute the script tempsql22stress.cmd from the command prompt with the following syntax (you only need the .\ prefix in PowerShell):

     
.\tempsql22stress.cmd 25

Move forward to the next steps but come back after the command is over to look more at the syntax.

This script executes the following command with ostress.exe:
"c:Program FilesMicrosoft CorporationRMLUtilsostress" -E -Q"declare @t table (c1 varchar(100)); insert into @t values ('x');" -n%1 -r1000 -q

Let’s look at the arguments:

-Q is used to run a query. Notice this is a batch to use a table variable. This particular syntax won’t allow for temporary table caching, so we are putting a lot of pressure on allocation and system tables.

-n is number of users. We’ve supplied a value of 25 concurrent users.

-r is the number of iterations of the query each user will run.

-q suppresses all result set processing to make it faster as we don’t care about looking at the results.
  5. 5.

    Execute the pageinfo.sql script and observe the results. Notice the latch contention is for pages that belong to system tables in tempdb. There are hardly any latch waits on GAM pages because system table page latch contention is the “hot spot.”

     
  6. 6.

    Observe perfmon counters. Record the averages and max values to compare with the second and third tests.

     
  7. 7.

    Observe the duration of the workload tempsql22stress.cmd as displayed by elapsed time. Record this value to compare with the second and third tests.

     

Test 2: Enable tempdb metadata optimization and disable GAM/SGAM enhancements.

We will run the exact same test, except this time we will enable tempdb metadata optimization and disable the new SQL Server 2022 enhancements.
  1. 1.

    Run the script optimizetempdb.cmd. This script will execute optimizetempdb.sql, which turns ON tempdb metadata optimization.

     
  2. 2.

    Run the script disablegamsgam.cmd again to use trace flags to disable new GAM/SGAM optimizations.

     
  3. 3.

    Run tempsql22stress.cmd 25 as you did in Test 1.

     
  4. 4.

    Execute pageinfo.sql as you did in Test 1. Now you should see that the majority of latch contention problems are on GAM pages.

     
  5. 5.

    Observe perfmon counters and elapsed time. Record these numbers to compare them with Test 1.

     

You should see fewer latch waits started per second, but a lower number of batch requests/sec and a longer duration. This is because the latch waits are held longer, causing a drop of throughput from Test 1. If we had multiple files, we could greatly reduce the GAM contention, and we would get better performance (and feel free to test this yourself). But with SQL Server 2022, there is another way.

Test 3: Enable tempdb metadata optimization and enable GAM/SGAM enhancements (default for SQL Server 2022).

Since tempdb metadata optimization is already enabled, all we need to do is restart SQL Server (with no trace flags) to see the built-in enhancements for GAM and SGAM contention.
  1. 1.
    Execute the script restartsql.cmd. This script runs the following commands:
    net stop mssqlserver
    net start mssqlserver
     
  2. 2.

    Run tempsql22stress.cmd 25 again.

     
  3. 3.

    Execute the pageinfo.sql script. You should not see any latches.

     
  4. 4.

    Observe perfmon counters and elapsed time. Record these values to compare them with Tests 1 and 2.

    Note Did you notice one page latch wait started per second on this final test? Why would that occur if there was no latch contention? It is because there are a few system pages, namely, the BOOT and FILE_HEADER pages, that may still have a small amount of latch contention. However, there should be no performance impact, and that is why I said 99.99% latch-free!

     
Who won the test for you? If all worked as planned, the results – in order of who was faster – should be
  1. 1.

    Test 3 – Enable tempdb metadata optimization and enable GAM/SGAM concurrency (on by default).

     
  2. 2.

    Test 1 – Disable tempdb metadata optimization and disable GAM/SGAM enhancements.

     
  3. 3.

    Test 2 – Enable tempdb metadata optimization and disable GAM/SGAM.

     

Test 3 should win because we have all optimizations turned on.

Test 1 was second because even though system table page latch contention was heavy, latch wait times were shorter for each latch wait.

Test 2 was the worst because latch wait times for GAM pages were very long.

Note

Use the restoretempdbfiles.cmd and restoretempdbfiles.sql to restore your tempdb configuration.

You may be wondering what the results would look like for Test 3 if you had multiple tempdb files. Just about the same as with one file, which I know may seem amazing after all these years of using multiple files. My advice is to stick with the recommended number of multiple files; just let SQL Server setup configure it for you. It is another reason I’m advocating “hands-free” installation: Run SQL Server setup with the defaults and turn on tempdb metadata optimization. Then just run your workload. Time will tell whether my bold statement will become true for you and the community.

More Concurrency Improvements

Concurrency problems in SQL Server hurt performance and scalability. Ensuring that workloads can run concurrently but not experience blocking or waiting problems is essential to maximize your investments for application and infrastructure for scale.

SQL Server 2022 solves more real-world scenarios including concurrency issues for shrink database and automatic statistics update.

Shrink Database Concurrency

In SQL Server 2014 we added an option called WAIT_AT_LOW_PRIORITY for online index builds to take locks at a lower priority. Even an online index build (see the documentation at https://docs.microsoft.com/sql/relational-databases/indexes/how-online-index-operations-work?#source-structure-activities) has to obtain locks and may have to wait. During this wait period, other operations that could normally continue might be blocked by the online index build. WAIT_AT_LOW_PRIORITY allows users that would normally be blocked to continue.

It turns out that shrink database and shrink file operations have these same issues. Therefore, in SQL Server 2022 we added a new WAIT_AT_LOW_PRIORITY option for DBCC SHRINKDATABASE and DBCC SHRINKFILE. You can see how the new syntax works at https://docs.microsoft.com/sql/t-sql/database-console-commands/dbcc-shrinkdatabase-transact-sql. In general, it is recommended to avoid shrinking a database or its files, but sometimes it is a necessity.
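
Here is a hedged sketch of what this can look like; the database name is hypothetical, and you should verify the exact option list (such as ABORT_AFTER_WAIT) against the DBCC SHRINKDATABASE documentation linked previously:

-- Shrink at low priority: if the shrink must wait on locks, it yields to other
-- sessions instead of blocking them
DBCC SHRINKDATABASE (N'sales', 10) WITH WAIT_AT_LOW_PRIORITY;
GO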

Auto-update Stats Concurrency

As we were building out the story for SQL Server 2022 preview launch, one of the developers I know, Parag Paul, reached out to me: “Bob, have you seen the auto-update stats improvement coming in SQL Server 2022?” I told Parag I thought I remembered something in Azure SQL about this, but I didn’t know much about it or that it would be in SQL Server 2022.

We discussed the details, and I then remembered a blog post by my colleague Dimitri Furman, who not only is one of the top SQL experts in our team but is one of my “go-to” people at Microsoft on Azure SQL. You can read his blog post at https://techcommunity.microsoft.com/t5/azure-sql-blog/improving-concurrency-of-asynchronous-statistics-update/ba-p/1441687.

The problem is that while SQL Server and Azure SQL can be configured for auto-update statistics to happen asynchronously, the background processes doing the updates can cause concurrency issues. Dimitri has a very nice, detailed diagram and explanation of the problem in this blog post. Effectively, the issue is concurrency between queries and background processes needing schema locks on system metadata. We introduced a new database-scoped configuration option called ASYNC_STATS_UPDATE_WAIT_AT_LOW_PRIORITY. When you turn this on, the background processes that update stats will wait for all running processes to release locks that could cause a concurrency issue, effectively running at a “lower priority.” This option is off by default because it could result in statistics not being updated as often as desired (or as often as folks are used to now), which could lead to undesirable query plans. But we added this option because we saw scenarios where the user wanted to favor concurrency over more frequent, timely updates of statistics.

You can read how to configure this database-scoped option at https://docs.microsoft.com/sql/t-sql/statements/alter-database-scoped-configuration-transact-sql.
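
As a minimal sketch, this is a single database-scoped configuration change run in the context of the database you care about (the first statement simply ensures asynchronous statistics updates are enabled in the first place):

-- Asynchronous auto-update of statistics must be enabled for this to matter
ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS_ASYNC ON;
GO
-- New in SQL Server 2022: let the background stats update wait at low priority
ALTER DATABASE SCOPED CONFIGURATION SET ASYNC_STATS_UPDATE_WAIT_AT_LOW_PRIORITY = ON;
GO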

Availability

While all data and workloads need security and performance, any credible production system must have high availability and dependable disaster recovery. In SQL Server 2022 we have a rich set of new features that span Always On Availability Groups, recovery and redo, backup and restore, and replication.

Contained Availability Groups

At the 2018 PASS Summit, I found myself on stage with my former Microsoft colleague Amit Banerjee presenting the latest updates on SQL Server 2019. We had just announced the first Public Preview of SQL Server 2019 at the Microsoft Ignite conference in September of 2018.

For PASS we surprised the audience by announcing that SQL Server 2019 would support instance-level object replication with Always On Availability Groups. No longer would you have to manually replicate linked servers, SQL Agent jobs, and logins to a secondary replica after a failover. We literally received a standing ovation from the audience because this has been such a long-standing request. Unfortunately, we were premature in making this announcement. I remember when we had to make the decision not to ship this in SQL Server 2019. I was devastated, and I’m sure the community was as well.

In early 2022 we continued to evaluate what would be in project Dallas. We had already announced the Private Preview, and I remember we discussed reviving this capability. In come the heroics of Kevin Farlee, a longtime program manager veteran on our team, and David Liao, Principal Software Engineer on the SQL Server team. They made it their mission to complete the work we had started in the past and ship this feature in SQL Server 2022. We would call it Contained Availability Groups. We made a small announcement about this at the SQLBits conference in March of 2022 and made it more widely known when we announced SQL Server 2022 CTP 2.0 at Microsoft //build. I asked Kevin about his thoughts on the importance of getting this into the product:

When we first shipped Always On Availability Groups in SQL Server 2012, it was very well received, and a big step forward in the HA space. However a request quickly popped up and has been upvoted and reinforced to this day: “It’s a pain to make sure all of the replica instances in an AG have the users, logins, permissions, agent jobs, and all the rest of the infrastructure configuration in sync.” Customers worked around this by creating jobs or manual processes to force synchronization, but that was not a good solution. As it turned out, some of the Azure HA configurations such as Managed Instance as well as Azure Arc-enabled SQL Managed Instance had the same challenges so we combined forces and arrived at a solution to replicate portions of Master and MSDB databases for each AG. This is what we now call Contained AGs when used in the SQL Server product. I am very happy to be able to ship this to customers.

So are we, Kevin!

How Does It Work?

The concept of a Contained Availability Group is to include a version of the master and msdb system databases inside the availability group, along with user databases. If anyone connects to the listener of the availability group or directly to a user database context inside the AG, any instance information in master and msdb will be seen from these system databases. Anyone connecting to the instance directly would see the normal master and msdb databases. It is not a copy of the instance master and msdb databases; it is a special version of these databases used in the context of the AG. On creation of a Contained AG, these system databases are empty except that any existing admin logins are copied into the contained AG master so that an admin can connect to the AG and start adding instance objects.

Figure 6-5 shows the various components of a Contained AG.

An illustration of the various components of a Contained AG. A listener or direct db context in the center is connected to the primary and secondary, and the primary and secondary are connected by log changes.

Figure 6-5

The Contained AG architecture

When a user creates an availability group, there is a new syntax keyword for the WITH clause called CONTAINED. When this keyword is used, SQL Server will create two databases called <agname>_master and <agname>_msdb and add these to the availability group. Existing admin logins in the master database of the primary are added to the Contained AG master.

When a user connects to the listener of the AG or a direct database context (i.e., uses the database name in the connection string), this user is considered in the context of the Contained AG. Any operation that would normally apply to master or msdb applies to the Contained AG master and msdb. For example, if you were connected to the listener and created a new SQL Server Agent job, that job would exist in <agname>_msdb and NOT in the msdb of the instance of the primary. Since the <agname>_msdb is part of the AG, any logged operations such as adding a new job are replicated to the <agname>_msdb on the secondary.

If a failover occurs, the replicated Contained AG master and msdb will now be used for new primary listener connections.

When I first saw this design, I was puzzled about SQL Server Agent. How could SQL Server Agent know to look for jobs in the Contained AG msdb? Well, it turns out we enhanced SQL Server Agent in SQL Server 2022 to recognize the multiple msdb databases that may need to be used for Agent jobs. Any job in the Contained AG msdb only gets executed on the primary replica.

You can read more about how Contained AGs work at https://docs.microsoft.com/sql/database-engine/availability-groups/windows/contained-availability-groups-overview.

I think you will understand further how this works by seeing an example.

Let’s Try It Out

Let’s walk through an exercise to see how a Contained Availability Group works. For this exercise I’m going to set up a clusterless availability group. In other words, I’m not going to use a Windows cluster, so my availability group does not have auto-failover capabilities. A clusterless availability group is the simplest method to show a Contained Availability Group. Because I’m using a clusterless availability group, you can also easily set up this exercise on Linux. You can also set up your own type of availability group and still use most of the examples here.

Caution

The steps for this exercise are very detailed. You need to follow each step carefully, or this exercise will not work.

Prerequisites
  • Two Virtual Machines running Windows Server, each with four vCPUs and 8GB of RAM. The VMs need to be on the same network and subnet. I used Azure Virtual Machines and put the two VMs in the same resource group, which automatically puts them in the same VNet and subnet. To make my scripts easier, I manually added the IP address for each VM in the C:\windows\system32\drivers\etc\hosts file. This allows me to use server names instead of direct IP addresses. Designate one of the VMs as the primary and one as the secondary. In the following steps, I’ll refer to the primary or secondary instance or VM.

  • SQL Server Evaluation Edition installed on each VM with mixed-mode authentication enabled (also known as SQL Server and Windows authentication mode). We will use SQL logins to make the exercise simpler. You can do this during install or post install (post install requires an instance restart).

  • On each VM, create firewall rules that allow incoming traffic for ports 1433 and 5022.

  • Enable the Always On Availability Group feature for each SQL instance using the SQL Server Configuration Manager and restart SQL Server. You can read more about how to do this at https://docs.microsoft.com/sql/database-engine/availability-groups/windows/enable-and-disable-always-on-availability-groups-sql-server#SQLCM2Procedure (ignore the instructions about the Windows cluster). This requires an instance restart.

  • Use SQL Server Management Studio (SSMS). You can use the latest 18.x build, but 19.x includes new functionality to see Contained Availability Groups in Object Explorer and graphic interfaces to create a Contained AG. We will “do it the hard way” in this exercise using T-SQL, but SSMS 19.X will recognize a Contained AG.

  • Get the scripts from the book samples from the ch06_meatandpotatoes\availability\containedag directory.

Steps for the Exercise
  1. 1.
    Connected directly to the instance using the default sysadmin login from setup, execute the script sqlsysadminlogin.sql on both the primary and secondary instances. This script executes the following T-SQL statements:
    USE master;
    GO
    CREATE LOGIN sqladmin WITH PASSWORD = '$Strongpassw0rd';
    GO
    EXEC sp_addsrvrolemember 'sqladmin', 'sysadmin';
    GO

    You will use this login to connect to each instance for the rest of the exercise.

     
  2. 2.

    Start the SQL Server Agent service on both the primary and secondary VMs.

     
  3. 3.

    Connect with SSMS on the primary VM to both the primary and secondary using the sqladmin login.

     
  4. 4.
    Execute the script dbmcreds.sql on both the primary and secondary instances. This script executes the following T-SQL statements:
    USE master;
    GO
    CREATE LOGIN dbm_login WITH PASSWORD = '$Strongpassw0rd';
    GO
    CREATE USER dbm_user FOR LOGIN dbm_login;
    GO
     
  5. 5.
    Execute the script createcert.sql only on the primary instance. This script executes the following T-SQL statements:
    USE master;
    GO
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '$Strongpassw0rd';
    GO
    DROP CERTIFICATE dbm_certificate;
    GO
    CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'dbm';
    GO
    BACKUP CERTIFICATE dbm_certificate
    TO FILE = 'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\dbm_certificate.cer'
    WITH PRIVATE KEY (
    FILE = 'c:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\dbm_certificate.pvk',
    ENCRYPTION BY PASSWORD = '$Strongpassw0rd');
    GO

    This script assumes the default installation path for SQL Server on Windows. You will have to modify this script if your install path is different. You will also need to modify this script for Linux to match the correct install path.

     
  6. 6.

    Copy the files dbm_certificate.cer and dbm_certificate.pvk from the path of the primary VM into the same exact file path on the secondary VM (the default is C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA).

     
  7. 7.
    Execute the script importcert.sql only on the secondary instance. This script executes the following T-SQL statements (you will need to modify the path if it is not the default as in the following):
    USE master;
    GO
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '$Strongpassw0rd';
    GO
    CREATE CERTIFICATE dbm_certificate
        AUTHORIZATION dbm_user
        FROM FILE = 'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\dbm_certificate.cer'
        WITH PRIVATE KEY (
        FILE = 'c:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\dbm_certificate.pvk',
        DECRYPTION BY PASSWORD = '$Strongpassw0rd');
    GO
     
  8. 8.
    Execute the script dbm_endpoint.sql on both instances. This script executes the following T-SQL statements:
    USE master;
    GO
    CREATE ENDPOINT [Hadr_endpoint]
        AS TCP (LISTENER_PORT = 5022)
        FOR DATA_MIRRORING (
          ROLE = ALL,
          AUTHENTICATION = CERTIFICATE dbm_certificate,
          ENCRYPTION = REQUIRED ALGORITHM AES
          );
    GO
    ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
    GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [dbm_login];
    GO
     
  9. 9.
    Edit the script createag.sql. Put in your two VM server names (that you put in the hosts file) for <node1> (primary) and <node2> (secondary). This script executes the following T-SQL statements:
    USE master;
    GO
    CREATE AVAILABILITY GROUP [ag1]
        WITH (CLUSTER_TYPE = NONE, CONTAINED)
        FOR REPLICA ON
            N'<node1>' WITH (
                ENDPOINT_URL = N'tcp://<node1>:5022',
                AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
                FAILOVER_MODE = MANUAL,
                SEEDING_MODE = AUTOMATIC,
                SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)
                ),
            N'<node2>' WITH (
                ENDPOINT_URL = N'tcp://<node2>:5022',
                AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
                FAILOVER_MODE = MANUAL,
                SEEDING_MODE = AUTOMATIC,
                SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)
                );
    GO
    ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE;
    GO

    Notice the use of the option CLUSTER_TYPE = NONE (clusterless) and CONTAINED. Any type of availability group will support CONTAINED. I’m just using a clusterless type to simplify the exercise.

    For my system, my script looks like this:
    USE master;
    GO
    CREATE AVAILABILITY GROUP [ag1]
        WITH (CLUSTER_TYPE = NONE, CONTAINED)
        FOR REPLICA ON
            N'bwsql2022vm1' WITH (
                ENDPOINT_URL = N'tcp://bwsql2022vm1:5022',
                AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
                FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC,
                SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)
                ),
            N'bwsql2022vm2' WITH (
                ENDPOINT_URL = N'tcp://bwsql2022vm2:5022',
                AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
                FAILOVER_MODE = MANUAL,
                SEEDING_MODE = AUTOMATIC,
                SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)
                );
    GO
     
  10. 10.
    Execute the script joinag.sql on the secondary instance. This script executes the following T-SQL statements:
    USE master;
    GO
    ALTER AVAILABILITY GROUP [ag1] JOIN WITH (CLUSTER_TYPE = NONE);
    GO
    ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE;
    GO
     
  11. 11.

    You should now have a Contained Availability Group up and running. Let’s make sure it is working by using Object Explorer in SSMS. If you expand the folder for Availability Groups, you should see something like Figure 6-6.

     


Figure 6-6

A Contained Availability Group in SQL Server 2022

You can see the availability group is made up of the Contained AG master and msdb. SSMS 19 has new icons to recognize a Contained AG.
  12. 12.

    Since you are directly connected to the instance (not through a listener or direct db context), you can see these databases as user databases. Expand the Databases folder in Object Explorer. You should see something like Figure 6-7.

     


Figure 6-7

Contained AG master and msdb databases

When using the listener or direct db context, you will see that we have a different view.
  13. 13.
    Execute the script createdb.sql on the primary instance to create a new database and back it up. This script executes the following T-SQL statements (change the file path for the backup to your system needs):
    CREATE DATABASE letsgomavs;
    GO
    ALTER DATABASE letsgomavs SET RECOVERY FULL;
    GO
    BACKUP DATABASE letsgomavs
    TO DISK = N'c:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\Backup\letsgomavs.bak' WITH INIT;
    GO

    At the time I created this demo, my beloved Dallas Mavericks were in the heat of battle for the NBA Western Conference playoffs, so I couldn’t resist using their name here. Sadly, they lost to the eventual champions, Golden State, in the Conference finals.

     
  14. 14.
    Execute the script dbjoinag.sql against the primary instance to join the new database to the availability group. This script executes the following T-SQL statements:
    USE master;
    GO
    ALTER AVAILABILITY GROUP [ag1] ADD DATABASE letsgomavs;
    GO

    If you refresh the availability group in SSMS, you will see the database letsgomavs is now part of the AG.

     
  15. 15.

    Now it gets interesting. Connect to the primary instance using a direct database context. I find it easiest to launch another copy of SSMS and connect to the primary instance. But this time on the SSMS login screen, select Options and the Connection Properties tab. Put in letsgomavs in the Connect to database field like in Figure 6-8.

     


Figure 6-8

Connect to the primary instance with a database context

You can achieve the same concept by using a listener. Since I’m using a clusterless availability group, I’ll just connect using the database context.
  16. 16.

    Use SSMS to look at the list of databases. You will only see letsgomavs, and if you expand the System Databases folder, you will see master and msdb. These are the Contained AG master and msdb databases.

     
  17. 17.

    While you are connected to the primary instance, use SSMS to create a SQL Server Agent job. It doesn’t matter what is in the job. I just used the Object Explorer interface to create a new job called testjob with no steps.

     
  18. 18.

    Launch another SSMS that is directly connected to the instance. Use Object Explorer to expand the list of jobs for SQL Server Agent. You will not see the job there because it is in the Contained AG msdb database.

     
  19. 19.

    Connect to the secondary instance using the same technique with SSMS to connect with the letsgomavs database. Expand the jobs for SQL Server Agent in Object Explorer, and you will see the job you created previously. This demonstrates that jobs are replicated to secondary instances.

     
  20. 20.

    Let’s try a failover. Using SSMS directly connected to the primary instance (not using the database context), right-click the availability group and select Failover like in Figure 6-9.

     


Figure 6-9

Manual failover for a Contained AG

Follow the steps in the wizard to fail over to the secondary instance. When connecting to the secondary, use the instance connection (do not provide the database context) and use the sqladmin SQL login created earlier in this exercise. Since we are using a clusterless AG with a manual failover, you will get warnings about data loss. Ignore these to complete the failover and click the option to confirm you know there may be data loss.
  21. 21.

    When the failover is complete, you can use SSMS connected directly to the primary or secondary instance to see the switch in roles. You can also connect using a database context to the new primary to see your database and SQL Server Agent job ready to use.

     

Items to Consider

There are a few limits and considerations for Contained AGs. For example, replication is not supported for databases that are in a Contained AG. But features like CDC, log shipping, and TDE are supported with some special instructions for use. You can learn more about these in our documentation at https://docs.microsoft.com/sql/database-engine/availability-groups/windows/contained-availability-groups-overview#interactions-with-other-features.

Other AG Enhancements

We have made other minor enhancements to the core of Always On Availability Groups including reliability fixes, diagnostics, and enhancements for Distributed Availability Groups.

AG Reliability and Supportability

We shored up some reliability issues for Always On Availability Groups (no changes required for you; these are built in) including the following:
  • The database recovery task is now run with a higher deadlock priority to avoid being chosen as a deadlock victim with user transactions.

  • We fixed a problem where a replica database would get stuck in a recovery pending state.

  • Ensured data movement to replicas is not paused due to internal log block errors.

  • Eliminated schema lock contention problems on secondary replicas (this problem is also fixed in SQL Server 2019).

We also introduced several diagnostic changes in SQL Server 2022, which were also included in the latest cumulative update for SQL Server 2019:
  • Capture the sp_server_diagnostics event in case of errors in the Alwayson_health Extended Events session.

  • Add error information in the Alwayson_health Extended Events session for scenarios where the worker pool of threads was exhausted for availability groups.

  • Add a new Extended Event, hadr_trace_message, to the Alwayson_health Extended Events session for detailed tracing.

  • Increase the default size and number of files retained for the Alwayson_health Extended Events session.

  • Remove unnecessary ERRORLOG entries for Always On Availability Group activity.

  • Add more diagnostics to the Alwayson_health Extended Events session for connection timeout errors.

You can see that the majority of diagnostic improvements are related to the Alwayson_health Extended Events session. You can read more about the Alwayson_health session and its importance for debugging issues at https://docs.microsoft.com/sql/database-engine/availability-groups/windows/always-on-extended-events#BKMK_alwayson_health.
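
If you have never looked at this session, a minimal sketch for making sure it is running (and starts automatically with the instance) looks like the following; the session ships with SQL Server, so this only changes its state:

-- Start the AlwaysOn_health Extended Events session and have it start with the instance
ALTER EVENT SESSION AlwaysOn_health ON SERVER STATE = START;
GO
ALTER EVENT SESSION AlwaysOn_health ON SERVER WITH (STARTUP_STATE = ON);
GO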

Distributed Availability Group (DAG) Enhancements

In SQL Server 2016 we introduced the concept of a Distributed Availability Group (DAG). A DAG is effectively two availability groups connected together, typically over a long distance. A common scenario in which you might use a DAG is for disaster recovery. You configure an availability group (AG) in your primary location. You configure a second AG in a remote location. You then use a DAG to tie them together but with async commit. If you want to learn the fundamentals of DAGs, you can start at https://docs.microsoft.com/sql/database-engine/availability-groups/windows/distributed-availability-groups.

A DAG also supports sync commit. You should carefully consider this option depending on the connectivity to the secondary AG that is part of the DAG. By default, even a sync replica model for a single AG does not wait for a transaction to be committed on the secondary before allowing a transaction on the primary to continue. It only waits for the log records to be hardened on the secondary. However, in SQL Server 2017 we introduced an option called REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT. This option allows you to set a minimum number of synchronous secondary replicas required to commit before the primary commits a transaction. The default is 0. Previously this option was only allowed on an availability group. In SQL Server 2022 you can configure this option for a Distributed Availability Group.
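
As a hedged sketch, setting the option looks the same as it does for a regular availability group; distributedag is a hypothetical DAG name here, and you should confirm the exact placement of the option for Distributed Availability Groups in the documentation:

-- Require one synchronous secondary to commit before the primary commits a transaction
ALTER AVAILABILITY GROUP [distributedag]
    SET (REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT = 1);
GO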

Another subtle, yet important, change for DAG is that we can use multiple TCP connections to improve throughput for latency-sensitive connections. We found out we can improve our own Azure geo-replication with this design, so we have brought it to SQL Server. No additional configuration is required. We internally decide how to use this to improve throughput.

Recovery Enhancements

The time it takes to recover a SQL Server database is usually not something you spend much time thinking about until (a) an event requires recovery and (b) it takes a long time to recover, thereby causing you to have unexpected downtime.

We have two areas we have enhanced in SQL Server 2022 for recovery-related operations: Accelerated Database Recovery (ADR) and parallel redo operations.

Accelerated Database Recovery (ADR) Enhancements

I believe one of the most important new capabilities for SQL Server 2019 was Accelerated Database Recovery (ADR). No longer will you encounter transaction log size growing out of control, long rollbacks causing downtime, or recovery taking forever (I encourage you to read my article from 2019 on the topic at www.linkedin.com/pulse/sql-server-2019-how-saved-world-from-long-recovery-bob-ward). And if you want to dig deeper, check out the whitepaper at https://aka.ms/sqladr.

ADR has so many benefits, but there is always room for improvement. As such, in SQL Server 2022 we have made a few enhancements that fall under two categories:
  • Cleanup enhancements

    These enhancements include more efficient cleanup inside user transactions, more efficient cleanup to reduce page footprint, and multi-threaded version cleanup, including a thread per database. We have a new sp_configure option called “ADR Cleaner Thread Count,” which allows more cleanup threads per database. This option may be useful for larger databases or those with a high rate of change.

  • Versioning enhancements

    These enhancements include reducing the memory footprint to track versions and a reduction in the growth required for the Persisted Version Store (PVS).

You can read all about these improvements in SQL Server 2022 in more detail at https://docs.microsoft.com/sql/relational-databases/accelerated-database-recovery-concepts#adr-improvements-in-.
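
For example, the new cleaner thread count setting is exposed through sp_configure. This is a minimal sketch; the value of 4 is arbitrary, and you should verify (per the documentation above) whether your PVS cleanup actually benefits before changing it:

-- 'ADR Cleaner Thread Count' is an advanced configuration option
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
EXEC sp_configure 'ADR Cleaner Thread Count', 4;
RECONFIGURE;
GO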

Parallel Redo Enhancements

In SQL Server 2016, my colleague Bob Dorr and I started our own grassroots campaign called “It just runs faster,” calling out enhancements in SQL Server 2016 that were built into the engine and just made your application and queries run faster. One of the features was called parallel redo. The concept is that for redo operations during recovery, we can safely redo committed transactions in parallel with multiple threads.

To see this in action, I went to the video vault and found a session I did titled SQL Server 2016 It just runs faster on a virtual webinar series originally hosted by Brent Ozar called GroupBy. You can find the video at https://youtu.be/pTEDfmQnpzA. You can watch the entire talk, or for parallel redo, fast forward to about 22 minutes into the presentation. You can also get the deck and demos I used for that talk at https://aka.ms/bobwardms. Look for the GroupBy Org Jan 2017 folder.

Note

Fun fact: My camera for this video presentation was at my office at Microsoft in Irving, Texas, way back in 2017. It is definitely an archive, as the building has been redesigned and that office does not exist anymore.

In the presentation I explained how parallel redo can really boost recovery performance. However, we limited the worker pool for parallel threads to 100, so for systems with a large number of databases running recovery, some would not be able to take advantage of parallel redo threads (and you cannot configure which databases can).

In SQL Server 2022, we no longer limit this thread pool, so all databases can take advantage of parallel redo. We have also introduced a concept of batching with parallel redo to increase concurrency and speed up all operations. None of these improvements require any change or configuration on your part. It just works.

Backup/Restore Enhancements

We have made several enhancements to backup and restore operations core to the engine including cross-platform snapshots, hardware offloading for compression, and improved backup metadata.

Note

Chapter 7 covers new capabilities to back up and restore to and from S3 object storage providers.

Cross-Platform SNAPSHOT Backup

Let’s say you have a very large database to back up. SQL Server by default streams the backup by reading the database and log files and copying the data to the backup target. A very long time ago (I had to check my records because this goes back to SQL Server 7.0), we introduced the concept of a Virtual Device Interface (VDI) for backup and restore. The concept is that a developer could write code based on published libraries to accept a backup stream (or send a stream for restore) and process this backup stream in whatever way they wanted to (e.g., send the stream to a special storage device). The T-SQL language was extended to support a VIRTUAL_DEVICE target in addition to what we already supported (DISK, TAPE, PIPE; I know TAPE and PIPE. Crazy!). When a VDI target was used, SQL Server would engage in a protocol with the VDI program to back up or restore a database, log, or file. As part of this protocol, data would be streamed to the VDI program for backup or restore.

With the introduction of VDI came the ability to perform a database backup snapshot. If the VDI program was working in coordination with a device that could simply “copy the files,” a snapshot backup could be significantly faster than a streaming backup. This required a new syntax for BACKUP (WITH SNAPSHOT) and a new protocol with VDI programs. The key to making a snapshot work is that the SQL Server engine has to freeze all I/O for the database so that the database backup snapshot is consistent.

We also introduced on Windows the ability to use the Windows Volume Snapshot Service (VSS) to work in coordination with a program called SQL Writer to perform database backup snapshots, and you can read more at https://docs.microsoft.com/sql/relational-databases/backup-restore/sql-server-vss-writer-backup-guide.

In SQL Server 2016 we introduced the ability to store database and log files in Azure Blob Storage, which allowed us to support file-based snapshot backups in Azure Virtual Machines. Read more at https://docs.microsoft.com/sql/relational-databases/backup-restore/file-snapshot-backups-for-database-files-in-azure.

With all this context in mind, we wanted a simpler way to allow snapshot backups both in Azure Virtual Machines and across operating systems without requiring a VDI application or Windows VSS. As Ravinder Vuppula, Principal Software Engineer on the Azure Data team, tells it:

SQL Server has allowed users to perform snapshot backups for some time. However, it required users to either write a custom VDI client (complex) or use SQLWriter (less flexible) to perform snapshot backups. Starting with SQL Server 2016, users could also perform T-SQL based snapshot backups with FILE_SNAPSHOT feature but it was limited to files stored on Azure Blob Storage. Furthermore, users were limited to performing snapshot backup of one database at a time even when there could be multiple databases placed on the same underlying storage. The snapshot backups feature in SQL Server 2022 is designed to remove these restrictions. We were in pursuit of a solution that was easy to use, storage agnostic (as long as the underlying storage allowed snapshots), OS agnostic (wherever SQL Server can run), API agnostic (flexibility to use either VDI client or just T-SQL without any dependency on the API calls to the underlying storage). Of course, we wanted the solution to be reliable (prevent any kind of deadlocks) but also offered the ability to perform snapshot backup of either a single database or multiple databases at the same time.

How Does It Work?

This new capability uses a combination of T-SQL commands and the underlying storage layer for your database and log files to capture a snapshot of the files.

Here is the basic sequence for how this works:

Tip

Want to test this yourself on a disk system that doesn’t have underlying snapshot support? Start SQL Server with the undocumented trace flag 3661. This allows SQL Server to “share” database and log files. It will allow you to use a standard file copy once you have suspended I/O. This trace flag is not documented, not supported, and definitely not recommended for production. But it gives us a nice and easy way to test this new capability.

  1. 1.
    A user executes a new T-SQL statement like the following:
    ALTER DATABASE <db>
    SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON

    SQL Server will freeze I/O at this point. Note that in my testing, most write operations from user transactions are blocked on log writes, so you will see WRITELOG wait types.

    A new Dynamic Management View, sys.dm_server_suspend_status, shows more details on the status of any suspended databases for snapshot backups.

     
  2. 2.

    You can then use a supported storage provider that supports snapshots (or a system like Azure storage, which supports taking disk snapshots).

    Note You can see a sample PowerShell script that supports the new backup snapshot support on Azure at https://github.com/microsoft/sql-server-samples/blob/master/samples/features/t-sql-snapshot-backup/snapshot-backup-restore-azurevm-single-db.ps1.

     
  3. 3.
    To “unfreeze” I/O, a user would execute the following T-SQL statement to back up metadata information about the backup:
    BACKUP DATABASE <db>
    TO DISK='<path>\<db>.bkm'
    WITH METADATA_ONLY, FORMAT
    If the snapshot failed, you could also unfreeze the database with the following T-SQL statement:
    ALTER DATABASE <db>
    SET SUSPEND_FOR_SNAPSHOT_BACKUP = OFF

    You will use the .bkm file when doing a snapshot restore.

     

There are some interesting options with this new capability, including the ability to perform a snapshot backup of a group of databases and even to suspend all databases at the instance level.
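
As a hedged sketch of the instance-level variant, based on my reading of the documentation linked below (the backup path is hypothetical), suspending every database and then recording the metadata looks roughly like this:

-- Freeze I/O for all databases on the instance so the storage layer can take a snapshot
ALTER SERVER CONFIGURATION SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON;
GO
-- ...take the storage snapshot here, then record the backup metadata, which also resumes I/O
BACKUP SERVER TO DISK = 'D:\backups\instance.bkm' WITH METADATA_ONLY, FORMAT;
GO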

Read all about the new T-SQL-based snapshot backups at https://docs.microsoft.com/sql/relational-databases/backup-restore/create-a-transact-sql-snapshot-backup.

Intel QuickAssist (QAT) Backup Compression

When any software performs operations like compression or encryption, the instructions to support these are very CPU intensive. When SQL Server creates a backup with compression or encryption, the threads that perform the backup run code that chews up CPU cycles on the main CPU cores.

Our engineering team at Microsoft is always looking for hardware innovations that can help offload from the main CPU cores the processing required for operations like these. We call this concept hardware offloading.

One of the partners we work closely with is Intel. Most recently, our engineering team worked with Intel to integrate a technology called Intel QuickAssist (QAT) that can perform hardware offloading for compression and encryption needs. You can read more about how Intel QuickAssist works at www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html. You can see from their documentation that in some cases the technology is built into the main CPU structure and in other cases you can install a separate hardware card (in addition, this technology has what is called software mode, which does use core CPU cycles).

To perform this integration for backup and restore (decompress/unencrypt), we needed to do the following:
  • Build an infrastructure to load accelerators for hardware offloading.

  • Determine how to make the right API calls to use Intel QuickAssist technologies (aka drivers) and load this into SQL Server.

  • Add configuration options to allow hardware offloading and specific accelerators.

  • Enhance the BACKUP and RESTORE syntax to support COMPRESSION options for a new algorithm called QAT.

While we have observed in some cases both better backup throughput and better compression using QAT, the big key to this technology is offloading CPU resources from SQL Server so queries and workloads can maximize available CPU resources.
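
To give you a feel for what this looks like end to end, here is a rough sketch. The configuration option name, the accelerator setting, and the QAT_DEFLATE algorithm name reflect my reading of the public documentation (and the database name and path are placeholders), so double-check them against David's notebook and the docs referenced below:

-- Enable hardware offloading at the instance level (requires a restart)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'hardware offload enabled', 1;
RECONFIGURE;

-- After the restart, enable the QAT accelerator
ALTER SERVER CONFIGURATION SET HARDWARE_OFFLOAD = ON (ACCELERATOR = QAT);

-- Back up using the QAT compression algorithm
BACKUP DATABASE MyDB
TO DISK = 'C:\backups\MyDB_qat.bak'
WITH COMPRESSION (ALGORITHM = QAT_DEFLATE);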

QAT does come with a software mode, in which the drivers are used but hardware is not. This will not help reduce CPU resources since it is all built into the software, but it can be more effective than the default compression used today by SQL Server. By default, SQL Server uses the XPRESS compression library and algorithms. You can read more about XPRESS compression at https://docs.microsoft.com/windows/win32/cmpapi/-compression-portal.

I’ve included a T-SQL notebook created by David Pless that shows you all the different syntax options and configurations for QAT and hardware offloading in the samples at ch06_meatandpotatoesavailability Intel_QuickAssistTech_SQLServer2022.ipynb. You can view this notebook with Azure Data Studio or a web browser.

You can read performance information using QAT for backups that Intel observed at https://community.intel.com/t5/Blogs/Tech-Innovation/Data-Center/Accelerate-your-SQL-Server-2022-database-backups-using-Intel/post/1335102.

David did share some interesting observations with me:
  • If your system is idle (no CPU load from SQL workloads), you usually won’t get a huge benefit of QAT in hardware mode. This all depends on the available compute and memory of your device as compared with the server on which SQL Server is running.

  • You get the most out of QAT in hardware mode when you back up to multiple files (because we will use different threads all doing compression) and you have a fairly CPU-intensive workload. Additionally, a higher file count increases the backup buffer count (BUFFERCOUNT), which can have a significant impact on backup performance.

  • QAT in software mode will usually outperform the default XPRESS compression in various combinations.

Tip

From David Pless: If you install the drivers and no QAT hardware is detected, you are allowed to install in software mode. The latest QAT drivers can be found at www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html.

Backup Metadata

Have you ever found yourself wondering when you can perform the most recent point-in-time restore (PITR) of a database? We now have a nice addition to the msdb.backupset table, the column named last_valid_restore_time. With this column you can determine exactly the last valid time to which you can perform a PITR. You can find this column now documented for the msdb.backupset table at https://docs.microsoft.com/sql/relational-databases/system-tables/backupset-transact-sql.
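
For example, a quick query against msdb shows the newest point in time you could restore to for each backup (the database name is a placeholder):

-- Show the last valid point-in-time restore point for each backup
SELECT database_name, type, backup_finish_date, last_valid_restore_time
FROM msdb.dbo.backupset
WHERE database_name = N'MyDB'
ORDER BY backup_finish_date DESC;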

Multi-write Replication

One interesting enhancement for SQL Server replication is multi-write replication with Last Writer Wins (LWW). This enhancement is specifically targeted at peer-to-peer replication, which allows multiple writers and readers against the same set of data across SQL Server instances (read all the details of peer-to-peer replication at https://docs.microsoft.com/sql/relational-databases/replication/transactional/peer-to-peer-transactional-replication). Any system like this needs conflict resolution: if two users update the same row at the same time, who wins? Prior to these enhancements, a conflict effectively paused the entire system and required manual intervention, and the resolution (based on node ID) was not a logical one. With this enhancement, conflict detection is automatic, and the last user to make the change (based on a synchronized timestamp) wins the conflict.

As we designed project Dallas, we decided that this feature should also be included in an update to SQL Server 2019, so this is not technically new to SQL Server 2022. Check out more details in this blog post by Kevin Farlee at https://techcommunity.microsoft.com/t5/sql-server-blog/replication-enhancements-in-the-sql-server-2019-cu13-release/ba-p/2814727.

Other Engine “Stuff”

We have other engine enhancements that target specific customer challenges with solutions based on our experiences over the years or directly from user feedback. It’s possible that one of the improvements listed here is one you have been looking to see in SQL Server.

XML Compression

XML native data types are popular among some developers, but XML data can get quite large. Features like SQL Server row compression don’t affect XML data types, and XML indexes cannot be compressed. Now in SQL Server 2022, we have introduced the ability to compress an XML column, which can greatly reduce the size of your data that uses XML data types. We enhanced the CREATE TABLE statement to support an XML_COMPRESSION keyword. You can read more at https://docs.microsoft.com/sql/t-sql/statements/create-table-transact-sql.

The same concept exists to compress an index that contains an XML column, which you can read about at https://docs.microsoft.com/sql/t-sql/statements/create-index-transact-sql.

We also support XML_COMPRESSION for both ALTER TABLE and ALTER INDEX statements.
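
Here is a minimal sketch (table and column names are placeholders; the ALTER TABLE REBUILD form is my assumption for enabling compression on an existing table, so check the syntax in the docs above):

-- Create a table with XML compression enabled
CREATE TABLE dbo.OrderHistory
(
    OrderID int NOT NULL PRIMARY KEY,
    OrderDetails xml NULL
)
WITH (XML_COMPRESSION = ON);

-- Enable XML compression on an existing table by rebuilding it
ALTER TABLE dbo.OrderHistory
REBUILD WITH (XML_COMPRESSION = ON);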

In-Memory OLTP Memory Management

We built enhancements in SQL Server 2022 to improve our memory management for In-Memory OLTP operations. Most of these are specific improvements to avoid out-of-memory (OOM) scenarios and don’t require any action from you; they are just built into the engine.

We also introduced a new system procedure in SQL Server 2019, which is also available in SQL Server 2022, called sys.sp_xtp_force_gc. This procedure takes a database name as input and releases unused memory for memory-optimized table data on demand, even when there is no memory pressure.
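
A minimal example, assuming a database named InMemDB that contains memory-optimized tables:

-- Release unused memory held for memory-optimized table data on demand
EXEC sys.sp_xtp_force_gc N'InMemDB';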

Auto-drop Statistics

Here is an interesting enhancement to help with a long-standing problem. Let’s say SQL Server auto-creates statistics on a column based on a query or a third-party software program creates statistics on a column. If you try to then drop the column these statistics are based on (or make other schema changes that affect the column), you may encounter an error like this:
Msg 5074, Level 16, State 1, Line 20
The statistics '<stats name>' is dependent on column '<column name>'.

You then need to go and drop the statistics first. Now you can create statistics using the WITH AUTO_DROP = ON option. In the preceding scenario, the statistics would be automatically dropped and not block the schema change. Auto-created statistics are not enabled with auto-drop by default.
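
Here is a quick sketch of creating a statistics object with the auto-drop option (the table, column, and statistics names are placeholders):

-- These statistics will not block later schema changes on the column
CREATE STATISTICS stats_CustomerRegion
ON dbo.Customers (Region)
WITH AUTO_DROP = ON;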

Resumable Add Table Constraints

In SQL Server 2017 and 2019 we added the ability to create and rebuild indexes with a resumable option. This means if an error occurs during the build of an index (or you pause it), we can resume “where we left off” vs. having to start over.

This is quite a nice feature for index builds that take a very long time. You can read more about resumable indexes at https://docs.microsoft.com/sql/relational-databases/indexes/guidelines-for-online-index-operations?#resumable-index-considerations.

We realize though that many indexes are added as constraints, so we have enhanced the syntax for ALTER TABLE ADD CONSTRAINT to support resumable operations. You can read the exact syntax at https://docs.microsoft.com/sql/relational-databases/security/resumable-add-table-constraints.
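
As a rough sketch, adding a primary key constraint as a resumable online operation might look like this (the table name, constraint name, and 240-minute cap are placeholders; verify the exact option list against the docs above):

-- Add a primary key constraint that can be paused and resumed
ALTER TABLE dbo.Orders
ADD CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 240 MINUTES);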

New Wait Types

With each new release of SQL Server, we add new wait types as part of building new features. I found around 300 new wait types in SQL Server 2022 vs. SQL Server 2019.

One observation I had when reviewing these new types is that there are new PREEMPTIVE_* wait types for SQL Server 2022. Wait types with this name indicate places in our code where a worker is switched to “preemptive” mode and no longer part of SQLOS scheduling. This is because we are typically calling an API or running code where we can’t control thread scheduling. Two new wait types that indicate preemptive scheduling are PREEMPTIVE_SYNAPSESTREAMING_HTTP_EVENT_WAIT and PREEMPTIVE_AAD_HTTP_EVENT_WAIT, which can show up when using Synapse Link or Azure Active Directory (AAD) authentication.
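
If you want to see which preemptive waits your instance is accumulating, a simple query against a standard DMV does the trick:

-- Preemptive waits accumulated since the instance started
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE N'PREEMPTIVE_%'
ORDER BY wait_time_ms DESC;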

New Extended Events

Almost every new feature has a set of new Extended Events, and you have already seen some related to IQP features in Chapters 4 and 5 as well as several in this chapter.

My incredible colleague and longtime friend Robert Dorr told me about one interesting new event, not related to a feature but one he added to help with query performance troubleshooting and debugging. The name of the event is query_abort (it is a debug channel event). This event includes the session, the input buffer, and even a call stack, and it fires whenever a query is aborted for any reason. If you are seeing queries that are canceled, aborted, or timed out and are not sure why, this new event can probably help.
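
Here is a minimal sketch of an Extended Events session for this event. The event name comes from Robert's description; the actions and file target are my own choices and may need adjusting for your environment:

-- Track queries that are aborted for any reason
CREATE EVENT SESSION track_query_abort ON SERVER
ADD EVENT sqlserver.query_abort
(
    ACTION (sqlserver.session_id, sqlserver.sql_text)
)
ADD TARGET package0.event_file (SET filename = N'query_abort.xel');

ALTER EVENT SESSION track_query_abort ON SERVER STATE = START;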

An Industry-Proven Engine

The #sqllegend David Campbell once told me, “Bob, if we don’t innovate in the engine, we don’t have a product.” I think you can see from this chapter that we have provided plenty of innovation into security, scalability, and availability. Think about what you have seen in this chapter: Ledger for SQL Server, “hands-free” tempdb, and Contained Availability Groups just to name a few.

My colleague David Pless spent maybe more time than anyone on our team across the new SQL Server 2022 engine features. I asked him for his perspective:

I have been thinking about the theme around what we have accomplished in this release, especially thinking back over the previous releases. In SQL Server 2016, it was “It's just faster,” and we said in SQL Server 2017 it was “all about choice.” In this release it is a “return to performance,” but more than that, what we have done is made a firm commitment to scalability. The commitment to scalability is more than a single feature; it is several features that give SQL Server 2022 the ability to take the most advantage of our largest systems for our most critical workloads. We did it with buffer pool parallel scan by allowing customers to scale up to the largest memory servers available and get the full performance they would expect from SQL Server. We did it with concurrent GAM and SGAM updates improving tempdb performance at a factor that we are even questioning whether we still need the tempdb best practices that have stood true for nearly a quarter of a century. And both of these major improvements are on by default – you just have to get your data on SQL Server 2022 and let the scalable SQL Server engine do the rest. And there is so much more. Accelerated Database Recovery improvements recover our databases faster than ever. Additionally, our partnership with Intel introduces a brand-new feature to help SQL Server not only improve backup capacity and throughput, but with Intel QuickAssist Technology hardware we can protect the CPU of the host machine by offloading the compute overhead of the backup process away from the host SQL Server system. I am proud of this team; I am proud of what we have accomplished with this release.
