Chapter 13
Managing and monitoring SQL Server

Chapter 11 discusses the importance and logistics of database backups, but what else do you need to do on a regular basis to maintain a healthy SQL Server? In this chapter, we lay the foundation for the what and why of Microsoft SQL Server management, covering key dynamic management views (DMVs) along the way, and show how to set up extended events (the replacement for traces). We review what to look for in Windows Performance Monitor for SQL Server instances and in Database Transaction Unit (DTU) metrics for databases in Microsoft Azure SQL Database. Finally, we review the major changes to the SQL Server servicing model. For example, there are no more service packs: it’s cumulative updates from here on out.

Detecting database corruption

After database backups, the second most important factor concerning database page integrity is proper configuration to prevent, and monitoring to mitigate, database corruption. This isn’t a complicated topic, and it mostly revolves around one setting and one command.

Setting the database’s page verify option

For all databases, this setting should be CHECKSUM. The legacy TORN_PAGE_DETECTION option is a sign that the database has been moved up over the years from a pre–SQL Server 2005 version and that this setting was never changed. CHECKSUM has been the superior option and the default setting since SQL Server 2005, but an administrator must change it manually after a database is restored or upgraded from an older version.

If you still have databases with a page verify option that is not CHECKSUM, you should change this setting immediately.
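For example, a quick check and fix might look like the following sketch, which uses the WideWorldImporters sample database referenced throughout this chapter; both statements use documented syntax:

-- Find databases whose page verify option is not CHECKSUM
SELECT name, page_verify_option_desc
FROM sys.databases
WHERE page_verify_option_desc <> 'CHECKSUM';

-- Change the page verify option for one database
ALTER DATABASE WideWorldImporters SET PAGE_VERIFY CHECKSUM;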

Image For more information on database files and filegroups, see Chapter 3.

Using DBCC CHECKDB

You should periodically run CHECKDB on all databases. This is a time-consuming but crucial process. You should run DBCC CHECKDB at least as often as your backup retention plan, and consider DBCC CHECKDB nearly as important as regular database backups. It’s worth noting that the only reliable solution to database corruption is restoring from a known good backup.

For example, if you keep local backups around for one month, you should ensure that you perform a successful DBCC CHECKDB no less than once per month. More often, if possible, is recommended, of course. This ensures that you will at least have a recovery point for uncorrupted, unchanged data, and a starting point for corrupted data fixes. On large databases, DBCC CHECKDB could take hours and block other user queries.

The DBCC CHECKDB command actually covers other, more granular database integrity check tasks, including DBCC CHECKALLOC, DBCC CHECKTABLE, and DBCC CHECKCATALOG, all of which are important and which only in rare cases need to be run separately, such as to split up the workload.

Running DBCC CHECKDB with no other parameters or syntax performs a database integrity check in the current database context. Without specifying a database, however, no additional options can be provided. There are a number of parameters for CHECKDB, detailed at
https://docs.microsoft.com/sql/t-sql/database-console-commands/dbcc-checkdb-transact-sql.

Here are some parameters worth noting:

  • NOINDEX. This can reduce the duration of the integrity check by ignoring nonclustered rowstore and Columnstore indexes.

    Example usage:

    DBCC CHECKDB (databasename, NOINDEX);

  • NO_INFOMSGS. This suppresses informational status messages and returns only errors.

    Example usage:

    DBCC CHECKDB (databasename) WITH NO_INFOMSGS;

  • REPAIR_REBUILD. You should run this only as a last resort because although it might have some success, it is unlikely to result in a complete repair. It can also be very time consuming, involving the rebuilding of indexes based on attempted repair data. We suggest that you review the DBCC CHECKDB documentation for a number of caveats.

    Example usage:

    DBCC CHECKDB (databasename) WITH REPAIR_REBUILD;

  • REPAIR_ALLOW_DATA_LOSS. You should run this only as a last resort to achieve a partial database recovery because it could force a database to resolve errors by simply deallocating pages, potentially creating gaps in rows or columns. You must run this in SINGLE_USER mode, and you should run it in EMERGENCY mode. Review the DBCC CHECKDB documentation for a number of caveats. A complete review of how EMERGENCY mode and REPAIR_ALLOW_DATA_LOSS work together is detailed in this blog post by Paul Randal: https://www.sqlskills.com/blogs/paul/checkdb-from-every-angle-emergency-mode-repair-the-very-very-last-resort/.

    Example usage: (last resort only, not recommended!)

    ALTER DATABASE WideWorldImporters SET EMERGENCY, SINGLE_USER;
    DBCC CHECKDB('WideWorldImporters', REPAIR_ALLOW_DATA_LOSS);
    ALTER DATABASE WideWorldImporters SET MULTI_USER;

  • ESTIMATEONLY. This does not provide an estimate of the duration of a CHECKDB; it returns only the amount of space required in TempDB to run DBCC CHECKDB with the other specified options.

    Example usage:

    DBCC CHECKDB (databasename) WITH ESTIMATEONLY;

These scripts are all available in the accompanying downloads for this book at https://aka.ms/SQLServ2017Admin/downloads.

Image For more information on automating DBCC CHECKDB, see Chapter 14.

Repairing database data file corruption

Of course, the only real remedy to data corruption after it has happened is by restoring from a backup. The well-documented DBCC CHECKDB REPAIR_ALLOW_DATA_LOSS should be a last resort.

It is possible to repair missing pages in clustered indexes by piecing together missing columns in nonclustered indexes. In reality, this is an academic solution because data corruption rarely happens in such a tidy and convenient way.

Always On availability groups also provide a built-in data corruption detection and automatic repair capability by using uncorrupted data on one replica to replace inaccessible data on another.

Image For more information on this feature of availability groups, see Chapter 12.

Recovering from database transaction log file corruption

In addition to the previous guidance on the importance of backups, you can reconstitute a corrupted or lost database transaction log file (though not recover its contents) by using the example that follows. A lost transaction log file will likely result in the loss of some recent rows, but in the event of a disaster recovery involving the loss of the .ldf file but an intact .mdf file, this could be a valuable step.

It is possible to rebuild a blank transaction log file in a new file location for a database by using the following command:

ALTER DATABASE WideWorldImporters SET EMERGENCY, SINGLE_USER;
ALTER DATABASE WideWorldImporters REBUILD LOG
ON (NAME=WWI_Log, FILENAME='F:\DATA\WideWorldImporters_new.ldf');
ALTER DATABASE WideWorldImporters SET MULTI_USER;

These scripts are all available in the accompanying downloads for this book, which are available at https://aka.ms/SQLServ2017Admin/downloads.

Database corruption in databases in Azure SQL Database

Microsoft takes data integrity in its platform-as-a-service (PaaS) database offering very seriously and provides strong assurances of assistance and recovery for its product. Although corruption is rare, Azure engineering teams respond 24x7 globally to data corruption reports. The Azure SQL Database engineering team details its response promises at https://azure.microsoft.com/blog/data-integrity-in-azure-sql-database/.

Maintaining indexes and statistics

Managing index fragmentation is about the proper organization of rowstore data within the files that SQL Server maintains, minimizing the number of pages that must be read when queries read or write those data pages. Reducing fragmentation in database objects is vastly different from reducing fragmentation at the drive level and has little in common with the Disk Defragmenter application in Windows. Although this doesn’t translate to page locations on magnetic disks (and on storage-area networks, this has even less relevance), it does translate to the activity of I/O systems when retrieving data.

The cause of index fragmentation is, to put it plainly, writes. Our data would stay nice and tidy if applications would stop writing to it! Updates and deletes inevitably have a significant effect on clustered and nonclustered index fragmentation, and inserts can also cause fragmentation depending on the clustered index design.

The information in this section is largely unchanged and applies to SQL Server instances, databases in Azure SQL Database, and even Azure SQL Data Warehouse. (All tables in Azure SQL Data Warehouse have a Columnstore clustered index by default.)

Changing the Fill Factor property when beneficial

Each rowstore index on disk-based objects has a numeric property called Fill Factor that specifies the percentage of space to be filled with rowstore data in each leaf-level data page of the index when it is created or rebuilt. The instance-wide default Fill Factor is 0%, which is the same as 100%; that is, each leaf-level data page will be completely filled. A Fill Factor of 80 means that 20% of each leaf-level data page will be intentionally left empty. We can adjust this Fill Factor percentage for each index to manage the efficiency of data pages.

A low Fill Factor will help reduce the number of page splits, which occur when the Database Engine attempts to add a new row of data, or update an existing row with more data, on a page that is already full. In this case, the Database Engine will clear out space for the new row by moving a proportion of the old rows to a new page. This can be a time- and resource-consuming operation, with many page splits possible during writes, and it will lead to index fragmentation.

However, setting a low Fill Factor will greatly increase the number of pages needed to store the same data and increase the number of reads during query operations. For example, a Fill Factor of 50 will roughly double the space on the drive that it initially takes to store and therefore access the data, when compared to a Fill Factor of 0 or 100.

In most instances, data is read far more often than it is written, and inserted, updated, or deleted only upon occasion. Indexes will therefore benefit from a high Fill Factor, almost always more than 80, because it is more important to keep the number of reads to a manageable level than to minimize the resources needed to perform page splits. You can deal with the index fragmentation by using the REBUILD or REORGANIZE commands, as discussed in the next section.

If the key value for an index is constantly increasing, such as an autoincrementing IDENTITY column as the first key of a clustered index, data is always added to the end of the index, and any gaps left in earlier pages would never be filled. In the case of a table for which data is always inserted sequentially and never updated, setting a Fill Factor other than the default of 0/100 will have no benefit. (It is still possible to set one, however.) Even after fine-tuning a Fill Factor, you might find that reducing page splits has no noticeable benefit to write performance.

You can set a Fill Factor when an index is first created, or you can change it by using the ALTER INDEX ... REBUILD syntax, as discussed in the next section.
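For example, the following sketch rebuilds a nonclustered index from the WideWorldImporters sample database with an explicit fill factor of 90, leaving 10 percent of each leaf-level page empty; the value 90 is only an illustration, not a recommendation:

ALTER INDEX FK_Sales_Orders_CustomerID ON Sales.Orders
REBUILD WITH (FILLFACTOR = 90);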

Tracking page splits

If you are intent on fine-tuning the Fill Factor for important tables to maximize the performance/storage space ratio, you can measure page splits in two ways.

You can use the performance counter DMV to measure page splits in aggregate on Windows server, as shown here:

SELECT * FROM sys.dm_os_performance_counters WHERE counter_name ='Page Splits/sec'

The cntr_value will increment whenever a page split is detected. This is a bit misleading because to calculate the page splits per second, you must sample the incrementing value twice, and divide by the time difference between the samples. When viewing this metric in Performance Monitor, the math is done for you.
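The following sketch performs that math in T-SQL by sampling the accumulating counter twice; the 10-second delay is an arbitrary sampling interval chosen for illustration:

DECLARE @sample1 bigint, @sample2 bigint;
-- First sample of the accumulating counter
SELECT @sample1 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page Splits/sec';
-- Wait 10 seconds, then sample again
WAITFOR DELAY '00:00:10';
SELECT @sample2 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page Splits/sec';
-- The difference divided by the interval yields page splits per second
SELECT page_splits_per_sec = (@sample2 - @sample1) / 10.0;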

Using extended events session to identify page_split events

You can track page_split events alongside statement execution by adding the page_split event to sessions such as the Transact-SQL (T-SQL) template in the extended events wizard.

We review extended events and the sys.dm_os_performance_counters DMV later in this chapter, including a sample session script to track page_split events.
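As a preview, here is a minimal sketch of such a session; the session name and event_file target name are arbitrary choices for illustration:

CREATE EVENT SESSION [Track_Page_Splits] ON SERVER
ADD EVENT sqlserver.page_split(
    ACTION (sqlserver.sql_text, sqlserver.database_name))
ADD TARGET package0.event_file(SET filename = N'Track_Page_Splits')
WITH (STARTUP_STATE = OFF);
GO
-- Start the session when you are ready to capture page splits
ALTER EVENT SESSION [Track_Page_Splits] ON SERVER STATE = START;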

Monitoring index fragmentation

You can find the extent to which an index is fragmented by interrogating the sys.dm_db_index_physical_stats dynamic management function (DMF). Be aware that unlike most DMVs, this function can have a significant impact on server performance because it can tax I/O.

To query this DMF, you must be a member of the sysadmin server role, or the db_ddladmin and db_owner database roles. Alternatively, you can grant CONTROL permission to the object, then also the VIEW DATABASE STATE and VIEW SERVER STATE permissions. For more information, refer to https://docs.microsoft.com/sql/relational-databases/system-dynamic-management-views/sys-dm-db-index-physical-stats-transact-sql#permissions.

Keep this in mind when scripting this operation for automated index maintenance. We discuss more about automating index maintenance in Chapter 14.

For example, to find the fragmentation level of all indexes on the Sales.Orders table in the WideWorldImporters sample database, we could use a query such as the following:

SELECT
  DB = db_name(s.database_id)
, [schema_name] = sc.name
, [table_name] = o.name
, index_name = i.name
, s.index_type_desc
, s.partition_number -- if the object is partitioned
, avg_fragmentation_pct = s.avg_fragmentation_in_percent
, s.page_count -- pages in object partition
FROM  sys.indexes AS i
CROSS APPLY sys.dm_db_index_physical_stats (DB_ID(),i.object_id,i.index_id, NULL, NULL) AS s
INNER JOIN sys.objects AS o ON o.object_id = s.object_id
INNER JOIN sys.schemas AS sc ON o.schema_id = sc.schema_id
WHERE i.is_disabled = 0
AND o.object_id = OBJECT_ID('Sales.Orders');

The sys.dm_db_index_physical_stats DMF accepts five parameters: database_id, object_id, index_id, partition_id, and mode. The mode parameter defaults to LIMITED, the fastest method, but you can set it to SAMPLED or DETAILED. These additional modes are rarely necessary, but they provide more data and more-precise data. Some columns will be NULL in LIMITED mode. For the purposes of determining fragmentation, the default mode of LIMITED (used when the parameter value of NULL is provided) suffices.
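If you do need those extra columns, here is a sketch that requests the SAMPLED mode explicitly for the Sales.Orders table; avg_page_space_used_in_percent, for example, is NULL in LIMITED mode but populated in SAMPLED and DETAILED modes:

SELECT index_id, index_type_desc
, avg_fragmentation_in_percent
, page_count
, avg_page_space_used_in_percent -- NULL in LIMITED mode
FROM sys.dm_db_index_physical_stats
    (DB_ID(), OBJECT_ID('Sales.Orders'), NULL, NULL, 'SAMPLED');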

The five parameters of the sys.dm_db_index_physical_stats DMF are all nullable. For example, if you run

SELECT * FROM sys.dm_db_index_physical_stats(NULL,NULL,NULL,NULL,NULL);

you will see fragmentation statistics for all databases, all objects, all indexes, and all partitions.

We recommend that you do not do this; again, it can have a significant impact on server resources, resulting in a noticeable drop in performance. The previous sample scripts are all available in the accompanying downloads for this book, which are available at https://aka.ms/SQLServ2017Admin/downloads.

Rebuilding indexes

Performing an INDEX REBUILD operation on a rowstore index (clustered or nonclustered) will physically re-create the index b-tree leaf level. The goal of moving the pages is to make storage more efficient and to match the logical order provided by the index key. A rebuild operation is both destructive to the index object and will block other queries attempting to access the pages. Because the rebuild operation destroys and re-creates the index, it also updates the index statistics afterward, eliminating the need to perform a subsequent UPDATE STATISTICS operation as part of regular maintenance.

Long-term table locks are held during the rebuild operation. One of the major advantages of SQL Server Enterprise edition remains the ability to specify the ONLINE keyword, which allows for rebuild operations to be significantly less disruptive to other queries (though not completely), making feasible index maintenance on SQL Servers with round-the-clock activity.

You should use ONLINE with index rebuild operations whenever possible, if time allows. An ONLINE index rebuild might take longer than an offline rebuild, however. There are also scenarios for which an ONLINE rebuild is not possible, including indexes containing the deprecated data types image, text, and ntext, or the xml data type. Since SQL Server 2012, it has been possible to perform ONLINE index rebuilds on indexes containing the (max) lengths of the data types varchar, nvarchar, and varbinary.

For the syntax to rebuild the FK_Sales_Orders_CustomerID nonclustered index on the Sales.Orders table with the ONLINE functionality in Enterprise edition, see the following code sample:

ALTER INDEX FK_Sales_Orders_CustomerID
ON Sales.Orders
REBUILD WITH (ONLINE=ON);

It’s important to note that performing any kind of index maintenance on the clustered index of a rowstore table does not affect the nonclustered indexes. Nonclustered index fragmentation will not change if you rebuild the clustered index, and those indexes must be maintained as well. However, dropping and re-creating the clustered index will require the nonclustered indexes to be rebuilt twice: once to change the nonclustered indexes to reference a heap, and again to reference the new clustered index.

Instead of rebuilding an individual index, you can rebuild all indexes on a particular table by replacing the name of the index with the keyword ALL. This is usually overkill and inefficient; performing individual index operations only when needed is preferred. For example, to rebuild all indexes on the Sales.Orders table, do the following:

ALTER INDEX ALL ON Sales.Orders REBUILD;

You can also do this in SQL Server Management Studio. To do so, under the Sales.Order table, expand the Indexes folder. Right-click the FK_Sales_Orders_CustomerID index, and then, on the shortcut menu that opens, select Rebuild. Note, though, that you cannot specify options such as ONLINE in this dialog box. (Note also that the right-click shortcut options to Rebuild or Reorganize are unavailable for indexes on memory-optimized tables.)

Aside from ONLINE, there are other options that you might want to consider for INDEX REBUILD operations. Let’s take a look at them:

  • SORT_IN_TEMPDB. Use this when you want to create or rebuild an index using TempDB for sorting the index data, potentially increasing performance by distributing the I/O activity across multiple drives. This also means that these sorting work tables are written to the TempDB database transaction log instead of the user database transaction log, potentially reducing the log impact on the user database and allowing for the user database transaction log to be backed up during the operation.

  • MAXDOP. You can use this to mitigate some of the impact of index maintenance by limiting or preventing the operation’s use of parallel processors. This can cause the index maintenance operation to run longer, but to have less impact on performance.

  • WAIT_AT_LOW_PRIORITY. Introduced in SQL Server 2014, this is the first of a set of parameters that you can use to instruct the ONLINE index maintenance operation to try not to block other operations, and how. This feature is known as Managed Lock Priority, and this syntax is not usable outside of online index operations and partition switching operations. Here is the full syntax:

    ALTER INDEX FK_Sales_Orders_CustomerID ON Sales.Orders
    REBUILD WITH (ONLINE=ON (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES,
    ABORT_AFTER_WAIT = SELF)));

    The parameters for MAX_DURATION and ABORT_AFTER_WAIT instruct the statement how to proceed if it begins to be blocked by another operation. The online index operation will wait, allowing other operations to proceed.

    The MAX_DURATION parameter can be 0 (wait indefinitely) or a measure of time in minutes (no other unit of measure is supported).

    The ABORT_AFTER_WAIT parameter provides an action at the end of the MAX_DURATION wait:

    • SELF instructs the statement to terminate its own process, ending the online rebuild step.

    • BLOCKERS instructs the statement to terminate the other processes that are blocking the online index operation, potentially terminating user transactions. Use with caution.

    • NONE instructs the statement to continue to wait; when combined with MAX_DURATION = 0, this is essentially the same behavior as not specifying WAIT_AT_LOW_PRIORITY.

  • RESUMABLE. Introduced in SQL Server 2017, this makes it possible to pause an index operation and resume it later, even after a server shutdown. Unlike a reorganize operation, a rebuild operation, if stopped or killed, will cause a potentially lengthy rollback, which itself could be disruptive to other transactions. Killing the session of a long-running index rebuild is no quick remedy: the blocking will continue until the rollback is complete. The RESUMABLE=ON parameter allows for the index operation to be paused and then resumed manually at a later time.

    You can see a list of resumable and paused index operations in a new DMV, sys.index_resumable_operations, where the state_desc field will reflect RUNNING (and pausable) or PAUSED (and resumable).

The following example shows the syntax for resumable online index rebuilds:

--In Connection 1
ALTER INDEX FK_Sales_Orders_CustomerID ON Sales.Orders
REBUILD WITH (ONLINE=ON, RESUMABLE=ON);

--In Connection 2
--Show that the index rebuild is RUNNING
SELECT object_name = object_name(object_id), *
FROM sys.index_resumable_operations;
GO
--Pause the index rebuild
ALTER INDEX FK_Sales_Orders_CustomerID ON Sales.Orders PAUSE;
--Connection 1 shows messages indicating the session has been disconnected
--because of a high-priority DDL operation.
GO
--Show that the index rebuild is PAUSED
SELECT object_name = object_name(object_id), *
FROM sys.index_resumable_operations;
GO
--Allow the index rebuild to complete
ALTER INDEX FK_Sales_Orders_CustomerID ON Sales.Orders RESUME;

The RESUMABLE syntax also supports a MAX_DURATION option, which has a different meaning than the MAX_DURATION used with ABORT_AFTER_WAIT. Here, MAX_DURATION automatically pauses an ONLINE index operation after a specified amount of time; for example, allowing the index operation to be resumed during the next night’s maintenance window. MAX_DURATION=0 allows the operation to run indefinitely, and it is not a required parameter for RESUMABLE=ON. Here’s an example:

ALTER INDEX FK_Sales_Orders_CustomerID ON Sales.Orders
REBUILD WITH (ONLINE=ON, RESUMABLE=ON, MAX_DURATION = 60 MINUTES);

Inside OUT

I only want to maintain indexes if they are above a certain percentage of fragmentation. Can I do that with SQL Server Management Studio maintenance plans?

You can, with improvements to the SQL Server Management Studio maintenance plans first released with SQL Server Management Studio in 2016. Older versions of maintenance plans probably drew your ire with an “everything” approach to reorganizing or rebuilding indexes in a database.

You will now see options to intelligently limit index maintenance, starting with the radio buttons to select between Fast (LIMITED), Sampled, and Detailed. These correspond to the mode parameter provided to the structural statistics DMF, sys.dm_db_index_physical_stats.

You can configure the REORGANIZE and REBUILD tasks to maintain only indexes filtered by percentage of fragmentation and page count (both from sys.dm_db_index_physical_stats) and by actual index usage (based on the sys.dm_db_index_usage_stats DMV). This is a significant improvement in the tooling for maintenance plans, which before these improvements were mostly unusable on larger databases.

Reorganizing indexes

Performing an INDEX REORGANIZE operation on an index uses far less system resources and is much less disruptive than performing a full REBUILD while still accomplishing the goal of reducing fragmentation. It physically reorders the leaf-level pages of the index to match the logical order. It also compacts the pages based on the existing fill factor, though it does not allow the fill factor to be changed. This operation is always performed online, so long-term table locks are not held and queries or modifications to the underlying table will not be blocked during the REORGANIZE transaction.

Because the REORGANIZE operation is not destructive, it does not automatically update the statistics for the index afterward, as a rebuild operation does. Thus, you should consider following a REORGANIZE step with an UPDATE STATISTICS step. We review statistics updates in the next section.

The following example presents the syntax to reorganize the FK_Sales_Orders_CustomerID index on the Sales.Orders table:

ALTER INDEX FK_Sales_Orders_CustomerID ON Sales.Orders
REORGANIZE;

You also can perform this in SQL Server Management Studio. Under the Sales.Order table, expand the Indexes folder, right-click the FK_Sales_Orders_CustomerID index, and then, on the shortcut menu, select Reorganize.

Instead of reorganizing an individual index, you can instead reorganize all indexes on a particular table by replacing the name of the index with the keyword ALL. For example, to reorganize all indexes on the Sales.Orders table, use the following command:

ALTER INDEX ALL ON Sales.Orders REORGANIZE;

To do this in SQL Server Management Studio, expand the Sales.Order table, right-click the Indexes folder, and then select Reorganize All.

None of the options available to REBUILD that we covered in the previous section are available to the REORGANIZE command. The only additional option that is specific to REORGANIZE is the LOB_COMPACTION option, which affects only large object (LOB) data types: image, text, ntext, varchar(max), nvarchar(max), varbinary(max), and xml. By default, this option is turned on, but you can turn it off for non-heap tables to potentially skip some activity, though we do not recommend it. For heap tables, LOB data is always compacted.
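For example, the following sketch reorganizes all indexes on the Sales.Orders table with LOB compaction explicitly specified; because ON is the default, the option is shown here only for illustration:

ALTER INDEX ALL ON Sales.Orders
REORGANIZE WITH (LOB_COMPACTION = ON);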

Image We discuss more about automating index maintenance in Chapter 14.

Updating index statistics

SQL Server uses statistics to describe the distribution and nature of the data in tables. The query optimizer needs the Auto Create Statistics setting turned on so that it can create single-column statistics when compiling queries. These statistics help the query optimizer create optimal runtime plans. The Auto Update Statistics setting prompts statistics to be updated automatically when they are accessed by a T-SQL query and the statistics object is discovered to be past a threshold of changed rows. Without relevant and up-to-date statistics, the query optimizer might not choose the best way to run queries.

An update of index statistics should accompany INDEX REORGANIZE steps, but not INDEX REBUILD steps. Remember that the INDEX REBUILD command also updates the index statistics.

The basic syntax to update the statistics for an individual table is simple:

UPDATE STATISTICS [Sales].[Invoices];

The only command option to be aware of concerns the depth to which the statistics are scanned before being recalculated. By default, SQL Server samples a statistically significant number of rows in the table. This sampling is done with a parallel process starting with database compatibility level 130. This is fast and adequate for most workloads. You can optionally choose to scan the entire table, or a sample of the table based on a percentage of rows or a fixed number of rows, but we generally do not recommend these options.
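For reference, the following sketch shows those optional scan depths against the same table; the sample sizes are arbitrary examples, not recommendations:

-- Scan every row in the table (most accurate, most expensive)
UPDATE STATISTICS [Sales].[Invoices] WITH FULLSCAN;
-- Sample a percentage of rows
UPDATE STATISTICS [Sales].[Invoices] WITH SAMPLE 25 PERCENT;
-- Sample a fixed number of rows
UPDATE STATISTICS [Sales].[Invoices] WITH SAMPLE 100000 ROWS;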

You can manually verify that statistics are being kept up to date by the query optimizer when auto_update_stats is turned on. The sys.dm_db_stats_properties DMF accepts an object_id and stats_id, which is functionally the same as the index_id if the statistics object corresponds to an index. The sys.dm_db_stats_properties DMF returns information such as the modification_counter of rows changed since the last statistics update, and the last_updated date, which is NULL if the statistics object has never been updated since it was created.

Not all statistics are associated with an index; for example, statistics that are automatically created. There will generally be more stats objects than index objects. This function works in SQL Server and Azure SQL Database.
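A sketch of such a check for the Sales.Orders table follows; joining sys.stats to the DMF returns every statistics object on the table, whether or not it belongs to an index:

SELECT
  stats_name = st.name
, sp.last_updated
, sp.rows
, sp.rows_sampled
, sp.modification_counter
FROM sys.stats AS st
CROSS APPLY sys.dm_db_stats_properties(st.object_id, st.stats_id) AS sp
WHERE st.object_id = OBJECT_ID('Sales.Orders');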

Image For more on statistics objects and their impact on performance, see Chapter 10.

Inside OUT

Do I need to update statistics regularly even if auto_update_stats is turned on for the database?

Yes, you should still maintain the health of Update Statistics with regularity. When auto_update_stats is on, statistics are updated periodically based on usage. Statistics are considered out of date by the query optimizer when a ratio of data modifications to rows in the table has been reached. The query optimizer will check for and update the out-of-date statistic before running a query plan. Therefore, the auto_update_stats option has some small runtime overhead, though the performance benefit of updated statistics usually outweighs this cost. We also highly recommend turning on the auto_update_stats_async option because it helps minimize this runtime overhead by updating the statistics after running the query, instead of before.

We recommend that you turn on the auto_update_stats and auto_update_stats_async options, as discussed in Chapter 4 and Chapter 9, on all user databases, unless the application specifically requests that they be turned off, such as with Microsoft SharePoint.

Updating both column and index statistics for a database regularly, if your maintenance window allows, will definitely not hurt and will likely help. Updating statistics regularly could also reduce the number of statistics updates that happen automatically during transactions in regular business hours.

Reorganizing Columnstore indexes

Columnstore indexes need to be maintained, as well, but use different internal objects to measure the fragmentation of the internal Columnstore structure. Columnstore indexes need only the REORGANIZE operation.

You can review the current structure of a Columnstore index’s row groups by using the DMV sys.dm_db_column_store_row_group_physical_stats. This returns one row per row group of the Columnstore structure. The state of each row group, and the current count of row groups by state, provides some insight into the health of the Columnstore index. The vast majority of row groups should be in the COMPRESSED state. Row groups in the OPEN and CLOSED states are part of the delta store and are awaiting compression. These delta store row groups are served up seamlessly alongside compressed data when queries use Columnstore data.

The number of deleted rows in a row group is also an indication that the index needs maintenance. As the ratio of deleted_rows to total rows in a COMPRESSED row group increases, the performance of the Columnstore index will be reduced. If deleted_rows makes up a large share of the total rows in a row group, a REORGANIZE step will be beneficial.
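Here is a sketch of such a check; the point at which the deleted percentage justifies a REORGANIZE depends on your workload:

SELECT
  table_name = OBJECT_NAME(rg.object_id)
, rg.index_id
, rg.row_group_id
, rg.state_desc
, rg.total_rows
, rg.deleted_rows
, deleted_pct = 100. * rg.deleted_rows / NULLIF(rg.total_rows, 0)
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
WHERE rg.state_desc = 'COMPRESSED'
ORDER BY deleted_pct DESC;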

Performing a REBUILD operation on a Columnstore is essentially the same as a drop/re-create and is not necessary. A REORGANIZE step for a Columnstore index, just as for a nonclustered index, is an ONLINE operation that has minimal impact to concurrent queries.

You can also use the option to REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS=ON) to force all delta store row groups to be compressed into a COMPRESSED row group.

Without COMPRESS_ALL_ROW_GROUPS, only row groups already in the COMPRESSED state will be compacted and combined. This can be useful when you observe a large number of COMPRESSED row groups with fewer than 100,000 rows. Typically, COMPRESSED row groups should contain up to one million rows each, but SQL Server might create COMPRESSED row groups that align with how the rows were inserted, especially if they were inserted in bulk operations.
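For example, the following sketch reorganizes a hypothetical Columnstore index named CCI_Sales_OrderLines and forces the delta store row groups to be compressed:

ALTER INDEX CCI_Sales_OrderLines ON Sales.OrderLines
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);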

Image We discuss more about automating index maintenance in Chapter 14.

Maintaining database file sizes

The difference between the size of a SQL Server database data (.mdf) or log (.ldf) file and the amount of data within it is an important distinction to understand. Note that this section does not apply to Azure SQL Database, only to SQL Server instances.

In SQL Server Management Studio, you can right-click a database, click Reports, and then view the Disk Usage report for a database, which will contain information about how much data is actually in the database’s files.

Alternatively, the following query uses the FILEPROPERTY function to reveal how much data there actually is inside a file reservation; we also use the sys.master_files DMV, which returns information about the database files:

SELECT DB = d.name
, d.recovery_model_desc
, Logical_File_Name = df.name
, Physical_File_Loc = df.physical_name
, df.File_ID
, df.type_desc
, df.state_desc
-- multiply # of pages by 8 to get KB, divide by 1024 to get MB
, FileSizeMB = size*8/1024.0
, SpaceUsedMB = FILEPROPERTY(df.name, 'SpaceUsed')*8/1024.0
, AvailableMB =  size*8/1024.0
  - CAST(FILEPROPERTY(df.name, 'SpaceUsed') AS int)*8/1024.0
, 'Free%' = (((size*8/1024.0 )
  - (CAST(FILEPROPERTY(df.name, 'SpaceUsed') AS int)*8/1024.0 ))
  / (size*8/1024.0 )) * 100.0
FROM sys.master_files df
INNER JOIN sys.databases d
ON d.database_id = df.database_id
WHERE d.database_id = DB_ID();

Run this on a database in your environment to see how much data there is within database files. You might find that some data or log files are near full, whereas others have a large amount of space. Why would this be?

Files that have a large amount of free space might have grown that way in the past but have since been emptied out. If a transaction log in FULL recovery model were to have grown for a long time without having a transaction log backup, the .ldf file would have grown unchecked. Later, when a transaction log backup was taken, causing the log to truncate, it would have been nearly empty, but the size of the .ldf file itself wouldn’t have changed. It isn’t until a SHRINK FILE operation has taken place that the .ldf file would give its unused space back to the operating system. (And we never recommend shrinking a file flippantly or on a schedule.)

You should pregrow your database and log file sizes to a size that is well ahead of the database’s growth pattern. You might fret over the best autogrowth rate, but ideally, autogrowth events are avoided altogether by proactive file management.
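For example, the following sketch manually grows a data file well ahead of demand; the logical file name WWI_UserData is assumed from the WideWorldImporters sample database, and the 20 GB target size is only an illustration:

ALTER DATABASE WideWorldImporters
MODIFY FILE (NAME = N'WWI_UserData', SIZE = 20480MB);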

Files that are nearly full might be growing; for example, data is being inserted, or in the case of log files, transactions are being written to the transaction log file. Files have an autogrowth setting—the rate with which the files grow when they run out of space—and an autogrowth event might be imminent. (You can turn off autogrowth; however, the database will not be able to accept transactions if it is out of space.)

Autogrowth events can be disruptive to user activity, causing all transactions to wait while the database file asks the Windows server for more space and grows. Depending on the performance of the I/O system, this takes seconds, during which activity on the database must wait. Depending on the autogrowth setting and the size of the write transactions, multiple autogrowth events could be suffered sequentially. Growth of database data files will be greatly sped up by instant file initialization (Chapter 4 covers this in detail).

Understanding and finding autogrowth events

You should change autogrowth rates for database data and log files from the default of 1 MB, but, more important, you should maintain enough free space in your data and log files that autogrowth events do not happen. As a proactive DBA, you should monitor the space in database files and grow the files ahead of time, manually and outside of peak business hours.

You can view recent autogrowth events in a database via a report in SQL Server Management Studio or via a T-SQL script (see the code example that follows) that reads from the SQL Server instance’s default trace. In SQL Server Management Studio, in Object Explorer, right-click the database name. On the shortcut menu that opens, select Reports, select Standard Reports, and then click Disk Usage. An expandable/collapsible region of the report contains Data/Log Files Autogrow/Autoshrink Events.

To view autogrowth events faster, and for all databases simultaneously, you can query the SQL Server instance’s default trace. The default trace files are limited to 20 MB, and there are at most five rollover files, yielding 100 MB of history. The amount of time this includes depends on server activity. The following sample query uses the fn_trace_gettable() function to open the default trace file in its current location:

SELECT
 DB = g.DatabaseName
, Logical_File_Name = mf.name
, Physical_File_Loc = mf.physical_name
, mf.type
-- The size in MB (converted from the number of 8KB pages) the file increased.
, EventGrowth_MB = convert(decimal(19,2),g.IntegerData*8/1024.)
, g.StartTime --Time of the autogrowth event
-- Length of time (in seconds) necessary to extend the file.
, EventDuration_s = convert(decimal(19,2),g.Duration/1000./1000.)
, Current_Auto_Growth_Set = CASE
 WHEN mf.is_percent_growth = 1
 THEN CONVERT(char(2), mf.growth) + '%'
 ELSE CONVERT(varchar(30), mf.growth*8./1024.) + 'MB'
END
, Current_File_Size_MB = CONVERT(decimal(19,2),mf.size*8./1024.)
, d.recovery_model_desc
FROM fn_trace_gettable(
(select substring((SELECT path
FROM sys.traces WHERE is_default =1), 0, charindex('log_',
(SELECT path FROM sys.traces WHERE is_default =1),0)+4)
+ '.trc'), default) g
INNER JOIN sys.master_files mf
ON mf.database_id = g.DatabaseID
AND g.FileName = mf.name
INNER JOIN sys.databases d
ON d.database_id = g.DatabaseID
ORDER BY StartTime desc;

Shrinking database files

We need to be as clear as possible about this: shrinking database files is not something that you should do regularly and casually.

Files grow by their autogrowth increment based on actual usage. Database data and logs under normal circumstances—and in the case of FULL recovery model with regular transaction log backups—grow to the size they need to be. However, you should try to proactively grow database files to avoid autogrowth events.

You should shrink a file only as one-time events to solve one of two problems:

  • A drive volume is out of space, and in an emergency break fix scenario, you reclaim unused space from a database data or log file.

  • A database transaction log grew to a much larger size than is normally needed because of an adverse condition and should be reduced back to its normal operating size. An adverse condition could be transaction log backups that stopped working for a time, a large uncommitted transaction, or replication or high availability (HA) issues that prevented the transaction log from truncating.

For the case of a database data file, there is rarely any good reason to shrink the file, except for the aforementioned issue of the drive volume being out of space. For the rare situation in which a database had a large amount of data deleted from the file, an amount of data that is unlikely ever to exist in the database again, a one-time shrink file operation could be appropriate.

For the case in which a transaction log file should be reduced in size, the best way to reclaim the space and re-create the file with optimal virtual log file (VLF) alignment is to take a transaction log backup to truncate the log file as much as possible, shrink the log file to reclaim all unused space, and then immediately grow the log file back to its expected size in increments of no more than 8,000 MB at a time. This allows SQL Server to create the underlying VLF structures in the most efficient way possible.

Image For more information on VLFs in your database log files, see Chapter 3.

One of the main concerns with shrinking a file is that it indiscriminately returns free pages to the operating system, helping to create fragmentation. Aside from potentially creating autogrowth events in the future, shrinking a file creates the need for further index maintenance to alleviate the fragmentation. In SQL Server Management Studio, in the Shrink File dialog box, there is the Reorganize Files Before Releasing Unused Space option. Or, you can use the DBCC SHRINKFILE command, but this step can be time consuming, can block other user activity, and is not part of any healthy database maintenance plan.

The following sample script of this process assumes that a preceding transaction log backup has been taken to truncate the database transaction log and that the database log file is mostly empty. It also grows the transaction log file to an example size of 9 GB (9,216 MB, or 9,437,184 KB):

USE [WideWorldImporters];
--TRUNCATEONLY returns all free space to the OS
DBCC SHRINKFILE (N'WWI_Log' , 0, TRUNCATEONLY);
GO
USE [master];
ALTER DATABASE [WideWorldImporters]
MODIFY FILE ( NAME = N'WWI_Log', SIZE = 8192000KB );
ALTER DATABASE [WideWorldImporters]
MODIFY FILE ( NAME = N'WWI_Log', SIZE = 9437184KB );
GO

Monitoring databases by using DMVs

SQL Server provides a suite of internal, read-only DMVs and DMFs. It is important for you as the DBA to have a working knowledge of these objects because they unlock analysis of SQL Server outside of built-in reporting capabilities and third-party tools. In fact, all third-party tools use these dynamic management objects. DMV and DMF queries are discussed in several other places in this book:

  • For more on understanding index usage statistics and missing index statistics, see Chapter 10.

  • For more information on reviewing, aggregating, and analyzing cached execution plan statistics, including the Query Store feature introduced in SQL Server 2016, see Chapter 9.

  • For more information on monitoring availability groups performance, health, and automatic seeding, see Chapter 12.

  • For more information on automatic reporting from DMVs and querying performance monitor metrics from inside SQL Server DMVs, see Chapter 14.

  • To read about using a DMF to query index fragmentation refer to the section “Monitoring index fragmentation” earlier in this chapter.

These sample scripts are all available in the accompanying downloads for this book, which are available at https://aka.ms/SQLServ2017Admin/downloads.

Sessions and requests

Any connection to a SQL Server instance is a session and is reported live in the DMV sys.dm_exec_sessions. Any actively running query on a SQL Server instance is a request and is reported live in the DMV sys.dm_exec_requests. Together, these two DMVs provide a thorough and far more detailed replacement to the sp_who or sp_who2 system stored procedures with which long-time DBAs might be more familiar. With DMVs, you can do so much more than replace sp_who. We reviewed a simple query to look at sessions and requests active in the SQL Server in Chapter 9, but let’s take that query to a higher level of complexity.

By adding references to a handful of other DMVs or DMFs, we can turn this query into a wealth of live information, returning complete connection source information, the actual runtime statement currently being run (similar to DBCC INPUTBUFFER), the actual plan being run (provided with a blue hyperlink in the SQL Server Management Studio results grid), request duration and cumulative resource consumption, the current and most recent wait types experienced, and more.

Sure, it might not be as easy to type in as “sp_who2,” but it provides much more data, which you can easily query and filter. If you are unfamiliar with any of the data being returned, take some time to dive into the result set and explore the information it provides; it will be an excellent hands-on learning resource. You might choose to add more filters to the WHERE clause specific to your environment. Let’s take a look at the sp_who replacement query:

SELECT
  when_observed = sysdatetime()
, r.session_id, r.request_id
, session_status = s.[status] -- running, sleeping, dormant, preconnect
, request_status = r.[status] -- running, runnable, suspended, sleeping, background
, blocked_by = r.blocking_session_id
, database_name = db_name(r.database_id)
, s.login_time, r.start_time
, query_text = CASE
        WHEN r.statement_start_offset = 0
           and r.statement_end_offset= 0 THEN left(est.text, 4000)
        ELSE SUBSTRING (est.[text], r.statement_start_offset/2 + 1,
        CASE WHEN r.statement_end_offset = -1
             THEN LEN (CONVERT(nvarchar(max), est.[text]))
             ELSE r.statement_end_offset/2 - r.statement_start_offset/2 + 1
        END
  ) END --the actual query text is stored as nvarchar,
                  --so we must divide by 2 for the character offsets
, qp.query_plan
, cacheobjtype = LEFT (p.cacheobjtype + ' (' + p.objtype + ')', 35)
, est.objectid
, s.login_name, s.client_interface_name
, endpoint_name = e.name, protocol = e.protocol_desc
, s.host_name, s.program_name
, cpu_time_s = r.cpu_time, tot_time_s = r.total_elapsed_time
, wait_time_s = r.wait_time, r.wait_type, r.wait_resource, r.last_wait_type
, r.reads, r.writes, r.logical_reads  --accumulated request statistics
FROM sys.dm_exec_sessions as s
LEFT OUTER JOIN sys.dm_exec_requests as r on r.session_id = s.session_id
LEFT OUTER JOIN sys.endpoints as e ON e.endpoint_id = s.endpoint_id
LEFT OUTER JOIN sys.dm_exec_cached_plans as p ON p.plan_handle = r.plan_handle
OUTER APPLY sys.dm_exec_query_plan (r.plan_handle) as qp
OUTER APPLY sys.dm_exec_sql_text (r.sql_handle) as est
LEFT OUTER JOIN sys.dm_exec_query_stats as stat on stat.plan_handle = r.plan_handle
AND r.statement_start_offset = stat.statement_start_offset
AND r.statement_end_offset = stat.statement_end_offset
WHERE 1=1
AND s.session_id >= 50 --retrieve only user spids
AND s.session_id <> @@SPID --ignore myself
ORDER BY r.blocking_session_id desc, s.session_id asc;

Understanding wait types and wait statistics

Wait statistics in SQL Server are an important source of information and can be a key resource to increasing SQL Server performance, both at the aggregate level and at the individual query level. This section attempts to do justice and provide insights to this broad and important topic, but entire books, training sessions, and software packages have been developed to address wait type analysis; thus, this section is not exhaustive.

Wait statistics can be queried and provide value to SQL Server instances as well as databases in Azure SQL Database, though there are some waits specific to the Azure SQL Database platform (which we’ll review). Like many dynamic management views and functions, membership in the sysadmin server role is not required, only the permission VIEW SERVER STATE, or in the case of Azure SQL Database, VIEW DATABASE STATE.

We saw in the query in the previous section the ability to see the current and most recent wait type for a session. Let’s dive into how to observe wait types in the aggregate, accumulated at the server level or at the session level. Waits are accumulated in many different ways in SQL Server but typically occur when a request is in the runnable or suspended state. The request is not accumulating wait statistics, only duration statistics, when it is in the running state. We saw the ability to see the request state in the previous section’s sample query.

SQL Server can track and accumulate many different wait types for a single query, many of which are of negligible duration or are benign in nature. There are quite a few waits that can be ignored or that indicate idle activity, as opposed to waits that indicate resource constraints and blocking. There are more than 900 distinct wait types in SQL Server and more than 1,000 in Azure SQL Database, some more documented and generally understood than others. We review some that you should know about later in this section.

To view accumulated waits for a session, which live only until the close or reset of the session, use the DMV sys.dm_exec_session_wait_stats. This code sample shows how the DMV returns one row per session, per wait type experienced, for user sessions:

SELECT * FROM sys.dm_exec_session_wait_stats AS wt;

There is a distinction between the two time measurements in this query and others. signal_wait_time_ms indicates the amount of time the thread waited on CPU activity, correlated with time spent in the runnable state. wait_time_ms indicates the accumulated amount of time for the wait type and includes the signal wait time, and so includes time the request spent in the runnable and suspended states. Typically, this is the wait measurement that we aggregate.

We can view aggregate wait types at the instance level with the sys.dm_os_wait_stats DMV. It returns the same information as sys.dm_exec_session_wait_stats but without the session_id, and it includes all activity in the SQL Server instance, without any granularity by database, query, timeframe, and so on. This can be useful for getting the “big picture,” but it’s limited over long spans of time because the wait_time_ms counter accumulates, as illustrated here:

SELECT TOP (20)
 wait_type, wait_time_s =  wait_time_ms / 1000.
, Pct = 100. * wait_time_ms/sum(wait_time_ms) OVER()
FROM sys.dm_os_wait_stats as wt ORDER BY Pct desc

Over months, the wait_time_ms numbers will be so large for certain wait types that trends or changes in wait type accumulation rates will be mathematically difficult to see. For this reason, if you want to use the wait stats to keep a close eye on server performance as it trends and changes over time, you need to capture these accumulated wait statistics in chunks of time, such as one day or one week. The sys.dm_os_wait_stats DMV is reset and all accumulated metrics are lost upon restart of the SQL Server service, but you can also clear them manually. Here is a sample script of how you could capture wait statistics at any interval:

--Script to setup capturing these statistics over time
CREATE TABLE dbo.sys_dm_os_wait_stats
(     id int NOT NULL IDENTITY(1,1)
,     datecapture datetimeoffset(0) NOT NULL
,     wait_type nvarchar(512) NOT NULL
,     wait_time_s  decimal(19,1) NOT NULL
,     Pct decimal(9,1)  NOT NULL
,     CONSTRAINT PK_sys_dm_os_wait_stats PRIMARY KEY CLUSTERED (id)
);
--This part of the script should be in a SQL Agent job, run regularly
INSERT INTO
dbo.sys_dm_os_wait_stats  (datecapture, wait_type, wait_time_s, Pct)
SELECT TOP (100)
  datecapture = SYSDATETIMEOFFSET()
, wait_type
, wait_time_s = convert(decimal(19,1), round( wait_time_ms / 1000.0,1))
, Pct = 100. * wait_time_ms/sum(wait_time_ms) OVER()
FROM sys.dm_os_wait_stats wt
WHERE wait_time_ms > 0
ORDER BY wait_time_s DESC;
GO
--Reset the accumulated statistics in this DMV
DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);

You can also view statistics for a query currently running in the DMV sys.dm_os_waiting_tasks, which contains more data than simply the wait_type; it also shows the blocking resource address in the resource_description field. A complete breakdown of the information that can be contained in the resource_description field is detailed in the documentation at https://docs.microsoft.com/sql/relational-databases/system-dynamic-management-views/sys-dm-os-waiting-tasks-transact-sql.

Image For more information on monitoring availability groups wait types, see Chapter 12.

Wait types that you can safely ignore

Following is a starter list of wait types that you can safely ignore when querying the sys.dm_os_wait_stats DMV for aggregate wait statistics. You can append the following sample WHERE clause to your queries.

Through your own research into your workload, and as more wait types are added in future versions of SQL Server, you can grow this list so that important and actionable wait types rise to the top of your queries. A prevalence of these wait types shouldn’t be a concern; they’re unlikely to be generated by, or to negatively affect, user requests.

WHERE
    wt.wait_type NOT LIKE '%SLEEP%' --can be safely ignored, sleeping
AND wt.wait_type NOT LIKE 'BROKER%' -- internal process
AND wt.wait_type NOT LIKE '%XTP_WAIT%' -- for memory-optimized tables
AND wt.wait_type NOT LIKE '%SQLTRACE%' -- internal process
AND wt.wait_type NOT LIKE 'QDS%' -- asynchronous Query Store data
AND wt.wait_type NOT IN ( -- common benign wait types
 'CHECKPOINT_QUEUE'
,'CLR_AUTO_EVENT','CLR_MANUAL_EVENT' ,'CLR_SEMAPHORE'
,'DBMIRROR_DBM_MUTEX','DBMIRROR_EVENTS_QUEUE','DBMIRRORING_CMD'
,'DIRTY_PAGE_POLL'
,'DISPATCHER_QUEUE_SEMAPHORE'
,'FT_IFTS_SCHEDULER_IDLE_WAIT','FT_IFTSHC_MUTEX'
,'HADR_FILESTREAM_IOMGR_IOCOMPLETION'
,'KSOURCE_WAKEUP'
,'LOGMGR_QUEUE'
,'ONDEMAND_TASK_QUEUE'
,'REQUEST_FOR_DEADLOCK_SEARCH'
,'XE_DISPATCHER_WAIT','XE_TIMER_EVENT'
 --Ignorable HADR waits
, 'HADR_WORK_QUEUE'
,'HADR_TIMER_TASK'
,'HADR_CLUSAPI_CALL'
)

Wait types to be aware of

This section shouldn’t be the start and end of your understanding of or research into wait types, many of which have multiple avenues to explore in your SQL Server instance or, at the very least, names that can mislead the DBA about their origin. Here are some wait types, or groups of them, that you should understand.

Different instance workloads will have a different profile of wait types, and just because a wait type is at the top of the aggregate sys.dm_os_wait_stats list doesn’t mean that it is the main or only performance problem with a SQL Server instance. It is likely that all SQL Server instances, even finely tuned ones, will show these wait types near the top of the aggregate waits list. More important waits include the following:

  • ASYNC_NETWORK_IO. This wait type is associated with the retrieval of data to a client, and the wait while the remote client receives and finally acknowledges the data received. This wait almost certainly has very little to do with network speed, network interfaces, switches, or firewalls. Any client, including your workstation or even SQL Server Management Studio running locally to the server, can incur small amounts of ASYNC_NETWORK_IO as resultsets are retrieved to be processed. Transactional and snapshot replication distribution will incur ASYNC_NETWORK_IO. You will see a large amount of ASYNC_NETWORK_IO generated by reporting applications such as Cognos, Tableau, SQL Server Reporting Services, and Microsoft Office products such as Access and Excel. The next time a rudimentary Access database application tries to load the entire contents of the Sales.Orders table, you’ll likely see ASYNC_NETWORK_IO.

    Reducing ASYNC_NETWORK_IO, like many of the waits we discuss in this chapter, has little to do with hardware purchases or upgrades; rather, it’s more to do with poorly designed queries and applications. Try suggesting to the developers or client applications incurring large amounts of ASYNC_NETWORK_IO that they eliminate redundant queries, use server-side filtering as opposed to client-side filtering, use server-side data paging as opposed to client-side data paging, or to use client-side caching.

  • LCK_M_*. Lock waits have to do with blocking and concurrency (or lack thereof). (Chapter 9 looks at isolation levels and concurrency.) When a request is writing and another request in READ COMMITTED or higher isolation is trying to read that same row data, one of the 60-plus different LCK_M_* wait types will be the reported wait type of the blocked request. In the aggregate, this doesn’t mean you should reduce the isolation level of your transactions. (Whereas READ UNCOMMITTED is not a good solution, RCSI and snapshot isolation are; see Chapter 9 for more details.) Rather, optimize execution plans for efficient access by reducing scans as well as to avoid long-running multistep transactions. Avoid index rebuild operations without the ONLINE option (see earlier in this chapter for more information).

    The wait_resource provided in sys.dm_exec_requests, or resource_description in sys.dm_os_waiting_tasks, each provide a map to the exact location of the lock contention inside the database. A complete breakdown of the information that can be contained in the resource_description field is detailed in the documentation at https://docs.microsoft.com/sql/relational-databases/system-dynamic-management-views/sys-dm-os-waiting-tasks-transact-sql.

  • CXPACKET. A common and often-overreacted-to wait type, CXPACKET is a parallelism wait. In a vacuum, execution plans that are created with parallelism run faster. But at scale, with many execution plans running in parallel, the server’s resources might take longer to process the requests. This wait is measured in part as CXPACKET waits.

    When the CXPACKET wait is the predominant wait type experienced over time by your SQL Server, both Maximum Degree of Parallelism (MAXDOP) and Cost Threshold for Parallelism (CTFP) settings are dials to turn when performance tuning. Make these changes in small, measured gestures, and don’t overreact to performance problems with a small number of queries. Use the Query Store to benchmark and trend the performance of high-value and high-cost queries as you change configuration settings.

    If large queries are already a problem for performance and multiple large queries regularly run simultaneously, raising the CTFP might not solve the problem. In addition to the obvious solutions of query tuning and index changes, including the creation of Columnstore indexes, use MAXDOP instead to limit parallelization for very large queries.

    Until SQL Server 2016, MAXDOP was a server-level setting, a setting enforced at the query level, or a setting enforced on sessions selectively via Resource Governor (more on this later in this chapter). Since SQL Server 2016, the MAXDOP setting is also available as a database-scoped configuration. You can also use the MAXDOP query hint in any statement to override the database-level or server-level MAXDOP setting.

  • SOS_SCHEDULER_YIELD. Another flavor of CPU pressure, and in some ways the opposite of the CXPACKET wait type, is the SOS_SCHEDULER_YIELD wait type. SOS_SCHEDULER_YIELD is an indicator of CPU pressure, indicating that SQL Server had to share time or “yield” to other CPU tasks, which can be normal and expected on busy servers. Whereas CXPACKET is SQL Server complaining about too many threads running in parallel, SOS_SCHEDULER_YIELD is the acknowledgement that there were more runnable tasks than available threads. In either case, first take a strategy of reducing CPU-intensive queries and rescheduling or optimizing CPU-intensive maintenance operations. This is more economical than simply adding CPU capacity.

  • RESOURCE_SEMAPHORE. This wait type is accumulated when a request is waiting on memory to be allocated before it can start. Although this could be an indication of memory pressure caused by insufficient memory available to the SQL Server instance, it is more likely caused by poor query design and poor indexing, resulting in inefficient execution plans. Aside from throwing money at more system memory, a more economical solution is to tune queries and reduce the footprint of memory-intensive operations.

  • PAGELATCH_* and PAGEIOLATCH_*. These two wait types are presented together not because they are similar in nature—they are not—but because they are often confused. To be clear, PAGELATCH has to do with contention over pages in memory, whereas PAGEIOLATCH has to do with contention over pages in the I/O system (on the drive).

    PAGELATCH_* contention deals with pages in memory, which can rise because of overuse of temporary objects in memory, potentially with rapid access to the same temporary objects. This could also be experienced when reading in data from an index in memory, or reading from a heap in memory. PAGELATCH_EX waits can be related to inserts that are happening rapidly and/or page splits related to inserts.

    PAGEIOLATCH_* contention deals with a far more limiting and troubling performance condition: the overuse of reading from the slowest subsystem of all, the physical drives. PAGEIOLATCH_SH deals with reading data from a drive into memory so that the data can be read. Keep in mind that this doesn’t necessarily translate to a request’s rowcount, especially if index or table scans are required in the execution plan. PAGEIOLATCH_EX and _UP are waits associated with reading data from a drive into memory so that the data can be written to.

    A rise in PAGEIOLATCH_* waits could be due to the performance of the storage system, remembering that the performance of drive systems does not always respond linearly to increases in activity. Aside from throwing (a lot of!) money at faster drives, a more economical solution is to modify queries and/or indexes and reduce the footprint of read-intensive operations, especially operations involving index and table scans.

  • WRITELOG. The WRITELOG wait type is likely to appear on any SQL Server instance, including availability group primary and secondary replicas, when there is heavy write activity. The WRITELOG wait is time spent flushing the transaction log to a drive and is due to physical I/O subsystem performance. On systems with heavy writes, this wait type is expected.

  • IO_COMPLETION. Associated with synchronous read and write operations that are not related to row data pages, such as reading log blocks or virtual log file (VLF) information from the transaction log, or reading or writing merge join operator results, spools, and buffers to disk. It is difficult to associate this wait type with a single activity or event, but a spike in IO_COMPLETION could be an indication that these same events are now waiting on the I/O system to complete.

  • WAIT_XTP_RECOVERY. This wait type can occur when a database with memory-optimized tables is in recovery at startup and is expected.

  • XE_FILE_TARGET_TVF and XE_LIVE_TARGET_TVF. These waits are associated with writing extended events sessions to their targets. A sudden spike in these waits would indicate that too much is being captured by an extended events session. Usually these aren’t a problem, however, because the asynchronous nature of extended events has much less server impact than traces or SQL Profiler did.

    You’ll find the new XEvent Profiler in the SQL Server Management Studio Object Explorer window, beneath the SQL Server Agent node. See Figure 13-1 for an example.

    Image

    Figure 13-1 The XEvents Profiler T-SQL live events display in SQL Server Management Studio, similar to the deprecated Profiler T-SQL trace template. Note the new XEvent Profiler menu at the bottom of Object Explorer.

  • MEMORYCLERK_XE. The MEMORYCLERK_XE wait type could spike if you have allowed extended events session targets to consume too much memory. We discuss extended events in the next section, but you should watch out for the maximum buffer size allowed to the ring_buffer session target, among other in-memory targets.
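
To put these wait types in context, here is a minimal sketch of a query against sys.dm_os_wait_stats that aggregates waits accumulated since the last instance restart (or since the wait statistics were last cleared). The list of excluded wait types is only a starting-point assumption; extend it with the benign wait types you observe in your own environment.

SELECT TOP (15)
  ws.wait_type
, wait_time_s = ws.wait_time_ms / 1000.
, pct_of_waits = CONVERT(decimal(5,2), 100. * ws.wait_time_ms / SUM(ws.wait_time_ms) OVER())
, avg_wait_ms = ws.wait_time_ms * 1. / NULLIF(ws.waiting_tasks_count, 0)
FROM sys.dm_os_wait_stats AS ws
WHERE ws.wait_type NOT IN ( --A starting list of commonly benign wait types; adjust for your environment
  N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'CHECKPOINT_QUEUE', N'BROKER_TASK_STOP'
, N'BROKER_TO_FLUSH', N'BROKER_EVENTHANDLER', N'XE_TIMER_EVENT', N'XE_DISPATCHER_WAIT'
, N'REQUEST_FOR_DEADLOCK_SEARCH', N'LOGMGR_QUEUE', N'DIRTY_PAGE_POLL'
, N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION'
, N'SP_SERVER_DIAGNOSTICS_SLEEP', N'QDS_ASYNC_QUEUE', N'WAITFOR')
ORDER BY ws.wait_time_ms DESC;

Because the DMV accumulates values since startup, trend these numbers over time or compare two snapshots rather than reacting to a single reading; the percentage shown is relative to the waits that remain after the exclusions.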

Reintroducing extended events

Extended events were introduced in SQL Server 2008, though without any sort of official user interface within SQL Server Management Studio. It wasn’t until SQL Server 2012 that we got an extended events user interface. Now, with SQL Server Management Studio 17.3 and above, the XEvent Profiler tool is built in to SQL Server Management Studio, which is a real maturation and ease-of-use improvement for administrators and developers. The XEvent Profiler delivers an improved tracing experience that will be familiar to users of the legacy SQL Server Profiler and SQL Server traces.

Extended events are the future of the “live look” at SQL Server activity, replacing deprecated traces. Even though the default extended events sessions are not yet complete replacements for the default system trace (we give an example a bit later), consider extended events for all new activity related to troubleshooting and diagnostic data collection. We understand that the messaging around extended events as the replacement for traces has been around for nearly a decade. The XEvents UI in SQL Server Management Studio is better than ever, so if you haven’t switched to using extended events to do what you used to do with traces, the time is now!

To begin, refer back to Figure 13-1 to see a screenshot of the XEvent Profiler QuickSession functionality new to SQL Server Management Studio. We’ll assume that you’ve not had a lot of experience with creating your own extended events sessions.

Extended events sessions provide a modern, asynchronous, and far more versatile replacement for SQL Server traces, which are, in fact, deprecated. For troubleshooting, debugging, performance tuning, and event gathering, extended events provides a faster and more configurable solution than traces.

Let’s become familiar with some of the most basic terminology for extended events:

  • Sessions. A defined collection of events, actions, predicates, and targets that can be started and stopped; the new equivalent of a “trace.”

  • Events. Selected from an event library, events are what you remember “tracing” with SQL Server Profiler. These are predetermined, detectable operations during runtime. Events that you’ll want to look for include sql_statement_completed and sql_batch_completed, for example, for catching an application’s running of T-SQL code.

    Examples: sql_batch_starting, sql_statement_completed, login, error_reported, sort_warning, table_scan

  • Actions. The headers of the columns of data you’ll see in the extended events data describing an event, such as when the event happened, who and what called the event, its duration, the number of writes and reads, CPU time, and so on. In this way, actions are additional data captured when an event is recorded. Global fields is another name for actions, which can capture additional information for any event, whereas event fields are specific to certain events.

    Examples: sql_text, batch_text, timestamp, session_id, client_hostname

  • Predicates. These are filter conditions created on actions so that you can limit the data you capture. You can filter on any action or field that is returned by an event you have added to the session.

    Examples: database_id > 4, database_name = 'WideWorldImporters', is_system = 0

  • Targets. This is where the data should be sent. You can always watch detailed and “live” extended events data captured asynchronously in memory for any session. However, a session can also have multiple targets, though only one of each target. We dive into the different targets in the section “Understanding the variety of extended events targets” later in this chapter.

SQL Server installs with three extended events sessions ready to view: two that start by default, system_health and telemetry_xevents, and another that starts when needed, AlwaysOn_Health. These sessions provide a basic coverage for system health, though they are not an exact replacement for the system default trace. Do not stop or delete these sessions, which should start automatically.
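
To tie this terminology together, here is a minimal sketch of a session defined in T-SQL: one event, two actions, a predicate, and an event file target. The session name and one-second duration threshold are arbitrary choices for illustration, and because no path is supplied for the file name, the .xel file is written to the instance’s default LOG directory.

CREATE EVENT SESSION [demo_long_batches] ON SERVER
ADD EVENT sqlserver.sql_batch_completed(                   --Event
    ACTION(sqlserver.sql_text, sqlserver.client_hostname)  --Actions (global fields)
    WHERE ([duration] >= (1000000)))                       --Predicate: batches of one second or longer (microseconds)
ADD TARGET package0.event_file(                            --Target: an .xel file on a drive
    SET filename = N'demo_long_batches.xel')
WITH (STARTUP_STATE = OFF);
GO
--Start (and later stop) the session on demand
ALTER EVENT SESSION [demo_long_batches] ON SERVER STATE = START;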

Viewing extended events data

Extended events sessions can generate simultaneous output to multiple destinations, only one of which closely resembles the .trc files of old. As we said earlier, you can always watch detailed and “live” extended events data captured asynchronously in memory for any session through SQL Server Management Studio by right-clicking a session and then selecting Watch Live Data. You’ll see asynchronously delivered detailed data, and you can customize the columns you see, apply filters on the data, and even create groups and on-the-fly aggregations, all by right-clicking in the Live Data window.

The Live Data window, however, isn’t a target. The data isn’t saved anywhere outside of the SQL Server Management Studio window, and you can’t look back at data you missed before launching Watch Live Data. You can create a session without a target, and Watch Live Data is all you’ll get, but maybe that’s all you’ll need for a quick observation.

You can create other targets for a session on the Data Storage page of the New Session dialog box in SQL Server Management Studio. To view data collected by the target, expand the session, right-click the package, and then, on the shortcut menu, click View Target Data, as demonstrated in Figure 13-2.

Image

Figure 13-2 A side-by-side look at the difference between Watch Live Data on an extended events session and View Target Data on an extended events session target.

When viewing target data, you can right-click to re-sort, copy the data to clipboard, and export most of the target data to .csv files for analysis in other software.

Unlike Watch Live Data, View Target Data does not refresh automatically, though for some targets, you can configure SQL Server Management Studio to poll the target automatically by right-clicking the View Target Data window and then clicking Refresh Interval.

The section that follows presents a breakdown of the possible targets, many of which do some of the serious heavy lifting that you might have done previously by writing or exporting SQL trace data to a table and then performing your own aggregations and queries. Remember that you don’t need to pick just one target type to collect data for your session, but that a target can’t collect data prior to its creation.

Understanding the variety of extended events targets

Here is a summary of the extended events targets available to be created. Remember, you can create more than one target per session.

  • Event File target (.xel), which writes the event data to a physical file on a drive, asynchronously. You can then open and analyze it later, much like deprecated trace files, or merge it with other .xel files to assist analysis (in SQL Server Management Studio, click the File menu, click Open, and then click Merge Extended Events Files).

    However, when you view the event file data in SQL Server Management Studio by right-clicking the event file and selecting View Target Data, the data does not refresh live. Data continues to be written to the file behind the scenes while the session is running, so to view the latest data, close the .xel file and open it again.

  • Histogram, which counts the number of times an event has occurred and bucketizes it by an action or event field, storing the data in memory. For example, you could capture a histogram of the sql_statement_completed event, broken down by the number of observed events per client_hostname action, or by the duration field.

    Be sure to provide a number of buckets (or slots, in the T-SQL syntax) that is greater than the number of unique values you expect for the action or field. If you’re bucketizing by a numeric value such as duration, be sure to provide a number of buckets larger than the largest duration you could capture over time. If the histogram runs out of buckets for new values for your action or field, it will not capture data for them!

    Note that you can provide any number of histogram buckets, but the histogram target will round the number up to the nearest power of 2. Thus, if you provide a value of 10 buckets, you’ll see 16 buckets.

  • Pair matching, which is used to match events, such as the start and end of a SQL Server batch execution, and to find occasions when one event in a pair occurs without the other, such as sql_statement_starting and sql_statement_completed. Select a start and an end event from the events you have added to the session.

  • Ring_buffer provides a fast, in-memory First-In, First-Out (FIFO) asynchronous memory buffer to collect rapidly occurring events. Stored in a memory buffer, the data is never written to a drive, allowing for robust data collection without performance overhead. The customizable dataset is provided in XML format and must be queried. Because this data is in-memory, you should be careful how high you configure the Maximum Buffer Memory size, and never set the size to 0 (unlimited).

  • Finally, you can use the Service Broker target to send messages to a target service of a customizable message type.

Although all of the aforementioned targets are high-performing asynchronous targets, there are a pair of synchronous targets: ETW and event counter. Be aware when using synchronous targets that their resource demand might be more noticeable. Following is a brief description of each synchronous target:

  • ETW (Event Tracing for Windows). This is used to gather SQL Server data for later combination with Windows event log data for troubleshooting and debugging Windows applications.

  • Event counter. This simply counts the number of events in an extended events session. You use this to provide data for trending and later aggregate analysis. The resulting dataset has one row per event with a count. This data is stored in memory, so although it’s synchronous, you shouldn’t expect any noticeable performance impact.

Let’s look at querying extended events session data in T-SQL with a couple of practical common examples.

Using extended events to detect deadlocks

We’ve talked about viewing data in SQL Server Management Studio, so let’s review querying extended events data via T-SQL. Let’s query one of the default extended events sessions, system_health, for deadlocks. Back in the dark ages, before SQL Server 2008, it was not possible to see the details of a deadlock after it had occurred. You had to see it coming and turn on a trace flag prior to the deadlock.

With the system_health extended events session, a recent history (a rolling 4-MB buffer) of event data in the ring_buffer target will contain any occurrences of the xml_deadlock_report event.

The T-SQL code sample that follows demonstrates the retrieval of the ring_buffer target as XML, and the use of XQuery syntax:

WITH cteDeadlocks ([Deadlock_XML]) AS (
  --Query the ring_buffer target of the system_health session
  SELECT [Deadlock_XML] = CAST(target_data AS XML)
  FROM sys.dm_xe_sessions AS xs
  INNER JOIN sys.dm_xe_session_targets AS xst
  ON xs.address = xst.event_session_address
  WHERE xs.name = 'system_health'
  AND xst.target_name = 'ring_buffer'
)
SELECT
  --View as XML for detail; save this output as an .xdl file and reopen it in SSMS for a visual graph
  Deadlock_XML = x.Graph.query('(event/data/value/deadlock)[1]')
  --Date the last batch in the first process started, only an approximation of the time of the deadlock
, Deadlock_When = x.Graph.value('(event/data/value/deadlock/process-list/process/@lastbatchstarted)[1]', 'datetime2(3)')
  --Current database of the first listed process
, DB = DB_NAME(x.Graph.value('(event/data/value/deadlock/process-list/process/@currentdb)[1]', 'int'))
FROM (
  SELECT Graph.query('.') AS Graph
  FROM cteDeadlocks AS c
  CROSS APPLY c.[Deadlock_XML].nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS Deadlock_Report(Graph)
) AS x
ORDER BY Deadlock_When DESC;

This example returns one row per captured xml_deadlock_report event and includes an XML document, which in SQL Server Management Studio Grid results will appear to be a blue hyperlink. Click the hyperlink to open the XML document, which will contain the complete detail of all elements of the deadlock. If you’d like to see a Deadlock Graph, save this file as an .xdl file, and then open it in SQL Server Management Studio.

You can download the previous script, CH12_XEvents.sql, and other accompanying sample scripts for this book from https://aka.ms/SQLServ2017Admin/downloads.

Using extended events to detect autogrowth events

The SQL Server default trace captures historical database data and log file autogrowth events, but the default extended events sessions shipped with SQL Server do not. The extended events that capture autogrowth events are database_file_size_change and databases_log_file_size_changed. Both events capture autogrowths as well as manual file growths run by ALTER DATABASE ... MODIFY FILE statements, and include an event field called is_automatic to differentiate between the two. Additionally, you can identify the query statement sql_text that prompted the autogrowth event.

Following is a sample T-SQL script to create a startup session that captures autogrowth events to an .xel event file and also a histogram target that counts the number of autogrowth instances per database:

CREATE EVENT SESSION [autogrowths] ON SERVER
ADD EVENT sqlserver.database_file_size_change(
    ACTION(package0.collect_system_time,sqlserver.database_id
,sqlserver.database_name,sqlserver.sql_text)),
ADD EVENT sqlserver.databases_log_file_size_changed(
    ACTION(package0.collect_system_time,sqlserver.database_id
,sqlserver.database_name,sqlserver.sql_text))
ADD TARGET package0.event_file(
--.xel file target
SET filename=N'F:\DATA\autogrowths.xel'),
ADD TARGET package0.histogram(
--Histogram target, counting events per database_name
SET filtering_event_name=N'sqlserver.database_file_size_change'
,source=N'database_name',source_type=(0))
--Start session at server startup
WITH (STARTUP_STATE=ON);
GO
--Start the session now
 ALTER EVENT SESSION [autogrowths]  
ON SERVER  STATE = START;
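
While the session is running, you can also read the histogram target’s XML directly via T-SQL instead of using View Target Data. The following sketch assumes the [autogrowths] session just created; the HistogramTarget/Slot element names come from the histogram target’s XML output, and the same pattern applies to the page_splits session in the next section.

SELECT
  Database_Name = s.slot.value('(value)[1]', 'sysname')
, Autogrowth_Count = s.slot.value('(@count)[1]', 'int')
FROM (
  SELECT target_data = CAST(xst.target_data AS XML)
  FROM sys.dm_xe_sessions AS xs
  INNER JOIN sys.dm_xe_session_targets AS xst
  ON xs.address = xst.event_session_address
  WHERE xs.name = N'autogrowths'
  AND xst.target_name = N'histogram'
) AS h
CROSS APPLY h.target_data.nodes('HistogramTarget/Slot') AS s(slot)
ORDER BY Autogrowth_Count DESC;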

Using extended events to detect page splits

As we discussed earlier in this chapter, detecting page splits can be useful. You might choose to monitor page_splits when load testing a table design with its intended workload or when finding insert statements that cause the most fragmentation.

The following sample T-SQL script creates a startup session that captures page split events to an .xel event file and also a histogram target that counts the number of page splits per database:

CREATE EVENT SESSION [page_splits] ON SERVER
ADD EVENT sqlserver.page_split(
    ACTION(sqlserver.database_name,sqlserver.sql_text))
ADD TARGET package0.event_file(
SET filename=N'page_splits',max_file_size=(100)),
ADD TARGET package0.histogram(
SET filtering_event_name=N'sqlserver.page_split'
,source=N'database_id',source_type=(0))

--Start session at server startup
WITH (STARTUP_STATE=ON);
GO
--Start the session now
 ALTER EVENT SESSION [page_splits]  
ON SERVER  STATE = START;

Securing extended events

To access extended events, a developer or analyst needs the ALTER ANY EVENT SESSION permission. This is different from the ALTER TRACE permission needed for traces. This permission grants that person access to create extended events sessions by using T-SQL commands, but it will not grant access to view server metadata in the New Extended Events Session Wizard in SQL Server Management Studio. For that, the person needs one further commonly granted developer permission, VIEW SERVER STATE. In Azure SQL Database, extended events have the same capability, but for developers to view extended events sessions, you must grant them the ownership-level CONTROL permission on the database. However, we do not recommend this for developers or non-administrators in production environments.
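
As a sketch, the two server-level grants look like the following. The login name is hypothetical; substitute your own principal, and run the statements in the master database context.

USE master;
GO
--[domain\devlogin] is a hypothetical login name; replace it with your own principal
GRANT ALTER ANY EVENT SESSION TO [domain\devlogin]; --Create and manage sessions via T-SQL
GRANT VIEW SERVER STATE TO [domain\devlogin];       --View server metadata in the wizard and the XE DMVs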

There are certain sensitive events that you cannot capture with a trace or extended event session; for example, the T-SQL statement CREATE LOGIN for a SQL-authenticated login.

Capturing Windows performance metrics with DMVs and data collectors

The Performance Monitor (perfmon.exe) application has been used for years by server administrators to visually track and collect performance of server resources, application memory usage, disk response times, and so on. In addition to the live Performance Monitor graph, you can also configure Data Collector Sets to gather the same Performance Monitor metrics over time.

SQL Server has a large number of metrics captured within, as well. Although we have neither the scope nor space to investigate and explain each one in this book, here is a sampling of some performance metrics you might find valuable when reviewing the health and performance of your SQL Server.

These metrics are at the Windows server level or SQL Server instance level, so it is not possible to get granular data for individual databases, workloads, or queries; however, identifying performance with isolated workloads in near-production systems is possible. Like aggregate wait statistics, there is significant value in trending these Performance Monitor metrics on server workloads, monitoring peak behavior metrics, and for immediate troubleshooting and problem diagnosis.

Querying performance metrics by using DMVs

Beyond Performance Monitor, we’ve already seen in this chapter a DMV that exposes most of the performance metrics within SQL Server, sys.dm_os_performance_counters. There are some advantages to this DMV in that you can combine it with other DMVs that report on system resource activity (check out sys.dm_os_sys_info, for example), and you can fine-tune the query for ease of monitoring and custom data collecting. However, sys.dm_os_performance_counters does not currently have access to metrics outside of the SQL Server instance categories, even the most basic Windows metrics such as “% Processor Time.”

The following straightforward code sample uses sys.dm_os_performance_counters to return the instance’s current target server memory, total server memory, and page life expectancy:

SELECT InstanceName = @@SERVERNAME
, Target_Server_Mem_GB = max(CASE counter_name
WHEN 'Target Server Memory (KB)' THEN convert(decimal(19,3), cntr_value/1024./1024.)
END)
, Total_Server_Mem_GB = max(CASE counter_name
WHEN  'Total Server Memory (KB)' THEN convert(decimal(19,3), cntr_value/1024./1024.)
END)
, PLE_s = MAX(CASE counter_name WHEN 'Page life expectancy'  THEN cntr_value END)
FROM sys.dm_os_performance_counters;

Some queries against sys.dm_os_performance_counters are not as straightforward. As an example, although Performance Monitor returns Buffer Cache Hit Ratio as a single value, querying this same memory metric via the DMV requires creating the ratio from two metrics. This code sample divides two metrics to provide the Buffer Cache Hit Ratio:

SELECT Buffer_Cache_Hit_Ratio = 100 *
(SELECT cntr_value = convert(decimal (9,1), cntr_value)
FROM sys.dm_os_performance_counters as pc
WHERE pc.COUNTER_NAME = 'Buffer cache hit ratio'
AND pc.OBJECT_NAME like '%:Buffer Manager%')
/
(SELECT cntr_value = convert(decimal (9,1), cntr_value)
FROM sys.dm_os_performance_counters as pc
WHERE pc.COUNTER_NAME = 'Buffer cache hit ratio base'
AND pc.OBJECT_NAME like '%:Buffer Manager%');

Finally, some counters returned by sys.dm_os_performance_counters are continually incrementing integers. Let’s return to our example of finding page splits. The counter_name “Page Splits/sec” is misleading when accessed via the DMV, because it is in fact an incrementing number. To calculate the rate of page splits per second, we need two samples. This strategy is appropriate only for single-value counters for the entire server or instance. For counters that return one value per database, you would need a temp table in order to calculate the rate for each database between the two samples.

DECLARE @page_splits_Start_ms bigint, @page_splits_Start bigint
, @page_splits_End_ms bigint, @page_splits_End bigint
SELECT @page_splits_Start_ms = ms_ticks
, @page_splits_Start = cntr_value
FROM sys.dm_os_sys_info
CROSS APPLY sys.dm_os_performance_counters
WHERE counter_name ='Page Splits/sec'
AND object_name LIKE '%SQL%Access Methods%'

WAITFOR DELAY '00:00:10' --Adjust sample duration between measurements, 10s sample

SELECT @page_splits_End_ms =  MAX(ms_ticks), @page_splits_End = MAX(cntr_value)
FROM sys.dm_os_sys_info
CROSS APPLY sys.dm_os_performance_counters
WHERE counter_name ='Page Splits/sec'
AND object_name LIKE '%SQL%Access Methods%'
SELECT Page_Splits_per_s = (@page_splits_End - @page_splits_Start) * 1000.
/ NULLIF(@page_splits_End_ms - @page_splits_Start_ms, 0); --ms_ticks is in milliseconds, so scale to a per-second rate

However, you can gain access to some Windows metrics via the DMV sys.dm_os_ring_buffers, including metrics on CPU utilization and memory. This internal DMV returns thousands of XML documents, generated every second, loaded with information on SQL exceptions, memory, schedulers, connectivity, and more.

In the code sample that follows, we pull the SQL Server instance’s current CPU utilization percentage and the current server idle CPU percentage. The remaining CPU percentage can be chalked up to other applications or services running on the Windows server, including other SQL Server instances.

SELECT
  [Time] = DATEADD(ms, -1 * (si.cpu_ticks / (si.cpu_ticks / si.ms_ticks) - x.[timestamp]), SYSDATETIMEOFFSET())
, CPU_SQL_pct = bufferxml.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int')
, CPU_Idle_pct = bufferxml.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int')
FROM (SELECT timestamp, CONVERT(xml, record) AS bufferxml
      FROM sys.dm_os_ring_buffers
      WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR') AS x
CROSS APPLY sys.dm_os_sys_info AS si
ORDER BY [Time] DESC;

The built-in SQL Server ring_buffer data collector isn’t Performance Monitor and doesn’t resemble it at all, though it does provide data as certain events occur. In the case of CPU utilization in the Scheduler_Monitor ring buffer, we get fresh data once per second. Other data streams aren’t as constant. In the case of SQL Server memory utilization, for example, we get snapshots of key memory metrics only when one of four memory events is initiated: high or low physical memory, low virtual memory, or a steady event that indicates that the previous memory pressure event has been relieved. The frequency of these memory events is inconsistent and they might not happen for weeks at a time, but they could still be valuable for troubleshooting memory conditions on the SQL Server. Let’s look at the code:

SELECT
  [Time] = DATEADD(ms, -1 * (si.cpu_ticks / (si.cpu_ticks / si.ms_ticks) - x.[timestamp]), SYSDATETIMEOFFSET())
, MemoryEvent = bufferxml.value('(./Record/ResourceMonitor/Notification)[1]', 'varchar(64)')
, Target_Server_Mem_GB = CONVERT(decimal(19,3),
    bufferxml.value('(./Record/MemoryNode/TargetMemory)[1]', 'bigint')/1024./1024.)
, Physical_Server_Mem_GB = CONVERT(decimal(19,3),
    bufferxml.value('(./Record/MemoryRecord/TotalPhysicalMemory)[1]', 'bigint')/1024./1024.)
, Committed_Mem_GB = CONVERT(decimal(19,3),
    bufferxml.value('(./Record/MemoryNode/CommittedMemory)[1]', 'bigint')/1024./1024.)
, Shared_Mem_GB = CONVERT(decimal(19,3),
    bufferxml.value('(./Record/MemoryNode/SharedMemory)[1]', 'bigint')/1024./1024.)
, MemoryUtilization = bufferxml.value('(./Record/MemoryRecord/MemoryUtilization)[1]', 'bigint')
, Available_Server_Mem_GB = CONVERT(decimal(19,3),
    bufferxml.value('(./Record/MemoryRecord/AvailablePhysicalMemory)[1]', 'bigint')/1024./1024.)
FROM (SELECT timestamp, CONVERT(xml, record) AS bufferxml
      FROM sys.dm_os_ring_buffers
      WHERE ring_buffer_type = N'RING_BUFFER_RESOURCE_MONITOR') AS x
CROSS APPLY sys.dm_os_sys_info AS si
ORDER BY [Time] DESC;

Inside OUT

The ring buffer is capturing all of this data to memory constantly. Doesn’t that incur some server overhead?

There is some background overhead for ring_buffer data collection. That can’t be denied. SQL Server instances always have this diagnostic activity present, constantly and by design, so the ring_buffer won’t be at fault for sudden or even gradual performance degradation.

Only on resource-limited servers and/or instances with extremely high-frequency transaction activity might it be appropriate to turn off the ring_buffer by using trace flags. This can result in a small performance gain, but you should test and measure it against the loss of diagnostic data on which your own administrative queries or third-party products rely. For more information on using trace flags to turn off ring buffer data collection, visit https://support.microsoft.com/help/920093/tuning-options-for-sql-server-when-running-in-high-performance-workloa [sic].

Querying performance metrics by using Performance Monitor

To get a complete and graphical picture of server resource utilization, a server resource tool is necessary. Performance Monitor is more than just a pretty graph; it is a suite of data collection tools that can persist outside of your user profile.

You can configure the live Performance Monitor graph, available in the Monitoring Tools folder, to show a live picture of server performance. To do so, right-click the (mostly empty) grid to access properties, add counters, clear the graph, and so on. In the Properties dialog box, under General, you can configure the sample rate and duration of the graph. You can display up to 1,000 sample points on the graph live. This can be 1,000 one-second sample points for a total of 16 minutes and 40 seconds, or more time if you continue to decrease the sample frequency. For example, you can display 1,000 five-second sample points for more than 83 minutes of duration in the graph.

To view data collected by Data Collectors, stop the data collector and restart it. In the Reports folder, in the User Defined folder, you’ll see a new report that contains the graph that the Data Collector created. Figure 13-3 shows that more than 15 days of performance data was collected in the Data Collector, which we’re viewing in the Memory folder, selecting the most recent report that was generated when we stopped the Memory Data Collector Set.

Image

Figure 13-3 The Windows Performance Monitor Application. Instead of showing live data from the Monitoring Tools-Performance Monitor screen, we’re showing 15 days’ worth of data recorded by a User Defined Data Collector Set, which generated a User Defined Report.

Monitoring key performance metrics

Here are some Performance Monitor metrics to take a look at when gauging the health and performance of your SQL Server. Although we don’t have the space in this book to provide a deep dive into each metric, its causes, and indicators, you should take time to investigate and research metrics that appear out of line with the guidelines provided here.

We don’t provide many hard numbers in this section; for example, “Metric X should always be lower than Y.” You should trend, measure metrics at peak activity, and investigate how metrics respond to server, query, or configuration changes. What might be normal for an instance with a read-heavy workload might be problematic for an instance with a high-volume write workload, and vice versa.

We do explain how to find the metric in both Performance Monitor’s Add Counters menu and the DMV sys.dm_os_performance_counters, when available.

Average Disk seconds per Read or Write

Performance Monitor: PhysicalDisk:Avg. Disk sec/Read and PhysicalDisk:Avg. Disk sec/Write

DMV: Not Available

View this metric on each volume. The “_Total” metric doesn’t have any value here; you should look at the individual volumes on which SQL Server files are present. This metric has the clearest guidance of any with respect to what is acceptable or not for a server. Try to measure this value during your busiest workload and also during backups. You want to see the average disk seconds per read and write operation (considering that a single query could have thousands or millions of operations) below 20 ms, or .02 seconds. Below 10 ms is optimal and very achievable with modern storage systems. (This is the rare case for which we actually have hard and fast numbers specified by Microsoft to rely on.) Seeing this value spike to very high values (such as .1 second or 100 ms) isn’t a major cause for concern, but if you see these metrics sustaining an average higher than .02 seconds during peak activity, this is a fairly clear indication that the physical I/O subsystem is being stressed beyond its capacity to keep up. Low, healthy measurements for this number don’t provide any insight into the quality or efficiency of queries and execution plans, only the response from the disk subsystem. The Avg. Disk sec/Transfer counter is simply a combination of both read and write activity, unrelated to Avg. Disk Transfers/sec. (Confusing, we know!)
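
Although this counter is not exposed through sys.dm_os_performance_counters, you can get a rough per-file equivalent from sys.dm_io_virtual_file_stats, as in the sketch below. Keep in mind that these figures are cumulative since instance startup rather than a moving average like the Performance Monitor counter, so a long-past period of slow storage can linger in the averages.

SELECT
  DB = DB_NAME(vfs.database_id)
, mf.physical_name
, avg_read_ms  = vfs.io_stall_read_ms  * 1. / NULLIF(vfs.num_of_reads, 0)
, avg_write_ms = vfs.io_stall_write_ms * 1. / NULLIF(vfs.num_of_writes, 0)
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
INNER JOIN sys.master_files AS mf
ON vfs.database_id = mf.database_id
AND vfs.file_id = mf.file_id
ORDER BY avg_read_ms DESC;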

Page Life Expectancy (PLE)

Performance Monitor: MSSQL$Instance:Buffer Manager/Page Life Expectancy (s)

DMV: MSSQL$Instance:Buffer Manager/Page life expectancy

PLE is a measure of time that indicates the age of data in memory. In general, you want pages of data in memory to grow to a ripe old age—it means that there is ample memory available to SQL Server to store data to serve reads without going back to a drive. A dated metric of 300 seconds might only be appropriate for servers with less than 64 GB of memory. With more and more memory available to a SQL instance, the older the average data page should be as data remains cached longer, so 300 seconds could be appropriate for a server with 4 GB of memory, but far too low for a server with 64 GB of memory. PLE is one of the most direct indicators of memory pressure, though it doesn’t provide a complete picture of memory utilization in SQL Server.

Buffer Cache Hit Ratio (BCHR)

Performance Monitor: MSSQL$Instance:Buffer Manager/Buffer Cache Hit Ratio

DMV: MSSQL$Instance:Buffer Manager/Buffer cache hit ratio divided by MSSQL$Instance:Buffer Manager/Buffer cache hit ratio base

A value from 0 to 100, where 100 means that recent traffic has been served entirely out of the buffer cache, and 0 means data has been served entirely from a drive. This is not a complete picture of memory pressure, and high values (>90%) shouldn’t be taken as a definitive indication that there is no memory pressure. Low values are a clear indicator that memory is not sufficient for the current workload, but even this can be misleading: backups and index maintenance can cause BCHR to drop precipitously, to values near zero. Although this metric should be only a part of your puzzle, values greater than 80% are desirable.

Page Reads

Performance Monitor: MSSQL$Instance:Buffer Manager/Page reads/sec

DMV: MSSQL$Instance:Buffer Manager/Page reads/sec

The title is a bit misleading: these aren’t page reads out of the buffer; rather, they are reads of physical pages from the drive, which is slower than data pages coming out of memory. You should make the effort to lower this number by optimizing queries and indexes, improving the efficiency of cache storage, and, of course, as a last resort, increasing the amount of server memory. Although every workload is different, a value less than 90 is a broad guideline. High numbers indicate inefficient query and index design in read-write workloads, or memory constraints in read-heavy workloads.

Memory Pages

Performance Monitor: Memory:Pages/sec

DMV: Not available

Similar to Buffer Manager/Page reads/sec, this is a way to measure data coming from a drive as opposed to coming out of memory. It is a measure of pages pulled from a drive into memory, which will be high after SQL Server startup. Although every workload is different, a value less than 50 is a broad guideline. Sustained high or climbing levels during typical production usage indicate inefficient query and index design in read-write workloads, or memory constraints in read-heavy workloads. Spikes during database backup and restore operations, bulk copies, and data extracts are expected.

Batch Requests

Performance Monitor: MSSQL$Instance:SQL Statistics/Batch Requests/sec

DMV: MSSQL$Instance:SQL Statistics/Batch Requests/sec

A measure of aggregate SQL Server user activity. Higher sustained numbers are good; they mean your SQL instance is sustaining more traffic. Should this number trend downward during peak business hours, your SQL Server instance is being outstripped by increasing user activity.

Page Faults

Performance Monitor: Memory:Page Faults/sec

DMV: Not Available

A memory page fault occurs when an application seeks a data page in memory, only to find it isn’t there because of memory churn. A soft page fault indicates the page was moved or otherwise unavailable; a hard page fault indicates the data page was not in memory and must be retrieved from the drive. The Page Faults/sec metric captures both. Page faults are a symptom, the cause being memory churn, so you might see an accompanying drop in the Page Life Expectancy. Spikes in Page Faults, or an upward trend, indicate the amount of server memory was insufficient to serve requests from all applications, not just SQL Server.

Available Memory

Performance Monitor: Memory:Available Bytes, Memory:Available KBytes, or Memory:Available MBytes

DMV: Available in sys.dm_os_ring_buffers WHERE ring_buffer_type = RING_BUFFER_RESOURCE_MONITOR (see previous section)

Available Memory is server memory currently unallocated to any application. Server memory above and beyond the SQL Server instance(s)’ total MAX_SERVER_MEMORY settings, minus memory in use by other SQL Server features and services or other applications, is available. This will roughly match what shows as available memory in the Windows Task Manager.

Total Server Memory

Performance Monitor: MSSQL$Instance:Memory Manager/Total Server Memory (KB)

DMV: MSSQL$Instance:Memory Manager/Total Server Memory (KB)

This is the actual amount of memory that SQL Server is using. It is often contrasted with the next metric (Target Server Memory). This number might be far larger than what Windows Task Manager shows allocated to the SQL Server Windows NT – 64 Bit background application, which shows only a portion of the memory that sqlserver.exe controls. The Total Server Memory metric is correct.

Target Server Memory

Performance Monitor: MSSQL$Instance:Memory Manager/Target Server Memory (KB)

DMV: MSSQL$Instance:Memory Manager/Target Server Memory (KB)

This is the amount of memory to which SQL Server wants to have access and is currently working toward consuming. If the difference between Target Server Memory and Total Server Memory is larger than the value of Available Memory, SQL Server wants more memory than the Windows Server can currently provide. SQL Server will eventually consume all memory available to it under the MAX_SERVER_MEMORY setting, but it might take time.

Protecting important workloads by using Resource Governor

Resource Governor is an Enterprise edition–only feature that you can use to identify connections and limit the resources they can consume.

You can identify connections from virtually any connection property, including the login name, hostname, application name, and so on. After you’ve identified them, you can limit properties at the individual session level or in a pool of resources. You can override the MAXDOP setting for these sessions; or lower their priority; or cap the CPU, memory, or drive I/O that individual sessions can consume.

For example, you can limit all read-heavy queries coming from a SQL Server Reporting Services server, or long-running reports coming from a third-party reporting application, or dashboard/search queries that use a different application name or login. Then, you can limit these queries as a set, capping them to 25% of the process, disk I/O, or SQL Server memory. SQL Server will enforce these limitations and potentially slow down the identified queries, but meanwhile, the important read-write workloads continue to operate with the remaining 75% of the server’s resources.

Be aware that using Resource Governor to limit long-running SELECT statements, for example, does not alleviate concurrency issues caused by locking. See Chapter 9 for strategies to overcome concurrency issues, and keep in mind that using the NOLOCK table hint or the READ UNCOMMITTED isolation level is a risky, clumsy strategy to solving concurrency issues in your applications.

When turned on, Resource Governor is transparent to connecting applications. No code changes are required of the queries to implement Resource Governor.

In Enterprise edition, by default, sessions are split between group_id = 1, named “internal,” for system queries internal to the Database Engine, and group_id = 2, named “default,” for all other user queries. In the Standard edition of SQL Server, Resource Governor is not provided, and the only value for group_id is 1. You can find the current groups in the view sys.resource_governor_workload_groups.

Configuring the Resource Governor classifier function

Before turning on Resource Governor, you must create a classifier function in the master database that operates at the creation of every new session.

You can write the classifier function however you like, but keep in mind that it will be run for each new connection, and so it should be as efficient and simple as possible. Do not query other user resources such as tables in a user database, because this can cause a noticeable delay in connection creation.

The classifier function must return a sysname data type value, which contains the name of the group to which a new connection is to be assigned. A group is simply a container of sessions. If no group name is assigned, the connection is placed in the default group. Remember, it is the default group that you want to protect; it contains “all other” sessions, including high business-value connections that perform application-critical functions, writes, and so on.

The sample code that follows defines a classifier function that returns GovGroupReports for all queries coming from two known-fictional reporting servers. You can see in the comments other sample connection identifying functions, with many more options possible.

CREATE FUNCTION dbo.fnCLASSIFIER() RETURNS sysname
WITH SCHEMABINDING AS
BEGIN
    -- Note that any request that does not get classified goes into the 'default' group.
 DECLARE @grp_name sysname
IF (
--Use built-in functions for connection string properties
     HOST_NAME() IN ('reportserver1','reportserver2')
 --OR APP_NAME() IN ('some application')
 --AND SUSER_SNAME() IN ('whateveruser')
)      
  BEGIN
   SET @grp_name = 'GovGroupReports';
  END
RETURN @grp_name
END;

After creating the function (which can have any name), you must register it as the classifier function for this instance’s Resource Governor feature. The feature is not active yet; you still have the following setup to do before turning it on:

-- Register the classifier function with Resource Governor
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION= dbo.fnCLASSIFIER);

Configuring Resource Governor pools and groups

Configuring pools (for many sessions to share) and groups (for individual sessions) is the next step. You should take an iterative, gradual approach to configuring the Governor, and avoid making large changes or large initial limitations to the affected groups.

If you have a Developer edition preproduction environment, you should consider testing the impact of Resource Governor on workloads at a realistic production scale before applying it in production.

The sample code that follows can be an instructional template for creating an initial pool and group. If you seek to divide your sessions up further, multiple groups can belong to the same pool, and multiple pools can be limited differently. Commented-out examples of other common uses for Resource Governor are included.

In this example, we create a pool that limits all covered sessions to 50% of the instance’s memory, and a group that limits any single query to 30% of the instance’s memory, and forces the sessions into MAXDOP = 1, overriding any server, database, or query-level setting:

CREATE RESOURCE POOL GovPoolMAXDOP1;
CREATE WORKLOAD GROUP GovGroupReports;
GO
ALTER RESOURCE POOL GovPoolMAXDOP1
WITH (-- MIN_CPU_PERCENT = value
     --,MAX_CPU_PERCENT = value
     --,MIN_MEMORY_PERCENT = value
     MAX_MEMORY_PERCENT = 50
);
GO
ALTER WORKLOAD GROUP GovGroupReports
WITH (
     --IMPORTANCE = { LOW | MEDIUM | HIGH }
     --,REQUEST_MAX_CPU_TIME_SEC = value
     --,REQUEST_MEMORY_GRANT_TIMEOUT_SEC = value
     --,GROUP_MAX_REQUESTS = value
     REQUEST_MAX_MEMORY_GRANT_PERCENT = 30
   , MAX_DOP = 1
)
USING GovPoolMAXDOP1;

Image For complete documentation of the possible ways to limit groups and pools, go to https://docs.microsoft.com/sql/t-sql/statements/alter-workload-group-transact-sql and https://docs.microsoft.com/sql/t-sql/statements/alter-resource-pool-transact-sql

After you have configured the classifier function, groups, and pools, you can turn on Resource Governor by using the query that follows, placing its functionality into memory. New sessions will begin to be sorted by the classifier function and will appear in their groups. Issue the same RECONFIGURE command whenever you need to apply subsequent changes:

-- Start or Reconfigure Resource Governor
ALTER RESOURCE GOVERNOR RECONFIGURE;

Or, you can turn it off:

--Disable Resource Governor
ALTER RESOURCE GOVERNOR DISABLE;

After you turn it off, existing sessions will continue to operate under the rules of Resource Governor, but new queries will not be sorted into groups.

After you configure it and turn it on, you can query the status of Resource Governor and the name of the classifier function by using the following sample script:

SELECT rgc.is_enabled, o.name
FROM sys.resource_governor_configuration AS rgc
INNER JOIN master.sys.objects AS o
ON rgc.classifier_function_id = o.object_id
INNER JOIN master.sys.schemas AS s
ON o.schema_id = s.schema_id;

Monitoring pools and groups

The group_id columns in both sys.dm_exec_requests and sys.dm_exec_sessions indicate the Resource Governor group to which the request or session belongs. Groups are members of pools. You can query the groups and pools via the DMVs sys.dm_resource_governor_workload_groups and sys.dm_resource_governor_resource_pools. Use the following sample query to observe the number of sessions that have been sorted into groups, noting that group_id = 1 is the internal group, group_id = 2 is the default group, and other groups are defined by you, the administrator:

SELECT
  rgg.group_id, rgp.pool_id
, Pool_Name = rgp.name, Group_Name = rgg.name
, session_count= ISNULL(count(s.session_id) ,0)
FROM sys.dm_resource_governor_workload_groups AS rgg
LEFT OUTER JOIN sys.dm_resource_governor_resource_pools AS rgp
ON rgg.pool_id = rgp.pool_id
LEFT OUTER JOIN sys.dm_exec_sessions AS s
ON s.group_id = rgg.group_id
GROUP BY rgg.group_id, rgp.pool_id, rgg.name, rgp.name
ORDER BY rgg.name, rgp.name;

Image You can reference a (dated) Resource Governor troubleshooting guide for a list of error numbers and their meanings that might be raised by Resource Governor at https://technet.microsoft.com/library/cc627395%28v=sql.105%29.aspx.

Understanding the new servicing model

Database administrators and CIOs alike will need to adjust their normal comfort levels with new SQL Server editions. No longer can IT leadership say, “Wait until the first service pack,” before moving because there are no more service packs!

Updated servicing model

Microsoft has adopted a new model for its product life cycles. In the past, this servicing model included Service Packs (SPs), Cumulative Updates (CUs), and General Distribution Releases (GDRs). Beginning with SQL Server 2017, the following changes are in effect:

  • SPs will no longer be released.

  • CUs will be released every month for the first twelve months of general release, and then quarterly for the remaining four years of the five-year duration of the Mainstream Support period.

  • CUs might now include localized content and will be delivered on a standardized schedule: as of this writing, this is the week of the third Tuesday of the month.

  • Unlike in the past, GDR patches (which contain security-only fixes) will not have their own path for updates between CUs.

Microsoft has maintained in recent years that there is no need to wait for an SP, because the General Availability (GA) release has been extensively tested by both internal Microsoft QA and external preview customers. In fact, Microsoft insists that the Community Technology Preview (CTP) and Release Candidate (RC) versions of SQL Server 2016 and 2017, the steps before Release to Market (RTM), were thoroughly tested in production with selected customers. For those dealing with clients or leadership who are stubborn or reactionary, a possible alternative under the new model could be to target an arbitrary Cumulative Update, such as CU2.
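
If you need to confirm exactly which CU an instance is running, a quick sketch follows. On recent builds, SERVERPROPERTY('ProductUpdateLevel') and SERVERPROPERTY('ProductUpdateReference') report the cumulative update and its KB article; on older builds that predate these properties, they simply return NULL.

SELECT
  ProductVersion         = SERVERPROPERTY('ProductVersion')          --Build number
, ProductLevel           = SERVERPROPERTY('ProductLevel')            --RTM (no more service packs)
, ProductUpdateLevel     = SERVERPROPERTY('ProductUpdateLevel')      --Cumulative update, for example CU3
, ProductUpdateReference = SERVERPROPERTY('ProductUpdateReference'); --KB article for that update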

Product support life cycle

If you’re planning for long-term use of a particular version of SQL Server, you should keep in mind the following life cycle:

  • 0 to 5 Years: Mainstream Support Period

    Security and functional issues are addressed through CUs. Security issues might also be addressed through security-only GDRs.

  • 6 to 10 Years: Extended Support

    Only critical functional issues will be addressed. Security issues might still be addressed through GDRs.

  • 11 to 16 Years: Premium Assurance

    The Extended Support level can be lengthened with optional payments.
