Using COMPRESS

The COMPRESS function is a scalar function and compresses the input variable, column, or expression using the GZIP algorithm. The function accepts an expression, which can be either string or binary, but again, the deprecated data types text, ntext, and image are not supported.

The return type of the function is VARBINARY(MAX).

Use this function with wide text columns, especially when you do not plan to query them often. For large strings, the compression rate can be significant, particularly when the original string is XML. Here is an example of significant compression. The example uses the output of the system Extended Event session system_health to check the compression rate when you use the COMPRESS function for the target_data column. Here is the code:

SELECT 
  target_name, 
  DATALENGTH(xet.target_data) AS original_size, 
  DATALENGTH(COMPRESS(xet.target_data)) AS compressed_size, 
  CAST((DATALENGTH(xet.target_data) - DATALENGTH(COMPRESS(xet.target_data)))*100.0/DATALENGTH(xet.target_data) AS DECIMAL(5,2)) AS compression_rate_in_percent 
FROM sys.dm_xe_session_targets xet   
INNER JOIN sys.dm_xe_sessions xe ON xe.address = xet.event_session_address   
WHERE xe.name = 'system_health'; 

The following is the output generated by this query on my test server. You might get a different output, but similar results:

target_name  original_size   compressed_size  compression_rate_in_pct
------------ --------------  ---------------- -----------------------
ring_buffer  8386188         349846           95.83
event_file   410             235              42.68

You can see a quite impressive compression rate of 96%. On the other hand, the compressed representation of a short string can be even longer than the original. Consider the following example, where a short string with a size of 30 bytes is used as input:

DECLARE @input AS NVARCHAR(15) = N'SQL Server 2017'; 
SELECT @input AS input, DATALENGTH(@input) AS input_size, COMPRESS(@input) AS compressed, DATALENGTH(COMPRESS(@input)) AS comp_size;

The result of this query (with abbreviated compressed value) is:

input           input_size  compressed                        comp_size
------------    ----------- --------------------------------  -------
SQL Server 2017 30 0x1F8B08000000000004000B660864F061... 46

The COMPRESS function is not a replacement for row or page compression. It is invoked for a single expression and additional optimizations are not possible (exactly the same string tokens exist in another row or column).

To compare compression rates for the Row and Page compressions on one side and the compression by the COMPRESS function on the other side, I have created four clone tables of the system table sys.messages. I have left one uncompressed and have compressed the other three with ROW, PAGE, and COMPRESS functions, respectively. The complete code for creating and populating tables, as well as for comparing compression methods, can be found in the code accompanying this book. The following screenshot displays the result of this comparison:

Comparing compression rates between Row, Page, and Compress

You can see that (slightly) more compression can be achieved using Row and Page compression, but a notable compression is also obtained using the COMPRESS function.

Use this function when you want to save some storage space or to compress data that needs to be archived or logged and is thus rarely queried. Since it uses a common and well-known GZIP algorithm, you can compress/decompress data not only in SQL Server but also in client applications and tools communicating with SQL Server.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.22.61.30