IBM Real-time Compression
This chapter highlights the preferred practices for IBM Real-time Compression with IBM Spectrum Virtualize software installed on IBM SAN Volume Controller, the IBM Storwize family, and IBM FlashSystem V9000. The main goal is to provide compression users with guidelines and factors to consider so that they can achieve the best performance results and benefit from the capacity savings that the Real-time Compression technology offers.
This chapter assumes that the reader is already familiar with IBM Spectrum Virtualize Real-time Compression technology. Information about this technology is available from many sources, including the following publications:
IBM Real-time Compression in IBM SAN Volume Controller and IBM Storwize V7000, REDP-4859
Implementing IBM Real-time Compression in SAN Volume Controller and IBM Storwize V7000, TIPS1083
11.1 Evaluate compression savings using Comprestimator
Before you use Real-time Compression technology, it is important to understand the typical workloads in your environment and to determine whether those workloads are good candidates for compression. Plan to compress only the workloads that are suitable for it.
To determine the compression savings you are likely to achieve for the workload type, IBM has developed an easy-to-use utility called IBM Comprestimator. The utility uses advanced mathematical and statistical algorithms to perform the sampling and analysis process in a short and efficient way. The utility also displays its accuracy level by showing the maximum error range of the results based on the internal formulas. The utility performs only read operations, so it has no effect on the data that is stored on the device.
From IBM Spectrum Virtualize version 7.6, the Comprestimator utility can be used directly from the IBM Spectrum Virtualize shell. Example 11-1 shows the CLI commands that run the utility.
Example 11-1 Estimating compression savings from the CLI
IBM_Storwize:Spectrum_Virtualize_Cluster:user>analyzevdisk 0
IBM_Storwize:Spectrum_Virtualize_Cluster:user>lsvdiskanalysisprogress
vdisk_count pending_analysis estimated_completion_time
1 1 161014214700
IBM_Storwize:Spectrum_Virtualize_Cluster:user>lsvdiskanalysis -nohdr
0 vdisk0 sparse 161014214659 100.00GB 0.00MB 0.00MB 0 0.00MB 0.00MB 0 0.00MB 0 0
From IBM Spectrum Virtualize version 7.7, the Comprestimator utility can be used directly from the IBM Spectrum Virtualize GUI. Figure 11-1 shows how to start a system-wide analysis of compression estimates by clicking Volumes → Actions → Space Savings → Estimate Compression Savings.
Figure 11-1 Estimating compression savings from the GUI
If using an older IBM Spectrum Virtualize version or if you want to estimate the compression savings of a different storage system before changing to IBM Spectrum Virtualize, the Comprestimator utility can be installed on a host that has access to the devices that are analyzed. More information together with the latest version can be found at this website:
These are the preferred practices for using Comprestimator:
Run the Comprestimator utility before you implement an IBM Spectrum Virtualize solution and before you implement the Real-time Compression technology.
Download the latest version of the utility from IBM if you are not using the version included with IBM Spectrum Virtualize.
Use Comprestimator to analyze volumes that contain as much active data as possible rather than volumes that are mostly empty. This technique increases the accuracy level and reduces the risk of analyzing old data that is deleted but might still have traces on the device.
 
Note: Comprestimator can run for a long period (a few hours) when it scans a relatively empty device. The utility randomly selects and reads 256 KB samples from the device. If a sample is empty (that is, full of null values), it is skipped. A minimum number of samples with actual data is required to provide an accurate estimation.
When a device is mostly empty, many random samples are empty. As a result, the utility runs for a longer time as it tries to gather enough non-empty samples that are required for an accurate estimate. If the number of empty samples is over 95%, the scan is stopped.
Use the thresholds for volume compressibility in Table 11-1 to determine whether to compress a volume.
Table 11-1 Thresholds for Real-time Compression implementation

On products that have Quick Assist compression acceleration cards installed and are on version 7.4 and later:
>40% compression savings: Use compression
<40% compression savings: Evaluate workload

On all other products:
>25% compression savings: Use compression
<25% compression savings: Evaluate workload
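The decision rule in Table 11-1 can be expressed as a short script. The following sketch is illustrative only; the function name and its two inputs (the Comprestimator savings percentage and whether accelerator cards are installed) are hypothetical conventions, not part of any IBM tool:

```shell
#!/bin/sh
# Decide whether to compress a volume, given the Comprestimator
# savings percentage and whether the system has Quick Assist
# compression acceleration cards (version 7.4 and later).
recommend_compression() {
    savings_pct=$1      # integer percentage reported by Comprestimator
    has_accelerator=$2  # "yes" or "no"

    if [ "$has_accelerator" = "yes" ]; then
        threshold=40
    else
        threshold=25
    fi

    if [ "$savings_pct" -gt "$threshold" ]; then
        echo "use compression"
    else
        echo "evaluate workload"
    fi
}

recommend_compression 45 yes   # above the 40% threshold
recommend_compression 30 yes   # below the 40% threshold
recommend_compression 30 no    # above the 25% threshold
```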
11.2 Evaluate workload using Disk Magic
Proper initial sizing greatly helps to avoid future sizing problems. Disk Magic is a tool for sizing and modeling storage subsystems in various open systems environments and on various IBM platforms. It provides accurate performance and capacity analysis and planning for IBM Spectrum Virtualize products, other IBM storage solutions, and other vendors' storage subsystems. Disk Magic allows for in-depth environment analysis and is an excellent tool for estimating the performance of a system that runs Real-time Compression.
If you are an IBM Business Partner, more information together with the latest version can be found at this website:
If you are an IBM customer, ask an IBM representative to evaluate the workload of your storage environment when implementing an IBM Spectrum Virtualize Real-time Compression solution.
11.3 Verify available CPU resources
Before compression is enabled on IBM Spectrum Virtualize systems, measure the current system utilization to ensure that the system has the CPU resources that are required for compression.
Compression is recommended for an I/O group if the sustained CPU utilization is below the per-node values that are listed in Table 11-2. For node types for which the listed value is N/A, Real-time Compression can be implemented without regard to CPU utilization because these node types have dedicated CPU resources for Real-time Compression.
Table 11-2 CPU resources recommendations (maximum sustained per-node CPU utilization)

SAN Volume Controller:
CF8 & CG8 (4 core): 25%
CG8 (6 core): 30%
CG8 (12 core): N/A
DH8 (Dual CPU): N/A
SV1: N/A

Storwize:
V5030: 30%
V7000 Gen1: 25%
V7000 Gen2/Gen2+: 50%

IBM Spectrum Virtualize Software: 30%
If any node in a particular I/O Group already has sustained processor utilization greater than the values in Table 11-2, do not create compressed volumes in this I/O Group. Doing so might affect existing non-compressed volumes that are owned by this I/O Group. If it is an option, add more I/O groups. If you have any questions, speak to your IBM representative.
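Current and peak utilization can be checked from the system CLI before compressed volumes are created. The following sketch is illustrative: it assumes the lsnodestats command and the cpu_pc statistic name, which can vary by platform and version (Storwize systems use the equivalent lsnodecanisterstats command), and the node name node1 is an example:

```
# List the recent statistics for node1; review the cpu_pc (overall
# CPU utilization) values and compare them against Table 11-2.
IBM_Storwize:Spectrum_Virtualize_Cluster:user>lsnodestats node1
```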
Customers who are planning to use Real-time Compression on 6-core SAN Volume Controller CG8 nodes should enhance their system with more CPU and cache memory resources that are dedicated to Real-time Compression. This upgrade preserves full performance and resources for non-compressed workloads. Information about upgrading to SAN Volume Controller CG8 dual CPU model is available with RPQ #8S1296.
Customers who are planning to use Real-time Compression on V7000 Gen2/Gen2+ should install the extra Quick Assist compression acceleration card per node canister for better performance.
 
Note: To use the Real-time Compression feature on SAN Volume Controller DH8 and SV1 nodes, at least one Quick Assist compression acceleration card is required. To use the IBM Real-time Compression feature on the V9000 system, both Quick Assist compression acceleration cards are required.
 
11.4 Configure a balanced system
In a system with more than one IO group, it is important to balance the compression workload. Consider a four-node (two IO groups) IBM Spectrum Virtualize system with the following configuration:
iogrp0: nodes 1 and 2 with 18 compressed volumes
iogrp1: nodes 3 and 4 with two compressed volumes
This setup is not ideal because CPU and memory resources are dedicated for compression use in all four nodes. However, in nodes 3 and 4, this allocation serves only two of the 20 compressed volumes. Use one of the following preferred practices in this scenario:
Alternative 1: Migrate all compressed volumes from iogrp1 to iogrp0 when there are only a few compressed volumes (that is, 10 - 20).
Alternative 2: Migrate compressed volumes from iogrp0 to iogrp1 and balance the load across nodes when there are many compressed volumes (that is, more than 20).
Table 11-3 shows the load distribution for each alternative.
Table 11-3 Load distribution

Original setup:
node1: 9 compressed, X non-compressed
node2: 9 compressed, X non-compressed
node3: 1 compressed, X non-compressed
node4: 1 compressed, X non-compressed

Alternative 1:
node1: 10 compressed, X non-compressed
node2: 10 compressed, X non-compressed
node3: X non-compressed
node4: X non-compressed

Alternative 2:
node1: 5 compressed, X non-compressed
node2: 5 compressed, X non-compressed
node3: 5 compressed, X non-compressed
node4: 5 compressed, X non-compressed
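On systems that support non-disruptive volume move between I/O groups, either alternative can be implemented from the CLI. The following sketch is illustrative; the volume name vdisk10 is an example, and host multipathing support for the move should be verified first:

```
# Move a compressed volume from io_grp1 to io_grp0 without disrupting
# host I/O; repeat for each compressed volume to be consolidated.
IBM_Storwize:Spectrum_Virtualize_Cluster:user>movevdisk -iogrp io_grp0 vdisk10
```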
11.5 Standard benchmark tools
Traditional block-based and file-based benchmark tools (such as IOmeter, IOzone, dbench, and fio) generate truly random, unrealistic I/O patterns and therefore do not perform well with Real-time Compression.
These tools generate synthetic workloads that have no temporal locality: data is not read back in the same (or similar) order in which it was written. Therefore, these tools are not useful for estimating what your performance looks like for a real application. Also consider what data a benchmark application uses. If the data is already compressed, or if it is all binary zeros, the measured differences are artificially bad or good, based on the compressibility of the data. The more compressible the data, the better the performance.
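The effect of data compressibility can be demonstrated with any general-purpose compressor. This sketch uses gzip on a local host (not Real-time Compression itself) to show how differently all-zero data and random data compress, which is why benchmark results depend heavily on the test data:

```shell
#!/bin/sh
# Compress 1 MiB of all-zero data and 1 MiB of random data, and
# compare the resulting sizes. Benchmarks that write either extreme
# produce artificially good or bad results on compressed volumes.
zero_size=$(head -c 1048576 /dev/zero | gzip -c | wc -c | tr -d ' ')
rand_size=$(head -c 1048576 /dev/urandom | gzip -c | wc -c | tr -d ' ')
echo "all-zero data compressed to ${zero_size} bytes"
echo "random data compressed to ${rand_size} bytes"
```

The all-zero stream shrinks to roughly a kilobyte, while the random stream does not shrink at all.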
11.6 Compression with FlashCopy
By using the FlashCopy function of IBM Storage Systems, you can create a point-in-time copy of one or more volumes. You can use FlashCopy to solve critical and challenging business needs that require duplication of data on your source volume. Volumes can remain online and active while you create consistent copies of the data sets.
Follow these general guidelines:
Consider configuring FlashCopy targets as non-compressed volumes. In some cases, the savings are not worth the other resources that are required because the FlashCopy target holds only the “split” grains that are backing the grains that were changed in the source. Therefore, total FlashCopy target capacity is a fraction of the source volume size.
FlashCopy default grain size is 256 KB for non-compressed volumes and 64 KB for compressed volumes (new defaults from version 6.4.1.5 and 7.1.0.1 and later). Use the default grain size for FlashCopy with compressed volumes (64 KB) because this size reduces the performance effect when compressed FlashCopy targets are used.
Consider the use of the background copy method. FlashCopy can be used in two ways: with or without background copy. When it is used without background copy, host I/O is pending until the split event finishes. For example, if the host sends a 4 KB write, this I/O waits until the corresponding grain (64 KB or 256 KB) is read, decompressed, and written to the FlashCopy target. This configuration adds latency to every I/O. When background copy is used, all the grains are copied to the FlashCopy target right after the FlashCopy mapping is created. Although this configuration adds latency during the copy, it eliminates the latency after the copy is complete.
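These guidelines can be combined when the mapping is created. The following sketch is illustrative: the source volume vol0 and an existing same-size target volume vol0_copy are assumed, the copy rate of 50 is only an example (any nonzero rate enables background copy), and the mapping ID 0 reflects this example:

```
# Create a FlashCopy mapping with the 64 KB grain size preferred for
# compressed targets and a nonzero copy rate for background copy.
IBM_Storwize:Spectrum_Virtualize_Cluster:user>mkfcmap -source vol0 -target vol0_copy -grainsize 64 -copyrate 50

# Prepare and start the mapping.
IBM_Storwize:Spectrum_Virtualize_Cluster:user>startfcmap -prep 0
```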
11.7 Compression with Easy Tier
IBM Easy Tier is a performance function that automatically and nondisruptively migrates frequently accessed data from magnetic media to solid-state drives (SSDs). In that way, the most frequently accessed data is stored on the fastest storage tier and the overall performance is improved.
Beginning with version 7.1, Easy Tier supports compressed volumes. A new algorithm is implemented to monitor read operations on compressed volumes instead of reads and writes. The extents with the most read operations that are smaller than 64 KB are migrated to SSD MDisks. As a result, frequently read areas of the compressed volumes are serviced from SSDs. Easy Tier on non-compressed volumes operates as before and it is based on read and write operations that are smaller than 64 KB.
For more information about implementing IBM Easy Tier with IBM Real-time Compression, see Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072.
11.8 Compression on the backend
If you have an IBM Spectrum Virtualize system setup with backend storage that supports compression (such as a Storwize product) and you plan to implement compression, configure compressed volumes on the IBM Spectrum Virtualize system, not on the backend storage. This configuration minimizes I/O to the backend storage.
From version 7.3, the existence of a lower-level write cache below the Real-time Compression component in the software stack allows for the coalescing of compressed writes. As a result, an even bigger reduction in back-end I/Os is achieved because of the ability to perform full-stride writes for compressed data.
11.9 Migrating generic volumes
It is possible to migrate non-compressed volumes, either generic (fully allocated) or thin-provisioned, to compressed volumes by using volume mirroring. When you migrate generic volumes that were created without initial zero formatting, extra considerations apply. These volumes might contain traces of old data at the block device level. Such data is not accessible or viewable at the file system level, but it might affect compression ratios and system resources during and after migration.
When you use the Comprestimator utility to analyze such volumes, the expected compression results reflect the compression rate for all the data at the block device level, including the old data. This behavior is limited to generic volumes, and does not occur when Comprestimator analyzes thin-provisioned volumes.
The second issue is that the old data is also compressed during migration. Therefore, system resources and storage space are wasted on compressing old data that is effectively inaccessible to users and applications.
 
Note: Regardless of the type of block device that is analyzed or migrated, it is also important to understand a few characteristics of common file systems space management.
When data is deleted from a file system, the space that it occupied is freed and made available to the file system, even though the data at the block device level was not deleted. When you use Comprestimator to analyze a block device, or when you migrate a volume that is used by a file system, all underlying data in the device is analyzed or migrated, regardless of whether that data belongs to files that were deleted from the file system. This behavior affects even thin-provisioned volumes.
There is no solution for existing generic volumes that were created without initial zero formatting. Migrating these volumes to compressed volumes might still be a good option and should not be ruled out.
As a preferred practice, always format new volumes during creation. This process zeros all blocks in the disks and eliminates traces of old data. This is the default behavior from version 7.7.
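Both the migration and the zero formatting can be done from the CLI. The following sketch is illustrative; the pool name Pool0 and volume names vdisk0 and vdisk1 are examples, and the -autodelete flag (version 7.3 and later) removes the original copy after the two copies are synchronized:

```
# Add a compressed copy to an existing volume; the original copy is
# deleted automatically after the copies are synchronized.
IBM_Storwize:Spectrum_Virtualize_Cluster:user>addvdiskcopy -mdiskgrp Pool0 -rsize 2% -autoexpand -compressed -autodelete vdisk0

# Create a new fully allocated volume with zero formatting (the
# default behavior from version 7.7).
IBM_Storwize:Spectrum_Virtualize_Cluster:user>mkvdisk -mdiskgrp Pool0 -iogrp io_grp0 -size 100 -unit gb -fmtdisk -name vdisk1
```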
11.10 Mixed volumes in the same MDisk group
 
Note: IBM Spectrum Virtualize version 7.3 onwards include a new cache architecture that is not affected by mixing compressed and non-compressed volumes in the same MDisk group. The following recommendation only applies to version 7.2 and earlier.
Consider a scenario in which hosts are sending write I/Os. If the response time from the backend storage increases above a certain level, the cache destaging to the entire pool is throttled down and the cache partition becomes full. This situation occurs under the following circumstances:
In Storwize V7000: If the backend is HDD and latency is greater than 300 ms.
In Storwize V7000: If the backend is SSD and latency is greater than 30 ms.
In SAN Volume Controller: If the latency is greater than 30 ms.
From version 6.4.1.5 to 7.2, the following thresholds changed for both Storwize V7000 and SAN Volume Controller:
For pools containing only compressed volumes, the threshold is 600 ms.
For mixed pools, issue the following command to change to 600 ms system-wide:
chsystem -compressiondestagemode on
To check the current value, issue these commands:
lssystem | grep compression_destage
compression_destage_mode on
With the new threshold, the compression module receives more I/O from cache, which improves the overall situation.
With version 7.1 and later, performance improvements reduce the probability of a cache throttling situation. However, in heavy sequential write scenarios, a full cache can still occur, and the parameter that is described in this section can help to resolve it.
If none of these options help, separate compressed and non-compressed volumes to different storage pools. The compressed and non-compressed volumes do not share the cache partition, and so the non-compressed volumes are not affected.