Advanced features for storage efficiency
This chapter introduces the basic concepts of dynamic data relocation and storage optimization. The IBM Spectrum Virtualize software that runs inside the IBM Storwize V7000 offers the IBM EasyTier, Thin Provisioning, unmap, and IBM Real-time Compression functions for storage efficiency. The chapter provides only a basic technical overview and the benefits of each feature. For more information, see these IBM publications:
EasyTier:
 – Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072
 – IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521
 – IBM DS8000 EasyTier (for DS8880 R8.3 or later), REDP-4667 (this concept is similar to IBM SAN Volume Controller (SVC) EasyTier)
Thin Provisioning:
 – Thin Provisioning in an IBM SAN or IP SAN Enterprise Environment, REDP-4265
 – DS8000 Thin Provisioning, REDP-4554 (similar concept to IBM Storwize V7000 thin provisioning)
Real-Time Compression:
 – IBM Real-time Compression in IBM SAN Volume Controller and IBM Storwize V7000, REDP-4859
 – Implementing IBM Real-time Compression in SAN Volume Controller and IBM Storwize V7000, TIPS1083
 – Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072
This chapter includes the following topics:
10.1 Introduction
In modern and complex application environments, the increasing and often unpredictable demands for storage capacity and performance lead to issues of planning and optimization of storage resources.
Consider the following typical storage management issues:
Usually when a storage system is implemented, only a portion of the configurable physical capacity is deployed. When the storage system runs out of installed capacity and more capacity is requested, a hardware upgrade is implemented to add physical resources to the storage system. It is difficult to configure this new physical capacity so that data remains evenly spread across all of the storage resources.
Typically, the new capacity is allocated to fulfill only new storage requests. The existing storage allocations do not benefit from the new physical resources. Similarly, the new storage requests do not benefit from the existing resources. Only new resources are used.
In a complex production environment, it is not always possible to optimize storage allocation for performance. The unpredictable rate of storage growth and the fluctuations in throughput requirements, which are input/output (I/O) operations per second (IOPS), often lead to inadequate performance.
Furthermore, the tendency to use even larger volumes to simplify storage management works against the granularity of storage allocation, and a cost-efficient storage tiering solution becomes difficult to achieve. With the introduction of high-performing technologies, such as Flash drives or all-flash arrays, this challenge becomes even more important.
The move to larger and larger physical disk drive capacities means that previous access densities that were achieved with low-capacity drives can no longer be sustained.
Any business has applications that are more critical than others, and a need exists for specific application optimization. Therefore, the ability to relocate specific application data to a faster storage media is needed.
Although more servers are purchased with internal SSD drives attached for better application response time, the data distribution across these internal SSD drives and external storage arrays must be carefully planned. An integrated and automated approach is crucial to achieve performance improvement without compromise to data consistency, especially in a disaster recovery (DR) situation.
All of these issues deal with data placement and relocation capabilities, or data volume reduction. Most of these challenges can be managed by having spare resources available, by moving data, and by using data mobility tools or operating systems features (such as host level mirroring) to optimize storage configurations.
However, all of these corrective actions are expensive in terms of hardware resources, labor, and service availability. The ability to relocate data among the physical storage resources dynamically, or to effectively reduce the amount of stored data, transparently to the attached host systems, is becoming increasingly important.
10.2 EasyTier
In today’s storage market, Flash drives and flash arrays are emerging as an attractive alternative to hard disk drives (HDDs). Because of their low response times, high throughput and IOPS, and energy-efficient characteristics, Flash drives and flash arrays have the potential to enable your storage infrastructure to achieve significant savings in operational costs.
However, the current acquisition cost per gibibyte (GiB) for Flash drives or flash arrays is higher than for HDDs. Flash drive and flash array performance depends on workload characteristics, so they should be used together with HDDs for an optimal balance of cost and performance.
Choosing the correct mix of drives and the correct data placement is critical to achieve optimal performance at low cost. Maximum value can be derived by placing “hot” data with high I/O density and low response time requirements on Flash drives or flash arrays, while targeting HDDs for “cooler” data that is accessed more sequentially and at lower rates.
EasyTier automates the placement of data among different storage tiers, and it can be enabled for internal and external storage. This IBM Spectrum Virtualize feature boosts your storage infrastructure performance to achieve optimal performance through a software, server, and storage solution.
Additionally, the no-charge feature called storage pool balancing, introduced in V7.3, automatically moves extents within the same storage tier, from overloaded to less loaded managed disks (MDisks). Storage pool balancing ensures that your data is optimally placed among all MDisks within a storage pool.
10.2.1 EasyTier concepts
IBM Spectrum Virtualize implements the EasyTier enterprise storage function, which was originally available on the IBM DS8000® and IBM XIV enterprise-class storage systems. It enables automated subvolume data placement across different storage tiers or within the same tier. This process intelligently aligns the system with current workload requirements and optimizes the usage of Flash drives or flash arrays.
This function includes the ability to automatically and non-disruptively relocate data (at the extent level) from one tier to another tier, or even within the same tier, in either direction. This feature helps achieve the best available storage performance for your workload in your environment. EasyTier reduces the I/O latency for hot spots, but it does not replace storage cache.
Both EasyTier and storage cache solve a similar access-latency problem. However, the two methods weight locality of reference, recency, and frequency differently in their algorithms. Because EasyTier monitors I/O performance from the device end (after the cache), it can pick up the performance issues that the cache cannot solve and complement the overall storage system performance.
Figure 10-1 shows placement of the EasyTier engine within the IBM Spectrum Virtualize software stack.
Figure 10-1 EasyTier in the IBM Spectrum Virtualize software stack
In general, I/O in the storage environment is monitored at a volume level, and the entire volume is always placed in one appropriate storage tier. Determining the amount of I/O on individual extents, moving parts of the underlying volume to the appropriate storage tiers, and reacting to workload changes are too complex for manual operation. This is where the EasyTier feature can be used.
EasyTier is a performance optimization function because it automatically migrates (or moves) extents that belong to a volume between different storage tiers (see Figure 10-2 on page 391) or rebalances extents within the same storage tier (see Figure 10-6 on page 395). Because this migration works at the extent level, it is often referred to as sub-logical unit number (LUN) migration. Movement of the extents is done online and is not noticeable from the host point of view. As a result of extent movement, the volume no longer has all its data in one tier, but rather in two or three tiers.
Figure 10-2 shows the basic EasyTier principle of operation.
Figure 10-2 EasyTier
You can enable EasyTier on a volume basis. It monitors the I/O activity and latency of the extents on all EasyTier enabled volumes over a 24-hour period. Based on the performance log, EasyTier creates an extent migration plan and dynamically moves (promotes) high activity or hot extents to a higher disk tier within the same storage pool.
It also moves (demotes) extents whose activity dropped off, or cooled, from higher disk tier MDisks back to a lower tier MDisk. When EasyTier runs in a storage pool rebalance mode, it moves extents from busy MDisks to less busy MDisks of the same type.
10.2.2 Flash drive arrays and flash MDisks
Flash drives and flash arrays are treated no differently by the IBM Storwize V7000 than normal HDD arrays or MDisks. Individual Flash drives in the storage enclosures are combined into an array. As with HDD arrays, use the Distributed RAID 6 (DRAID6) format to reduce array rebuild times after an individual drive failure.
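For reference, the following CLI sketch shows how such a DRAID6 array might be created from internal drives. It is an illustration only: the drive class ID, drive count, stripe width, rebuild areas, and pool name are hypothetical placeholders that must be adapted to your configuration (use lsdriveclass to identify the available drive classes).
IBM_Storwize:ITSO_V7000G2:superuser>lsdriveclass
IBM_Storwize:ITSO_V7000G2:superuser>mkdistributedarray -level raid6 -driveclass 0 -drivecount 24 -stripewidth 12 -rebuildareas 2 test_pool_1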
The internal configuration of external flash arrays can differ depending on the storage vendor. Regardless of the method that is used to configure flash-based storage, the external flash storage maps a volume to a host, in this case to the Storwize V7000. From the IBM Storwize V7000 perspective, a volume that is presented from external flash storage is seen as a normal managed disk.
After an internal Flash drive array is created, it appears as an MDisk with a tier of tier0_flash; an internal RI Flash drive array appears with a tier of tier1_flash. MDisks that are presented from external storage systems are treated differently. Because IBM Spectrum Virtualize cannot know what kind of physical drives an external MDisk is formed of, the default tier that the Storwize assigns to each external MDisk is tier_enterprise. It is up to the user or administrator to change the tier of each external MDisk to tier0_flash, tier1_flash, tier_enterprise, or tier_nearline, as appropriate.
 
Note: It is possible to change the tier of an MDisk that is made of internal Storwize drives even if the new tier does not match the tier of the physical drives that the MDisk is made of. Storwize knows the tier of the drives in its disk enclosures and selects the MDisk tier according to the drive tier. However, this selection can be overridden by a user or administrator. The only way to change the tier of an internal MDisk is by using the command-line interface (CLI).
To change a tier of an MDisk in the CLI, use the chmdisk command as in Example 10-1.
Example 10-1 Changing MDisk tier
IBM_Storwize:ITSO_V7000G2:superuser>lsmdisk -delim " "
id name status mode mdisk_grp_id mdisk_grp_name capacity ctrl_LUN_# controller_name UID tier encrypt site_id site_name distributed dedupe
0 MDisk_01 online array 0 test_pool_1 2.7TB tier0_flash yes no
1 MDisk_02 online array 1 test_pool_2 2.4TB tier_enterprise yes no
2 mdisk0 online unmanaged 32.0GB 0000000000000000 controller0 600a0b80005ad22300000371527a29b200000000000000000000000000000000 tier_enterprise no no
3 mdisk1 online unmanaged 64.0GB 0000000000000001 controller0 600a0b80005ad22300000372527a29cf00000000000000000000000000000000 tier_enterprise no no
4 mdisk2 online unmanaged 64.0GB 0000000000000002 controller0 600a0b80005ad22300000373527a29ea00000000000000000000000000000000 tier_enterprise no no
5 mdisk3 online unmanaged 20.0GB 0000000000000003 controller0 600a0b80005ad223000004b952d3693e00000000000000000000000000000000 tier_enterprise no no
 
 
IBM_Storwize:ITSO_V7000G2:superuser>chmdisk -tier tier0_flash mdisk3
 
IBM_Storwize:ITSO_V7000G2:superuser>lsmdisk -delim " "
id name status mode mdisk_grp_id mdisk_grp_name capacity ctrl_LUN_# controller_name UID tier encrypt site_id site_name distributed dedupe
0 MDisk_01 online array 0 test_pool_1 2.7TB tier0_flash yes no
1 MDisk_02 online array 1 test_pool_2 2.4TB tier_enterprise yes no
2 mdisk0 online unmanaged 32.0GB 0000000000000000 controller0 600a0b80005ad22300000371527a29b200000000000000000000000000000000 tier_enterprise no no
3 mdisk1 online unmanaged 64.0GB 0000000000000001 controller0 600a0b80005ad22300000372527a29cf00000000000000000000000000000000 tier_enterprise no no
4 mdisk2 online unmanaged 64.0GB 0000000000000002 controller0 600a0b80005ad22300000373527a29ea00000000000000000000000000000000 tier_enterprise no no
5 mdisk3 online unmanaged 20.0GB 0000000000000003 controller0 600a0b80005ad223000004b952d3693e00000000000000000000000000000000 tier0_flash no no
It is also possible to change the MDisk tier from the graphical user interface (GUI), but this technique only applies to external MDisks. To change the tier, complete the following steps:
1. Click Pools → External Storage and click the expand sign (>) next to the controller that owns the MDisks for which you want to change the tier.
2. Then, right-click the target MDisk and select Modify Tier (Figure 10-3).
Figure 10-3 Change the MDisk tier
3. A new window opens with options to change the tier (Figure 10-4).
Figure 10-4 Selecting the MDisk tier
The tier change happens online and has no effect on host or volume availability.
4. If you do not see the Tier column, click the symbol at the end of the title row and select the Tier check box, as shown in Figure 10-5.
Figure 10-5 Customizing the title row to show the tier column
10.2.3 Disk tiers
The internal or external MDisks (LUNs) are likely to have different performance attributes because of the type of disk or RAID array on which they are located. The MDisks can be created on any of the following hardware:
15,000 RPM FC or SAS drives
10,000 RPM FC or SAS drives
Nearline SAS or SATA drives
Flash drives or flash storage systems such as the IBM FlashSystem 900
Read Intensive Flash drives
As mentioned in 10.2.2, “Flash drive arrays and flash MDisks” on page 392, Storwize V7000 does not automatically detect the type of external MDisks. Instead, all external MDisks initially are put into the enterprise tier by default. The administrator must then manually change the tier of MDisks and add them to storage pools. Depending on what type of disks are gathered to form a storage pool, the following types of storage pools are distinguished:
Single-tier
Multitier
Single-tier storage pools
Figure 10-6 shows a scenario in which a single storage pool is populated with MDisks that are presented by an external storage controller. In this solution, the striped volumes can be measured by EasyTier, and can benefit from Storage Pool Balancing mode, which moves extents between MDisks of the same type.
Figure 10-6 Single tier storage pool with striped volume
MDisks that are used in a single-tier storage pool should have the same hardware characteristics. For example, they should have the same RAID type, RAID array size, disk type, disk RPM, and controller performance characteristics.
Multitier storage pools
A multitier storage pool has a mix of MDisks with more than one type of disk tier attribute, for example, a storage pool that contains a mix of enterprise and Flash drive MDisks or enterprise and NL-SAS MDisks.
Figure 10-7 shows a scenario in which a storage pool is populated with three different MDisk types (one belonging to a Flash drive array, one belonging to a SAS HDD array, and one belonging to an NL-SAS HDD array). Although this example shows RAID 5 arrays, other RAID types can be used as well.
Figure 10-7 Multitier storage pool with striped volume
Adding Flash drives to the pool also means that more space is now available for new volumes or volume expansion.
 
Note: Image mode and sequential volumes are not candidates for EasyTier automatic data placement, because all extents for those types of volumes must be on one specific MDisk, and cannot be moved.
The EasyTier setting can be changed on a storage pool and volume level. Depending on the EasyTier setting and the number of tiers in the storage pool, EasyTier services might function in a different way. Table 10-1 shows possible combinations of EasyTier settings.
Table 10-1 EasyTier settings
Storage pool EasyTier setting | Number of tiers in the storage pool | Volume copy EasyTier setting | Volume copy EasyTier status
Off | One | off | inactive (see note 2)
Off | One | on | inactive (see note 2)
Off | Two or three | off | inactive (see note 2)
Off | Two or three | on | inactive (see note 2)
Measure | One | off | measured (see note 3)
Measure | One | on | measured (see note 3)
Measure | Two or three | off | measured (see note 3)
Measure | Two or three | on | measured (see note 3)
Auto | One | off | measured (see note 3)
Auto | One | on | balanced (see note 4)
Auto | Two or three | off | measured (see note 3)
Auto | Two or three | on | active (see note 5)
On | One | off | measured (see note 3)
On | One | on | balanced (see note 4)
On | Two or three | off | measured (see note 3)
On | Two or three | on | active (see note 5)
 
Table notes:
1. If the volume copy is in image or sequential mode, or is being migrated, the volume copy EasyTier status is measured rather than active.
2. When the volume copy status is inactive, no EasyTier functions are enabled for that volume copy.
3. When the volume copy status is measured, the EasyTier function collects usage statistics for the volume, but automatic data placement is not active.
4. When the volume copy status is balanced, the EasyTier function enables performance-based pool balancing for that volume copy.
5. When the volume copy status is active, the EasyTier function operates in automatic data placement mode for that volume.
6. The default EasyTier setting for a storage pool is Auto, and the default EasyTier setting for a volume copy is On. Therefore, EasyTier functions, except pool performance balancing, are disabled for storage pools with a single tier. Automatic data placement mode is enabled by default for all striped volume copies in a storage pool with two or more tiers.
Table 10-2 shows the naming convention and all supported combinations of storage tiering used by EasyTier.
Table 10-2 EasyTier supported storage pools
Three-tier pools (Tier 0 + Tier 1 + Tier 2):
 – tier0_flash + tier_enterprise + tier_nearline
 – tier0_flash + tier1_flash + tier_enterprise
 – tier0_flash + tier1_flash + tier_nearline
 – tier1_flash + tier_enterprise + tier_nearline
Two-tier pools:
 – tier0_flash + tier1_flash
 – tier0_flash + tier_enterprise
 – tier0_flash + tier_nearline
 – tier1_flash + tier_enterprise
 – tier1_flash + tier_nearline
 – tier_enterprise + tier_nearline
Single-tier pools:
 – tier0_flash
 – tier1_flash
 – tier_enterprise
 – tier_nearline
10.2.4 Read Intensive Flash drives and EasyTier
One of the reasons why flash technology is still relatively expensive when compared to traditional HDDs is that over-provisioning of the physical flash capacity is used to mitigate the write amplification issue (https://en.wikipedia.org/wiki/Write_amplification). Read-Intensive (RI) Flash drives are lower-cost Flash drives, with the cost reduction achieved by providing less redundant flash capacity.
Read Intensive Flash drive support for IBM Spectrum Virtualize and Storwize systems was initially introduced with V7.7 and then enhanced in V7.8, which added, among other things, EasyTier support for RI MDisks.
Even though EasyTier remains a three-tier storage architecture, V7.8 added a new user tier specifically for the RI MDisks. From a user perspective, there are now four tiers:
T0 or tier0_flash that represents the enterprise flash technology
T1 or tier1_flash that represents the RI flash drive technology
T2 or tier_enterprise that represents the enterprise HDD technology
T3 or tier_nearline that represents the nearline HDD technology
These user tiers are mapped to EasyTier tiers depending on the pool configuration. The table in Figure 10-8 shows the possible combinations for the pool configuration with regard to the four user tiers (the configurations containing the RI user tier are highlighted in orange).
Figure 10-8 EasyTier mapping policy
The table columns represent all the possible pool configurations, whereas the rows report the EasyTier tier to which each user tier is mapped. For example, consider a pool with all the possible tiers configured, which corresponds to the T0+T1+T2+T3 configuration in the table. With this configuration, T1 and T2 are mapped to the same EasyTier tier (tier 2). Note that the tier1_flash user tier is only ever mapped to EasyTier tier 1 or tier 2.
For more information about planning and configuration considerations or best practices, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521, and Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072.
For more information about RI Flash drives, see Read Intensive Flash Drives, REDP-5380.
10.2.5 EasyTier process
The EasyTier function includes the following four main processes:
I/O Monitoring
This process operates continuously and monitors volumes for host I/O activity. It collects performance statistics for each extent, and derives averages for a rolling 24-hour period of I/O activity.
EasyTier makes allowances for large block I/Os. Therefore, it considers only I/Os of up to 64 kibibytes (KiB) as migration candidates.
This process is efficient and adds negligible processing resource use to the IBM SAN Volume Controller nodes.
Data Placement Advisor
The Data Placement Advisor uses workload statistics to make a cost benefit decision as to which extents are to be candidates for migration to a higher performance tier.
This process also identifies extents that must be migrated back to a lower tier.
Data Migration Planner (DMP)
By using the extents that were previously identified, the DMP builds the extent migration plans for the storage pool. The DMP builds two plans:
 – Automatic Data Relocation (ADR mode) plan to migrate extents across adjacent tiers
 – Rebalance (RB mode) plan to migrate extents within the same tier
Data Migrator
This process involves the actual movement or migration of the volume’s extents up to, or down from, the higher disk tier. The extent migration rate is capped so that a maximum of up to 30 megabytes per second (MBps) is migrated, which equates to approximately 3 terabytes (TB) per day that is migrated between disk tiers.
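As a rough check of these figures (an approximation only, assuming the cap is sustained continuously): 30 MBps × 86,400 seconds per day ≈ 2.6 TB per day of extent movement, which is in line with the approximately 3 TB per day stated above.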
When enabled, EasyTier performs the following actions between three tiers presented in Figure 10-9 on page 400:
Promote
Moves the relevant hot extents to higher performing tier.
Swap
Exchanges cold extent in upper tier with hot extent in lower tier.
Warm Demote
 – Prevents performance overload of a tier by demoting a warm extent to the lower tier.
 – Triggered when bandwidth or IOPS exceeds predefined threshold.
Demote or Cold Demote
Coldest data is moved to lower tier. Only supported between HDD tiers.
Expanded Cold Demote
Demotes appropriate sequential workloads to the lowest tier to better use nearline disk bandwidth.
Storage Pool Balancing
 – Redistributes extents within a tier to balance usage across MDisks for maximum performance.
 – Moves hot extents from high usage MDisks to low usage MDisks.
 – Exchanges extents between high usage MDisks and low usage MDisks.
EasyTier attempts to migrate the most active volume extents up to flash tier first.
When a new migration plan is created, the previous plan and any queued extents that are not yet relocated are abandoned.
 
Note: Extent promotion / demotion only occurs between adjacent tiers. In a three-tiered storage pool, EasyTier will not move extents from a flash tier directly to nearline tier or vice versa without moving to enterprise tier first.
EasyTier extent migration types are presented in Figure 10-9.
Figure 10-9 EasyTier extent migration types
10.2.6 EasyTier operating modes
EasyTier includes the following main operating modes:
Off
Evaluation or measurement only
Automatic data placement or extent migration
Storage pool balancing
EasyTier off mode
With EasyTier turned off, no statistics are recorded, and no cross-tier extent migration occurs.
Evaluation or measurement only mode
EasyTier Evaluation or measurement-only mode collects usage statistics for each extent in a single-tier storage pool where the EasyTier value is set to On for both the volume and the pool. This collection is typically done for a single-tier pool that contains only HDDs so that the benefits of adding Flash drives to the pool can be evaluated before any major hardware acquisition.
A dpa_heat.nodeid.yymmdd.hhmmss.data statistics summary file is created in the /dumps directory of the Storwize V7000 node canisters. This file can be offloaded from the Storwize node canisters with the PuTTY Secure Copy Client (PSCP), optionally with the -load command option, or by using the GUI, as described in IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521. A web browser is used to view the report that is created by the STAT tool.
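As an illustrative sketch only (the cluster IP address and local directory are placeholders), the available dump files can be listed with the lsdumps command, and the heat data file can then be copied by running PSCP from a workstation command prompt:
IBM_Storwize:ITSO_V7000G2:superuser>lsdumps
C:\>pscp -unsafe superuser@<cluster_ip>:/dumps/dpa_heat.*.data C:\STAT\input_files\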
Automatic Data Placement or extent migration mode
In Automatic data placement or extent migration operating mode, the storage pool parameter -easytier on or auto must be set, and the volumes in the pool must have -easytier on. The storage pool must also contain MDisks with different disk tiers, which makes it a multitier storage pool.
Dynamic data movement is not apparent to the host server and application users of the data, other than providing improved performance. Extents are automatically migrated, as explained in “Implementation rules” on page 402.
The statistic summary file is also created in this mode. This file can be offloaded for input to the advisor tool. The tool produces a report on the extents that are moved to a higher tier, and a prediction of performance improvement that can be gained if more higher tier disks are available.
 
Options: The EasyTier function can be turned on or off at the storage pool level and at the volume level.
Storage Pool Balancing
Although storage pool balancing is associated with EasyTier, it operates independently of EasyTier, and does not require an EasyTier license. This feature assesses the extents that are written in a pool, and balances them automatically across all MDisks within the pool. This process works along with EasyTier when multiple classes of disks exist in a single pool. In such a case, EasyTier moves extents between the different tiers, and storage pool balancing moves extents within the same tier, to better use MDisks.
The process automatically balances existing data when new MDisks are added into an existing pool, even if the pool contains only a single type of drive. However, the process does not migrate extents from existing MDisks solely to achieve an even distribution of extents among all old and new MDisks in the storage pool. The EasyTier rebalancing migration plan within a tier is based on performance, not on the capacity of the underlying MDisks.
 
Note: Storage pool balancing can be used to balance extents when mixing different size disks of the same performance tier. For example, when adding larger capacity drives to a pool with smaller capacity drives of the same class, Storage pool balancing redistributes the extents to take advantage of the additional performance of the new MDisks.
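As a brief sketch (the MDisk and pool names are placeholders), adding an MDisk to an existing pool is all that is required to trigger rebalancing within that tier; the extents are then redistributed automatically over time:
IBM_Storwize:ITSO_V7000G2:superuser>addmdisk -mdisk mdisk2 test_pool_2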
10.2.7 Implementation considerations
EasyTier is a licensed feature, except for storage pool balancing, which is a no-charge feature that is enabled by default. EasyTier comes as part of the IBM Spectrum Virtualize code. For EasyTier to migrate extents between different disk tiers, you must have disk storage available that offers different tiers (for example, a mix of Flash drives and HDDs). EasyTier performs storage pool balancing even if you have only a single-tier pool.
Implementation rules
Remember the following implementation and operational rules when you use the IBM System Storage EasyTier function on the IBM Storwize V7000:
EasyTier automatic data placement is not supported on image mode or sequential volumes. I/O monitoring for such volumes is supported, but you cannot migrate extents on these volumes unless you convert image or sequential volume copies to striped volumes.
Automatic data placement and extent I/O activity monitors are supported on each copy of a mirrored volume. EasyTier works with each copy independently of the other copy.
 
Volume mirroring consideration: Volume mirroring can have different workload characteristics on each copy of the data because reads are normally directed to the primary copy and writes occur to both copies. Therefore, the number of extents that EasyTier migrates between the tiers might be different for each copy.
If possible, the IBM Storwize V7000 creates volumes or expands volumes by using extents from MDisks from an HDD tier (tier_enterprise or tier_nearline). However, if necessary, it uses extents from MDisks from a Flash drive tier (tier0_flash or tier1_flash).
When a volume is migrated out of a storage pool that is managed with EasyTier, EasyTier automatic data placement mode is no longer active on that volume. Automatic data placement is also turned off while a volume is being migrated, even when it is between pools that both have EasyTier automatic data placement enabled. Automatic data placement for the volume is reenabled when the migration is complete.
Limitations
When you use EasyTier on the IBM Storwize V7000, remember the following limitations:
Removing an MDisk by using the -force parameter
When an MDisk is deleted from a storage pool with the -force parameter, extents in use are migrated to MDisks in the same tier as the MDisk that is being removed, if possible. If insufficient extents exist in that tier, extents from the other tier are used.
Migrating extents
When EasyTier automatic data placement is enabled for a volume, you cannot use the svctask migrateexts CLI command on that volume.
Migrating a volume to another storage pool
When IBM Storwize V7000 migrates a volume to a new storage pool, EasyTier automatic data placement between the two tiers is temporarily suspended. After the volume is migrated to its new storage pool, EasyTier automatic data placement between the tier0_flash tier and the tier_enterprise resumes for the moved volume, if appropriate.
When the IBM Storwize V7000 migrates a volume from one storage pool to another, it attempts to migrate each extent to an extent in the new storage pool that is the same tier as the original extent. In several cases, such as where a target tier is unavailable, another tier is used. For example, the tier0_flash tier might be unavailable in the new storage pool.
Migrating a volume to an image mode
EasyTier automatic data placement does not support image mode. When a volume with active EasyTier automatic data placement mode is migrated to an image mode, EasyTier automatic data placement mode is no longer active on that volume.
Image mode and sequential volumes cannot be candidates for automatic data placement. However, EasyTier supports evaluation mode for image mode volumes.
10.2.8 Modifying the EasyTier setting
The EasyTier setting for storage pools and volumes can only be changed from the command-line interface. All of the changes are done online without any effect on hosts or data availability.
Turning EasyTier on and off
Use the chvdisk command to turn EasyTier on or off for selected volumes. Use the chmdiskgrp command to change the EasyTier status of selected storage pools, as shown in Example 10-2.
Example 10-2 Changing the EasyTier setting
IBM_Storwize:ITSO_V7000G2_A:superuser>lsvdisk test_vol_2
id 11
name test_vol_2
IO_group_id 0
IO_group_name io_grp0
status online
mdisk_grp_id 0
mdisk_grp_name test_pool_1
capacity 5.00GB
type striped
formatted no
formatting yes
mdisk_id
mdisk_name
FC_id
FC_name
RC_id
RC_name
vdisk_UID 600507680283818B300000000000000D
throttling 0
preferred_node_id 1
fast_write_state not_empty
cache readwrite
udid
fc_map_count 0
sync_rate 50
copy_count 1
se_copy_count 0
filesystem
mirror_write_priority latency
RC_change no
compressed_copy_count 0
access_IO_group_count 1
last_access_time
parent_mdisk_grp_id 0
parent_mdisk_grp_name test_pool_1
owner_type none
owner_id
owner_name
encrypt no
volume_id 11
volume_name test_vol_2
function
 
copy_id 0
status online
sync yes
auto_delete no
primary yes
mdisk_grp_id 0
mdisk_grp_name test_pool_1
type striped
mdisk_id
mdisk_name
fast_write_state not_empty
used_capacity 5.00GB
real_capacity 5.00GB
free_capacity 0.00MB
overallocation 100
autoexpand
warning
grainsize
se_copy no
easy_tier off
easy_tier_status measured
tier tier0_flash
tier_capacity 5.00GB
tier tier1_flash
tier_capacity 0.00GB
tier tier_enterprise
tier_capacity 0.00MB
tier tier_nearline
tier_capacity 0.00MB
compressed_copy no
uncompressed_used_capacity 5.00GB
parent_mdisk_grp_id 0
parent_mdisk_grp_name test_pool_1
encrypt no
 
IBM_Storwize:ITSO_V7000G2_A:superuser>chvdisk -easytier on test_vol_2
 
IBM_Storwize:ITSO_V7000G2_A:superuser>lsvdisk test_vol_2
id 11
name test_vol_2
IO_group_id 0
IO_group_name io_grp0
status online
mdisk_grp_id 0
mdisk_grp_name test_pool_1
capacity 5.00GB
type striped
formatted no
formatting yes
mdisk_id
mdisk_name
FC_id
FC_name
RC_id
RC_name
vdisk_UID 600507680283818B300000000000000D
throttling 0
preferred_node_id 1
fast_write_state not_empty
cache readwrite
udid
fc_map_count 0
sync_rate 50
copy_count 1
se_copy_count 0
filesystem
mirror_write_priority latency
RC_change no
compressed_copy_count 0
access_IO_group_count 1
last_access_time
parent_mdisk_grp_id 0
parent_mdisk_grp_name test_pool_1
owner_type none
owner_id
owner_name
encrypt no
volume_id 11
volume_name test_vol_2
function
 
copy_id 0
status online
sync yes
auto_delete no
primary yes
mdisk_grp_id 0
mdisk_grp_name test_pool_1
type striped
mdisk_id
mdisk_name
fast_write_state not_empty
used_capacity 5.00GB
real_capacity 5.00GB
free_capacity 0.00MB
overallocation 100
autoexpand
warning
grainsize
se_copy no
easy_tier on
easy_tier_status balanced
tier tier0_flash
tier_capacity 5.00GB
tier tier1_flash
tier_capacity 0.00GB
tier tier_enterprise
tier_capacity 0.00MB
tier tier_nearline
tier_capacity 0.00MB
compressed_copy no
uncompressed_used_capacity 5.00GB
parent_mdisk_grp_id 0
parent_mdisk_grp_name test_pool_1
encrypt no
 
IBM_Storwize:ITSO_V7000G2_A:superuser>lsmdiskgrp test_pool_1
id 0
name test_pool_1
status online
mdisk_count 1
vdisk_count 10
capacity 2.70TB
extent_size 1024
free_capacity 1.52TB
virtual_capacity 185.00GB
used_capacity 185.00GB
real_capacity 185.00GB
overallocation 6
warning 80
easy_tier auto
easy_tier_status balanced
tier tier0_flash
tier_mdisk_count 1
tier_capacity 2.70TB
tier_free_capacity 2.52TB
tier tier1_flash
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_enterprise
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_nearline
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
compression_active no
compression_virtual_capacity 0.00MB
compression_compressed_capacity 0.00MB
compression_uncompressed_capacity 0.00MB
site_id
site_name
parent_mdisk_grp_id 0
parent_mdisk_grp_name test_pool_1
child_mdisk_grp_count 1
child_mdisk_grp_capacity 1.00TB
type parent
encrypt no
owner_type none
owner_id
owner_name
data_reduction no
used_capacity_before_reduction 0.00MB
used_capacity_after_reduction 0.00MB
deduplication_capacity_saving 0.00MB
reclaimable_capacity 0.00MB
 
IBM_Storwize:ITSO_V7000G2_A:superuser>chmdiskgrp -easytier off test_pool_1
 
IBM_Storwize:ITSO_V7000G2_A:superuser>lsmdiskgrp test_pool_1
id 0
name test_pool_1
status online
mdisk_count 1
vdisk_count 10
capacity 2.70TB
extent_size 1024
free_capacity 1.52TB
virtual_capacity 185.00GB
used_capacity 185.00GB
real_capacity 185.00GB
overallocation 6
warning 80
easy_tier off
easy_tier_status inactive
tier tier0_flash
tier_mdisk_count 1
tier_capacity 2.70TB
tier_free_capacity 2.52TB
tier tier1_flash
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_enterprise
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_nearline
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
compression_active no
compression_virtual_capacity 0.00MB
compression_compressed_capacity 0.00MB
compression_uncompressed_capacity 0.00MB
site_id
site_name
parent_mdisk_grp_id 0
parent_mdisk_grp_name test_pool_1
child_mdisk_grp_count 1
child_mdisk_grp_capacity 1.00TB
type parent
encrypt no
owner_type none
owner_id
owner_name
data_reduction no
used_capacity_before_reduction 0.00MB
used_capacity_after_reduction 0.00MB
deduplication_capacity_saving 0.00MB
reclaimable_capacity 0.00MB
Tuning EasyTier
It is also possible to change more advanced parameters of EasyTier. Adjust these parameters with caution because changing the default values can affect system performance.
EasyTier acceleration
EasyTier acceleration is a system-wide setting that is disabled by default. Turning on this setting makes EasyTier move extents up to four times faster than in the default setting. In accelerate mode, EasyTier can move up to 48 GiB every 5 minutes, whereas in normal mode it moves up to 12 GiB. Enabling EasyTier acceleration is advised only during periods of low system activity. The following use cases for acceleration are the most likely:
When adding capacity to the pool, accelerating EasyTier can quickly spread existing volumes onto the new MDisks.
When migrating the volumes between the storage pools in cases where the target storage pool has more tiers than the source storage pool, accelerating EasyTier can quickly promote or demote extents in the target pool.
This setting can be changed online, without any effect on host or data availability. To turn EasyTier acceleration mode on or off, use the chsystem command, as shown in Example 10-3.
Example 10-3 The chsystem command
IBM_Storwize:ITSO_V7000G2_A:superuser>lssystem
id 000001002140020E
name ITSO_V7000G2_A
location local
partnership
total_mdisk_capacity 1.5TB
space_in_mdisk_grps 1.1TB
space_allocated_to_vdisks 18.00MB
total_free_space 1.5TB
total_vdiskcopy_capacity 2.00GB
total_used_capacity 0.16MB
total_overallocation 0
total_vdisk_capacity 2.00GB
total_allocated_extent_capacity 1.00GB
statistics_status on
statistics_frequency 15
cluster_locale en_US
time_zone 520 US/Pacific
code_level 8.1.0.0 (build 137.4.1709191910000)
console_IP 10.18.228.70:443
id_alias 000001002140020E
gm_link_tolerance 300
gm_inter_cluster_delay_simulation 0
gm_intra_cluster_delay_simulation 0
gm_max_host_delay 5
email_reply [email protected]
email_contact ITSO
email_contact_primary 123456789
email_contact_alternate
email_contact_location
email_contact2
email_contact2_primary
email_contact2_alternate
email_state running
inventory_mail_interval 7
cluster_ntp_IP_address
cluster_isns_IP_address
iscsi_auth_method none
iscsi_chap_secret 1111
auth_service_configured no
auth_service_enabled no
auth_service_url
auth_service_user_name
auth_service_pwd_set no
auth_service_cert_set no
auth_service_type tip
relationship_bandwidth_limit 25
tier tier0_flash
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier1_flash
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_enterprise
tier_capacity 1.08TB
tier_free_capacity 1.08TB
tier tier_nearline
tier_capacity 0.00MB
easy_tier_acceleration off
has_nas_key no
layer storage
rc_buffer_size 48
compression_active yes
compression_virtual_capacity 2.00GB
compression_compressed_capacity 0.16MB
compression_uncompressed_capacity 0.00MB
cache_prefetch on
email_organization
email_machine_address
email_machine_city
email_machine_state XX
email_machine_zip
email_machine_country
total_drive_raw_capacity 4.91TB
compression_destage_mode off
local_fc_port_mask 0000000000000000000000000000000000000000000000000000000000000111
partner_fc_port_mask 0000000000000000000000000000000000000000000000000000000000001000
high_temp_mode off
topology standard
topology_status
rc_auth_method chap
vdisk_protection_time 15
vdisk_protection_enabled no
product_name IBM Storwize V7000
odx off
max_replication_delay 0
partnership_exclusion_threshold 315
gen1_compatibility_mode_enabled no
ibm_customer
ibm_component
ibm_country
tier0_flash_compressed_data_used 0.00MB
tier1_flash_compressed_data_used 0.00MB
tier_enterprise_compressed_data_used 0.00MB
tier_nearline_compressed_data_used 0.00MB
total_reclaimable_capacity 0.00MB
unmap on
used_capacity_before_reduction 0.00MB
used_capacity_after_reduction 0.00MB
deduplication_capacity_saving 0.00MB
 
IBM_Storwize:ITSO_V7000G2_A:superuser>chsystem -easytieracceleration on
 
IBM_Storwize:ITSO_V7000G2_A:superuser>lssystem
id 000001002140020E
name ITSO_V7000G2_A
location local
partnership
total_mdisk_capacity 1.5TB
space_in_mdisk_grps 1.1TB
space_allocated_to_vdisks 18.00MB
total_free_space 1.5TB
total_vdiskcopy_capacity 2.00GB
total_used_capacity 0.16MB
total_overallocation 0
total_vdisk_capacity 2.00GB
total_allocated_extent_capacity 1.00GB
statistics_status on
statistics_frequency 15
cluster_locale en_US
time_zone 520 US/Pacific
code_level 8.1.0.0 (build 137.4.1709191910000)
console_IP 10.18.228.70:443
id_alias 000001002140020E
gm_link_tolerance 300
gm_inter_cluster_delay_simulation 0
gm_intra_cluster_delay_simulation 0
gm_max_host_delay 5
email_reply [email protected]
email_contact ITSO
email_contact_primary 123456789
email_contact_alternate
email_contact_location
email_contact2
email_contact2_primary
email_contact2_alternate
email_state running
inventory_mail_interval 7
cluster_ntp_IP_address
cluster_isns_IP_address
iscsi_auth_method none
iscsi_chap_secret 1111
auth_service_configured no
auth_service_enabled no
auth_service_url
auth_service_user_name
auth_service_pwd_set no
auth_service_cert_set no
auth_service_type tip
relationship_bandwidth_limit 25
tier tier0_flash
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier1_flash
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_enterprise
tier_capacity 1.08TB
tier_free_capacity 1.08TB
tier tier_nearline
tier_capacity 0.00MB
tier_free_capacity 0.00MB
easy_tier_acceleration on
has_nas_key no
layer storage
rc_buffer_size 48
compression_active yes
compression_virtual_capacity 2.00GB
compression_compressed_capacity 0.16MB
compression_uncompressed_capacity 0.00MB
cache_prefetch on
email_organization
email_machine_address
email_machine_city
email_machine_state XX
email_machine_zip
email_machine_country
total_drive_raw_capacity 4.91TB
compression_destage_mode off
local_fc_port_mask 0000000000000000000000000000000000000000000000000000000000000111
partner_fc_port_mask 0000000000000000000000000000000000000000000000000000000000001000
high_temp_mode off
topology standard
topology_status
rc_auth_method chap
vdisk_protection_time 15
vdisk_protection_enabled no
product_name IBM Storwize V7000
odx off
max_replication_delay 0
partnership_exclusion_threshold 315
gen1_compatibility_mode_enabled no
ibm_customer
ibm_component
ibm_country
tier0_flash_compressed_data_used 0.00MB
tier1_flash_compressed_data_used 0.00MB
tier_enterprise_compressed_data_used 0.00MB
tier_nearline_compressed_data_used 0.00MB
total_reclaimable_capacity 0.00MB
unmap on
used_capacity_before_reduction 0.00MB
used_capacity_after_reduction 0.00MB
deduplication_capacity_saving 0.00MB
MDisk EasyTier load
The second setting is called MDisk EasyTier load. This setting is set on a per-MDisk basis, and it indicates how much load EasyTier can put on a particular MDisk. The following values can be set for each MDisk:
Default
Low
Medium
High
Very high
The system uses a default setting that is based on the storage tier of the presented MDisks: flash, nearline, or enterprise. If the disk drives are internal, the tier is known. However, the tier of an external MDisk should be changed by the user to align it with the underlying storage.
Change the default setting to any other value only when you are certain that a particular MDisk is underutilized and can handle more load, or that the MDisk is overutilized and the load should be lowered. Change this setting to very high only for SSD and flash MDisks.
This setting can be changed online, without any effect on the hosts or data availability. To change this setting, use the chmdisk command, as shown in Example 10-4.
Example 10-4 The chmdisk command
IBM_Storwize:ITSO_V7000G2_A:superuser>lsmdisk mdisk0
id 2
name mdisk0
status online
mode array
mdisk_grp_id
mdisk_grp_name
capacity 19.0TB
quorum_index
block_size
controller_name
ctrl_type
ctrl_WWNN
controller_id
path_count
max_path_count
ctrl_LUN_#
UID
preferred_WWPN
active_WWPN
fast_write_state not_empty
raid_status online
raid_level raid6
redundancy 2
strip_size 256
spare_goal
spare_protection_min
balanced exact
tier tier_enterprise
slow_write_priority latency
fabric_type
site_id
site_name
easy_tier_load medium
encrypt no
distributed yes
drive_class_id 0
drive_count 24
stripe_width 12
rebuild_areas_total 2
rebuild_areas_available 2
rebuild_areas_goal 2
dedupe no
preferred_iscsi_port_id
active_iscsi_port_id
replacement_date
 
IBM_Storwize:ITSO_V7000G2_A:superuser>chmdisk -easytierload high mdisk0
 
IBM_Storwize:ITSO_V7000G2_A:superuser>lsmdisk mdisk0
id 2
name mdisk0
status online
mode array
mdisk_grp_id
mdisk_grp_name
capacity 19.0TB
quorum_index
block_size
controller_name
ctrl_type
ctrl_WWNN
controller_id
path_count
max_path_count
ctrl_LUN_#
UID
preferred_WWPN
active_WWPN
fast_write_state not_empty
raid_status online
raid_level raid6
redundancy 2
strip_size 256
spare_goal
spare_protection_min
balanced exact
tier tier_enterprise
slow_write_priority latency
fabric_type
site_id
site_name
easy_tier_load high
encrypt no
distributed yes
drive_class_id 0
drive_count 24
stripe_width 12
rebuild_areas_total 2
rebuild_areas_available 2
rebuild_areas_goal 2
dedupe no
preferred_iscsi_port_id
active_iscsi_port_id
replacement_date
10.2.9 Monitoring tools
The IBM Storage Tier Advisor Tool (STAT) is a Microsoft Windows application that analyzes heat map data files produced by EasyTier. STAT creates a graphical display of the amount of “hot” data per volume and predicts, by storage pool, how adding more Flash drive, enterprise drive, or nearline drive capacity might improve system performance.
The IBM STAT can be downloaded from:
Heat data files are produced approximately once a day (that is, every 24 hours) when EasyTier is active on one or more storage pools. They summarize the activity per volume since the prior heat data file was produced. On the Storwize V7000, the heat data file is in the /dumps/easytier directory on the configuration node and is named dpa_heat.<node_name>.<time_stamp>.data. Existing heat data files are erased after seven days.
To download a heat data file, open Settings → Support → Support Package and click the twistie to open the Manual Upload Instructions, then click the Download Support Package button, as shown in Figure 10-10.
Figure 10-10 Download EasyTier heat file: Download Support Package
From the Download New Support Package or Log File window, click Download Existing Package, as shown in Figure 10-11.
Figure 10-11 Downloading EasyTier heat data file: Download Existing Package
You can filter for heat files and select the most recent heat data file from the window shown in Figure 10-12. Click Download and save the file wherever you wish. In our example, we save the file on our workstation to the STAT\input_files directory.
Figure 10-12 Downloading EasyTier heat data file: Filter, select, and download
STAT must be started from a Windows command prompt with the file specified as a parameter, as shown in Example 10-5.
Example 10-5 Running STAT in Windows command prompt
C:\Program Files (x86)\IBM\STAT>stat input_files\dpa_heat.KD8P1BP.171018.095715.data
You can also specify the output directory if you want. STAT creates a set of Hypertext Markup Language (HTML) files, and the user can then open the STAT\index.html file in a browser to view the results. Additionally, three comma-separated values (CSV) files are created and placed in the Data_files directory.
Figure 10-13 shows the CSV files highlighted in the Data_files directory after running the STAT tool on the Storwize V7000 heatmap.
Figure 10-13 CSV files created by the STAT for EasyTier
In addition to the STAT tool, the IBM STAT Charting Utility, a Microsoft Excel file, is available for creating additional graphical reports of the workload that EasyTier performs. The utility takes the previous three CSV output files and turns them into graphs for simple reporting.
The STAT Charting Utility can be downloaded from the IBM Support website:
The new graphs display the following information:
Workload Categorization
New workload visuals help you compare activity across tiers within and across pools to help determine the optimal drive mix for the current workloads. The output is illustrated in Figure 10-14.
Figure 10-14 STAT Charting Utility Workload Categorization report
Daily Movement report
A new EasyTier summary report every 24 hours illustrating data migration activity (5-minute intervals) can help visualize migration types and patterns for current workloads. The output is illustrated in Figure 10-15.
Figure 10-15 STAT Charting Utility Daily Summary report
Workload Skew report
This report shows the skew of all workloads across the system in a graph to help you visualize and accurately tier configurations when you add capacity or a new system. The output is illustrated in Figure 10-16.
Figure 10-16 STAT Charting Utility Workload Skew report
10.2.10 More information
For more information about planning and configuration considerations, best practices, and monitoring and measurement tools, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521, and Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072.
10.3 Thin provisioning
In a shared storage environment, thin provisioning is a method for optimizing the usage of available storage. It relies on allocating blocks of data on demand versus the traditional method of allocating all of the blocks up front. This method eliminates almost all white space, which helps avoid the poor usage rates (often as low as 10%) that occur in the traditional storage allocation method. Traditionally, large pools of storage capacity are allocated to individual servers but remain unused (not written to).
Thin provisioning presents more storage space to the hosts or servers that are connected to the storage system than is available on the storage system. The IBM Storwize V7000 supports this capability for Fibre Channel (FC) and Internet Small Computer System Interface (iSCSI) provisioned volumes.
An example of thin provisioning is when a storage system contains 5000 GiB of usable storage capacity, but the storage administrator mapped volumes of 500 GiB each to 15 hosts. In this example, the storage administrator makes 7500 GiB of storage space visible to the hosts, even though the storage system has only 5000 GiB of usable space, as shown in Figure 10-17. In this case, all 15 hosts cannot immediately use all 500 GiB that is provisioned to them. The storage administrator must monitor the system and add storage as needed.
Figure 10-17 Concept of thin provisioning
You can imagine thin provisioning as the same process as when airlines sell more tickets on a flight than physical seats are available, assuming that some passengers do not appear at check-in. They do not assign actual seats at the time of sale, which avoids each client having a claim on a specific seat number. The same concept applies to thin provisioning (the airline), IBM Storwize V7000 (the plane), and its volumes (seats). The storage administrator (airline ticketing system) must closely monitor the allocation process and set proper thresholds.
10.3.1 Configuring a thin-provisioned volume
Volumes can be configured as thin-provisioned or fully allocated. Thin-provisioned volumes are created with real and virtual capacities. You can still create volumes by using a striped, sequential, or image mode virtualization policy, as you can with any other volume.
Real capacity defines how much disk space is allocated to a volume. Virtual capacity is the capacity of the volume that is reported to other IBM Storwize V7000 components (such as FlashCopy or remote copy) and to the hosts. For example, you can create a volume with real capacity of only 100 GiB, but virtual capacity of 1 tebibyte (TiB). The actual space used by the volume on IBM Storwize V7000 is 100 GiB, but hosts see a 1 TiB volume.
A directory maps the virtual address space to the real address space. The directory and the user data share the real capacity.
Thin-provisioned volumes are available in two operating modes: Autoexpand and non-autoexpand. You can switch the mode at any time. If you select the autoexpand feature, the IBM Storwize V7000 automatically adds a fixed amount of more real capacity to the thin volume as required. Therefore, the autoexpand feature attempts to maintain a fixed amount of unused real capacity for the volume. This amount is known as the contingency capacity.
The contingency capacity is initially set to the real capacity that is assigned when the volume is created. If the user modifies the real capacity, the contingency capacity is reset to be the difference between the used capacity and real capacity.
A volume that is created without the autoexpand feature, and therefore has a zero contingency capacity, goes offline when the real capacity is used and the volume must expand.
 
Warning threshold: Enable the warning threshold, by using email or a Simple Network Management Protocol (SNMP) trap, when you work with thin-provisioned volumes. You can enable the warning threshold on the volume, and on the storage pool side, especially when you do not use the autoexpand mode. Otherwise, the thin volume goes offline if it runs out of space.
Autoexpand mode does not cause real capacity to grow much beyond the virtual capacity. The real capacity can be manually expanded to more than the maximum that is required by the current virtual capacity, and the contingency capacity is recalculated.
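For illustration, the following sketch creates a thin-provisioned volume with autoexpand and a warning threshold from the CLI. The pool name, volume name, capacities, and percentages are hypothetical values only and must be adapted to your environment:
IBM_Storwize:ITSO_V7000G2:superuser>mkvdisk -mdiskgrp test_pool_1 -iogrp 0 -name thin_vol_1 -size 1 -unit tb -rsize 10% -autoexpand -grainsize 256 -warning 80%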
A thin-provisioned volume can be converted non-disruptively to a fully allocated volume, or vice versa, by using the volume mirroring function. For example, you can add a thin-provisioned copy to a fully allocated primary volume, and then remove the fully allocated copy from the volume after they are synchronized.
The fully allocated to thin-provisioned migration procedure uses a zero-detection algorithm, so that grains that contain all zeros do not cause any real capacity to be used. Usually, if IBM Storwize V7000 is to detect zeros on the volume, you must use software on the host side to write zeros to all unused space on the disk or file system.
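The conversion can be sketched with the volume mirroring commands that follow. The volume name, pool, real size, and copy ID are placeholders, and the fully allocated copy should be removed only after lsvdisksyncprogress reports that the new copy is fully synchronized:
IBM_Storwize:ITSO_V7000G2:superuser>addvdiskcopy -mdiskgrp test_pool_1 -rsize 2% -autoexpand full_vol_1
IBM_Storwize:ITSO_V7000G2:superuser>lsvdisksyncprogress full_vol_1
IBM_Storwize:ITSO_V7000G2:superuser>rmvdiskcopy -copy 0 full_vol_1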
 
Tip: Consider the use of thin-provisioned volumes as targets in the FlashCopy mappings.
Space allocation
When a thin-provisioned volume is created, a small amount of the real capacity is used for initial metadata. Write I/Os to the grains of the thin volume (that were not previously written to) cause grains of the real capacity to be used to store metadata and user data. Write I/Os to the grains (that were previously written to) update the grain where data was previously written.
The grain size is defined when the volume is created, and can be 32 KiB, 64 KiB, 128 KiB, or 256 KiB.
Smaller granularities can save more space, but they have larger directories. When you use thin-provisioning with FlashCopy, specify the same grain size for the thin-provisioned volume and FlashCopy.
For information about creating a thin-provisioned volume, see Chapter 7, “Volumes” on page 239.
10.3.2 Performance considerations
Thin-provisioned volumes save capacity only if the host server does not write to the whole volume. Whether a thin-provisioned volume works well partly depends on how the file system allocates space. Some file systems, for example, New Technology File System (NTFS), write to the whole volume before overwriting deleted files. Other file systems reuse space in preference to allocating new space.
File system problems can be moderated by tools, such as defrag, or by managing storage by using host Logical Volume Managers (LVMs). The thin-provisioned volume also depends on how applications use the file system. For example, some applications delete log files only when the file system is nearly full.
 
Important: Do not use defrag on thin-provisioned volumes. The defragmentation process can write data to different areas of a volume, which can cause a thin-provisioned volume to grow up to its virtual size.
There is no single recommendation for thin-provisioned volumes. As explained previously, the performance of thin-provisioned volumes depends on how they are used in a particular environment. For the best performance, use fully allocated volumes rather than thin-provisioned volumes.
Starting with IBM Spectrum Virtualize V7.3, the cache subsystem architecture was redesigned. Now, thin-provisioned volumes can benefit from lower cache functions (such as coalescing writes or prefetching), which greatly improve performance.
10.3.3 Limitations of virtual capacity
A few factors (extent and grain size) limit the virtual capacity of thin-provisioned volumes beyond the factors that limit the capacity of regular volumes. Table 10-3 shows the maximum thin provisioned volume virtual capacities for an extent size.
Table 10-3 Maximum thin-provisioned volume virtual capacities for an extent size

Extent size (MB)    Maximum volume real capacity (GB)    Maximum thin virtual capacity (GB)
16                  2,048                                2,000
32                  4,096                                4,000
64                  8,192                                8,000
128                 16,384                               16,000
256                 32,768                               32,000
512                 65,536                               65,000
1,024               131,072                              130,000
2,048               262,144                              260,000
4,096               262,144                              262,144
8,192               262,144                              262,144
Table 10-4 shows the maximum thin-provisioned volume virtual capacities for a grain size.
Table 10-4 Maximum thin volume virtual capacities for a grain size

Grain size (KiB)    Maximum thin virtual capacity (GiB)
32                  260,000
64                  520,000
128                 1,040,000
256                 2,080,000
For more information and detailed performance considerations for configuring thin provisioning, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521. You can also find more information in IBM Knowledge Center.
10.4 Unmap
There is an industry trend toward host operating systems having more control over the storage that they use. VMware's VAAI/VASA/VVols and Microsoft's ODX are examples of such technologies. These technologies allow host operating systems to manage data on the controller (for example, to provision storage or move data around) without a storage administrator having to do anything on the storage controller. This change also means that a user of the host operating system does not need to know anything about the underlying storage technologies.
10.4.1 SCSI unmap command
Unmap is a set of SCSI primitives that allow hosts to indicate to a SCSI target that space allocated to a range of blocks on a target storage volume is no longer required. This command allows the storage controller to take measures and optimize the system so that the space can be reused for other purposes. The most common use case, for example, is a host application such as VMware freeing storage within a file system. The storage controller can then optimize the space, such as reorganize the data on the volume so that space is better used.
When a host allocates storage, the data is placed in a volume. Without unmap, human intervention is needed on the storage controller to free the allocated space back to the storage pool. The SCSI unmap feature allows host operating systems to unprovision storage on the storage controller, which means that the resources can be freed up automatically in the storage pools and used for other purposes.
A SCSI unmappable volume is a volume on which storage unprovisioning and space reclamation can be triggered by the host operating system. With the release of V8.1 code, the SCSI unmap command is passed through to back-end storage controllers that support the function.
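As a hedged host-side illustration (the device path and mount point are example assumptions), a Linux host can trigger space reclamation on an unmap-capable volume with the fstrim utility, or by mounting the file system with the discard option so that unmaps are issued as files are deleted:
fstrim -v /mnt/datavol
mount -o discard /dev/mapper/datavg-datalv /mnt/datavol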
 
 
Note: Some host types respond to the unmap capability by issuing WRITE SAME with UNMAP commands, which can generate large amounts of I/O. Enable offload throttling before upgrading to V8.1 to prevent this extra workload from overloading MDisks. For more information, see:
10.5 Real-time Compression
The IBM Real-time Compression (RtC) software that is embedded in IBM Spectrum Virtualize addresses the requirements for primary storage data reduction, including performance. It does so by using a purpose-built technology, called Real-time Compression, that uses the Random Access Compression Engine (RACE). It offers the following benefits:
Compression for active primary data
IBM Real-time Compression can be used with active primary data. Therefore, it supports workloads that are not candidates for compression in other solutions. The solution supports online compression of existing data. Storage administrators can regain free disk space in an existing storage system without requiring administrators and users to clean up or archive data.
This configuration significantly enhances the value of existing storage assets, and the benefits to the business are immediate. The capital expense of upgrading or expanding the storage system is delayed.
Compression for replicated or mirrored data
Remote volume copies can be compressed, in addition to the volumes at the primary storage tier. This process reduces storage requirements in Metro Mirror and Global Mirror destination volumes as well.
No changes to the existing environment are required
IBM Real-time Compression is part of the storage system. It was designed with transparency in mind so that it can be implemented without changes to applications, hosts, networks, fabrics, or external storage systems. The solution is transparent to hosts, so users and applications continue to work as-is. Compression occurs within the IBM Spectrum Virtualize software.
Overall savings in operational expenses
More data is stored in a rack space, so fewer storage expansion enclosures are required to store a data set. This reduced rack space has the following benefits:
 – Reduced power and cooling requirements. More data is stored in a system, which requires less power and cooling per gigabyte of used capacity.
 – Reduced software licensing for more functions in the system. More data that is stored per enclosure reduces the overall spending on licensing.
 
Tip: Implementing compression in IBM Spectrum Virtualize provides the same benefits to internal Flash drives and externally virtualized storage systems.
Disk space savings are immediate
The space reduction occurs when the host writes the data. This process is unlike other compression solutions in which some or all of the reduction is realized only after a post-process compression batch job is run.
 
Demonstration: The IBM Client Demonstration Center shows how easy it is to reduce your data footprint in the Data Reduction: reduce easily data footprint on your existing “IBM Spectrum Virtualize based” storage with IBM Real-Time Compression demo available at:
10.5.1 Common use cases
This section addresses the most common use cases for implementing compression:
General-purpose volumes
Databases
Virtualized infrastructures
Log server data stores
General-purpose volumes
Most general-purpose volumes are used for highly compressible data types, such as home directories, CAD/CAM, oil and gas geo-seismic data, and log data. Storing such types of data in compressed volumes provides immediate capacity reduction to the overall used space. More space can be provided to users without any change to the environment.
Many file types can be stored on general-purpose servers. The estimated compression ratios cited here are based on actual field experience. Expected compression ratios are 50% to 60%.
File systems that contain audio, video files, and compressed files are not good candidates for compression. The overall capacity savings on these file types are minimal.
Databases
Database information is stored in table space files. It is common to observe high compression ratios in database volumes. Examples of databases that can greatly benefit from Real-Time Compression are IBM DB2, Oracle, and Microsoft SQL Server. Expected compression ratios are 50% to 80%.
 
Important: Some databases offer optional built-in compression. Generally, do not compress already compressed database files.
Virtualized infrastructures
The proliferation of open systems virtualization in the market has increased the use of storage space, with more virtual server images and backups kept online. The use of compression reduces the storage requirements at the source.
Examples of virtualization solutions that can greatly benefit from Real-time Compression are VMware, Microsoft Hyper-V, and KVM. Expected compression ratios are 45% to 75%.
 
Tip: Virtual machines with file systems that contain compressed files are not good candidates for compression, as described in “General-purpose volumes”.
Log server data stores
Logs are a critical part of any information technology (IT) department in any organization. Log aggregation and syslog servers are a central point for administrators, so immediate access and a smooth work process are necessary. Log server data stores are good candidates for Real-time Compression. Expected compression ratios are up to 90%.
 
10.5.2 Real-time Compression concepts
RACE technology is based on over 50 patents that are not primarily about compression. Instead, they define how to make industry-standard Lempel-Ziv (LZ) compression of primary storage operate in real-time and allow random access. The primary intellectual property behind this technology is the RACE engine.
At a high level, the IBM RACE component compresses data that is written into the storage system dynamically. This compression occurs transparently, so Fibre Channel and iSCSI connected hosts are not aware of the compression. RACE is an inline compression technology, meaning that each host write is compressed as it passes through the IBM Spectrum Virtualize to the disks. This technique has a clear benefit over other compression technologies that are post-processing based.
Those technologies do not provide immediate capacity savings. Therefore, they are not a good fit for primary storage workloads, such as databases and active data set applications.
RACE is based on the Lempel-Ziv lossless data compression algorithm and operates using a real-time method. When a host sends a write request, it is acknowledged by the write cache of the system, and then staged to the storage pool. As part of its staging, it passes through the compression engine and is then stored in compressed format onto the storage pool. Therefore, writes are acknowledged immediately after they are received by the write cache, with compression occurring as part of the staging to internal or external physical storage.
Capacity is saved when the data is written by the host because the host writes are smaller when they are written to the storage pool. IBM Real-time Compression is a self-tuning solution. It adapts to the workload that runs on the system at any particular moment.
10.5.3 Random Access Compression Engine
To understand why RACE is unique, you need to review the traditional compression techniques. This description is not about the compression algorithm itself, that is, how the data structure is reduced in size mathematically. Rather, the description is about how the data is laid out within the resulting compressed output.
Compression utilities
Compression is probably most known to users because of the widespread use of compression utilities. At a high level, these utilities take a file as their input and parse the data by using a sliding window technique. Repetitions of data are detected within the sliding window history, most often 32 KiB. Repetitions outside of the window cannot be referenced. Therefore, the file cannot be reduced in size unless data is repeated when the window “slides” to the next 32 KiB slot.
Figure 10-18 shows compression that uses a sliding window, where the first two repetitions of the string “ABCD” fall within the same compression window, and can therefore be compressed by using the same dictionary. The third repetition of the string falls outside of this window, and therefore cannot be compressed by using the same compression dictionary as the first two repetitions, reducing the overall achieved compression ratio.
Figure 10-18 Compression that uses a sliding window
Traditional data compression in storage systems
The traditional approach taken to implement data compression in storage systems is an extension of how compression works in the previously mentioned compression utilities. Similar to compression utilities, the incoming data is broken into fixed chunks, and then each chunk is compressed and extracted independently.
However, there are drawbacks to this approach. An update to a chunk requires a read of the chunk followed by a recompression of the chunk to include the update. The larger the chunk size chosen, the heavier the I/O penalty to recompress the chunk. If a small chunk size is chosen, the compression ratio is reduced because the repetition detection potential is reduced.
Figure 10-19 shows an example of how the data is broken into fixed-size chunks (in the upper-left side of the figure). It also shows how each chunk gets compressed independently into variable length compressed chunks (in the upper-right side of the figure). The resulting compressed chunks are stored sequentially in the compressed output.
Although this approach is an evolution from compression utilities, it is limited to low-performance use cases. This limitation is mainly because it does not provide real random access to the data.
Figure 10-19 Traditional data compression in storage systems
Random Access Compression Engine
The IBM patented RACE implements an inverted approach when compared to traditional approaches to compression. RACE uses variable-size chunks for the input, and produces fixed-size chunks for the output.
This method enables an efficient and consistent way to index the compressed data because it is stored in fixed-size containers (Figure 10-20).
Figure 10-20 Random Access Compression
Location-based compression
Both compression utilities and traditional storage systems compression compress data by finding repetitions of bytes within the chunk that is being compressed. The compression ratio of this chunk depends on how many repetitions can be detected within the chunk. The number of repetitions is affected by how much the bytes stored in the chunk are related to each other. The relation between bytes is driven by the format of the object. For example, an office document might contain textual information, and an embedded drawing (like this page).
Because the chunking of the file is arbitrary, it has no concept of how the data is laid out within the document. Therefore, a compressed chunk can be a mixture of the textual information and part of the drawing. This process yields a lower compression ratio because the different data types mixed together cause a suboptimal dictionary of repetitions. That is, fewer repetitions can be detected because a repetition of bytes in a text object is unlikely to be found in a drawing.
This traditional approach to data compression is also called location-based compression. The data repetition detection is based on the location of data within the same chunk.
This challenge was addressed with the predecide mechanism introduced from V7.1.
Predecide mechanism
Some data chunks have a higher compression ratio than others. Compressing some of the chunks saves little space, but still requires resources, such as processor (CPU) and memory. To avoid spending resources on uncompressible data, and to provide the ability to use a different, more effective (in this particular case) compression algorithm, IBM has invented a predecide mechanism that was first introduced in V7.1.
The chunks that are below a given compression ratio are skipped by the compression engine, which saves CPU time and memory processing. Chunks that do not compress well with the main compression algorithm, but that can still be compressed well with the alternative algorithm, are marked and processed accordingly. The result might vary because predecide does not check the entire block, only a sample of it.
Figure 10-21 shows how the detection mechanism works.
Figure 10-21 Detection mechanism
Temporal compression
RACE offers a technology leap beyond location-based compression, called temporal compression. When host writes arrive to RACE, they are compressed and fill up fixed size chunks, also called compressed blocks. Multiple compressed writes can be aggregated into a single compressed block. A dictionary of the detected repetitions is stored within the compressed block.
When applications write new data or update existing data, it is typically sent from the host to the storage system as a series of writes. Because these writes are likely to originate from the same application and be of the same data type, more repetitions are usually detected by the compression algorithm. This type of data compression is called temporal compression because the data repetition detection is based on the time the data was written into the same compressed block.
Temporal compression adds the time dimension that is not available to other compression algorithms. It offers a higher compression ratio because the compressed data in a block represents a more homogeneous set of input data.
Figure 10-22 shows how three writes sent one after the other by a host end up in different chunks. They get compressed in different chunks because their locations in the volume are not adjacent. This process yields a lower compression ratio because the same data must be compressed by using three separate dictionaries.
Figure 10-22 Location-based compression
When the same three writes are sent through RACE, as shown on Figure 10-23, the writes are compressed together by using a single dictionary. This process yields a higher compression ratio than location-based compression.
Figure 10-23 Temporal compression
10.5.4 Dual RACE instances
In V7.4, the compression code was enhanced by the addition of a second RACE instance per node. This feature takes advantage of multi-core processor architecture, and uses the compression accelerator cards more effectively. The second RACE instance works in parallel with the first instance, as shown in Figure 10-24.
Figure 10-24 Dual RACE architecture
With dual RACE enhancement, the compression performance can be boosted up to two times for compressed workloads when compared to previous versions.
To take advantage of dual RACE, several software and hardware requirements must be met:
The software must be at or above V7.4.
Only Storwize V7000 Gen2 is supported.
 
Tip: Use two compression accelerator cards for the best performance.
When using the dual RACE feature, the acceleration cards are shared between RACE instances, which means that the acceleration cards are used simultaneously by both RACE instances. The rest of the resources, such as processor (CPU) cores and random access memory (RAM), are evenly divided between the RACE components.
You do not need to manually enable dual RACE. Dual RACE runs automatically when all minimal software and hardware requirements are met. If the Storwize V7000 Gen2 is compression capable but the minimal requirements for dual RACE are not met, only one RACE instance is used (as with earlier versions of code).
10.5.5 Random Access Compression Engine in IBM Spectrum Virtualize stack
It is important to understand where the RACE technology is implemented in the IBM Spectrum Virtualize stack. This location determines how it applies to other Storwize components.
RACE technology is implemented into the Storwize thin provisioning layer, and is an organic part of the stack. The IBM Spectrum Virtualize stack is shown in Figure 10-25. Compression is transparently integrated with existing system management design. All of the IBM Spectrum Virtualize advanced features are supported on compressed volumes. You can create, delete, migrate, map (assign), and unmap (unassign) a compressed volume as though it were a fully allocated volume.
In addition, you can use Real-time Compression along with EasyTier on the same volumes. This compression method provides nondisruptive conversion between compressed and decompressed volumes. This conversion provides a uniform user-experience and eliminates the need for special procedures when dealing with compressed volumes.
Figure 10-25 RACE integration within IBM Spectrum Virtualize stack
10.5.6 Data write flow
When a host sends a write request to Storwize V7000, it reaches the upper cache layer. The host is immediately sent an acknowledgment of its I/Os.
When the upper cache layer destages the data, the I/Os are sent to the thin-provisioning layer and then to RACE, where the original host write or writes are compressed. The metadata that holds the index of the compressed volume is updated if needed, and is compressed as well.
10.5.7 Data read flow
When a host sends a read request to the Storwize V7000 for compressed data, it is forwarded directly to the Real-time Compression component:
If the Real-time Compression component contains the requested data, Storwize V7000 cache replies to the host with the requested data without having to read the data from the lower-level cache or disk.
If the Real-time Compression component does not contain the requested data, the request is forwarded to the Storwize V7000 lower-level cache.
If the lower-level cache contains the requested data, it is sent up the stack and returned to the host without accessing the storage.
If the lower-level cache does not contain the requested data, it sends a read request to the storage for the requested data.
10.5.8 Compression of existing data
In addition to compressing data in real time, you can also compress existing data sets (convert a volume to a compressed volume). To do so, you must change the capacity savings settings of the volume by completing these steps:
1. Right-click a volume and select Modify Capacity Settings, as shown in Figure 10-26.
Figure 10-26 Modifying Capacity Settings
2. In the menu, select Compression as the Capacity Savings option, as shown in Figure 10-27.
Figure 10-27 Selecting Capacity Setting
This action adds a compressed copy to the volume and starts synchronization of the two copies. After the copies are fully synchronized, the original uncompressed volume copy is deleted automatically.
As a result, you have compressed data on the existing volume. This process is nondisruptive, so the data remains online and accessible by applications and users.
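The same conversion can also be sketched from the CLI by using volume mirroring; in this hedged example, the volume name SQL_Data0, pool name Pool0, and copy ID 0 are assumptions:
addvdiskcopy -mdiskgrp Pool0 -rsize 2% -autoexpand -compressed SQL_Data0
lsvdisksyncprogress SQL_Data0
rmvdiskcopy -copy 0 SQL_Data0
The compressed copy is added and synchronized, and the original uncompressed copy is then removed.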
With virtualization of external storage systems, the ability to compress already stored data significantly enhances and accelerates the benefit to users. It enables them to see a tremendous return on their Storwize V7000 investment. On initial purchase of a Storwize V7000 with Real-time Compression, customers can defer their purchase of new storage. As new storage needs to be acquired, IT can purchase less storage than would have been required without compression.
 
10.5.9 Configuring compressed volumes
To use compression on the Storwize V7000, licensing is required. With the Storwize V7000, Real-time Compression is licensed by capacity, per terabyte of virtual data.
For information about creating a compressed volume, see Chapter 7, “Volumes” on page 239.
10.5.10 Comprestimator
The Comprestimator utility, which estimates expected compression ratios on existing volumes, has been built into IBM Spectrum Virtualize since V7.6.
The built-in Comprestimator is a command-line function that analyzes an existing volume and provides output showing an estimate of the expected compression ratio.
Comprestimator uses advanced mathematical and statistical algorithms to sample and analyze online volumes quickly and efficiently. The utility also displays its accuracy level by showing the maximum error range of the results, based on the formulas that it uses.
The following commands are available:
The analyzevdisk command provides an option to analyze a single volume.
Usage: analyzevdisk <volume ID>
Example: analyzevdisk 0
This command can be canceled by running the analyzevdisk <volume ID> -cancel command.
The lsvdiskanalysis command provides a list and the status of the volumes. Some of them can be analyzed already, some of them not yet. The command can either be used for all volumes on the system or it can be used per volume, similar to lsvdisk. See Example 10-6.
Example 10-6 Example of the command run over one volume with ID 0
IBM_2076:ITSO Gen2:superuser>lsvdiskanalysis 0
id 0
name SQL_Data0
state estimated
started_time 151012104343
analysis_time 151012104353
capacity 300.00GB
thin_size 290.85GB
thin_savings 9.15GB
thin_savings_ratio 3.05
compressed_size 141.58GB
compression_savings 149.26GB
compression_savings_ratio 51.32
total_savings 158.42GB
total_savings_ratio 52.80
accuracy 4.97
The state parameter can have the following values:
 – idle. Was never estimated and not currently scheduled.
 – scheduled. Volume is queued for estimation, and will be processed based on lowest volume ID first.
 – active. Volume is being analyzed.
 – canceling. Volume was requested to cancel an active analysis, and analysis was not yet canceled.
 – estimated. Volume was analyzed and results show the expected savings of thin provisioning and compression.
 – sparse. Volume was analyzed but Comprestimator could not find enough nonzero samples to establish a good estimation.
The compression_savings_ratio is the estimated amount of space that can be saved on the storage for this specific volume, expressed as a percentage.
The analyzevdiskbysystem command provides an option to run Comprestimator on all volumes within the system. The analyzing process is nondisruptive and should not affect the system significantly. Analysis speed might vary depending on the fullness of the volume, but should not take more than a few minutes per volume.
This process can be canceled by running the analyzevdiskbysystem -cancel command.
The lsvdiskanalysisprogress command shows the progress of the Comprestimator analysis as shown in Example 10-7.
Example 10-7 Comprestimator progress
id vdisk_count pending_analysis estimated_completion_time
0 45 12 151012154400
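Taken together, a typical system-wide estimation run is a sequence such as the following sketch, which uses only the commands described above:
analyzevdiskbysystem
lsvdiskanalysisprogress
lsvdiskanalysis
The first command queues all volumes for analysis, the second tracks the remaining work, and the third lists the per-volume savings estimates after the analysis completes.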