Advanced features for storage efficiency
This chapter introduces the basic concepts of dynamic data relocation and storage optimization. IBM Spectrum Virtualize, running inside the IBM SAN Volume Controller (SVC), offers several functions for storage efficiency. The chapter provides a basic technical overview and describes the benefits of each feature. For more information about planning and configuration, see the following publications:
IBM EasyTier:
 – Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072
 – IBM DS8000 EasyTier (for DS8880 R8.3 or later), REDP-4667 (similar concept to SVC EasyTier)
 – IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521
Thin provisioning:
 – Thin Provisioning in an IBM SAN or IP SAN Enterprise Environment, REDP-4265
 – DS8000 Thin Provisioning, REDP-4554 (similar concept to IBM SAN Volume Controller thin provisioning)
IBM Real-time Compression (RtC):
 – IBM Real-time Compression in IBM SAN Volume Controller and IBM Storwize V7000, REDP-4859
 – Implementing IBM Real-time Compression in SAN Volume Controller and IBM Storwize V7000, TIPS1083
 – Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072
This chapter includes the following topics:
10.1 Introduction
In modern and complex application environments, the increasing and often unpredictable demands for storage capacity and performance lead to issues of planning and optimization of storage resources.
Consider the following typical storage management issues:
Usually when a storage system is implemented, only a portion of the configurable physical capacity is deployed. When the storage system runs out of the installed capacity and more capacity is requested, a hardware upgrade is implemented to add physical resources to the storage system. It is difficult to configure this new physical capacity to keep an even spread of the overall storage resources.
Typically, the new capacity is allocated to fulfill only new storage requests. The existing storage allocations do not benefit from the new physical resources. Similarly, the new storage requests do not benefit from the existing resources. Only new resources are used.
In a complex production environment, it is not always possible to optimize storage allocation for performance. The unpredictable rate of storage growth and the fluctuations in throughput requirements, which are input/output operations per second (IOPS), often lead to inadequate performance.
Furthermore, the tendency to use even larger volumes to simplify storage management works against the granularity of storage allocation, and a cost-efficient storage tiering solution becomes difficult to achieve. With the introduction of high performing technologies, such as Flash drives or all flash arrays, this challenge becomes even more important.
The move to larger and larger physical disk drive capacities means that previous access densities that were achieved with low-capacity drives can no longer be sustained.
Any business has applications that are more critical than others, and a need exists for specific application optimization. Therefore, the ability to relocate specific application data to a faster storage media is needed.
Although more servers are purchased with internal SSD drives for better application response time, the data distribution across these internal SSDs and external storage arrays must be carefully planned. An integrated and automated approach is crucial to achieve performance improvement without compromise to data consistency, especially in a disaster recovery (DR) situation.
All of these issues deal with data placement and relocation capabilities or data volume reduction. Most of these challenges can be managed by having spare resources available, moving data, and by using data mobility tools or operating systems features (such as host level mirroring) to optimize storage configurations.
However, all of these corrective actions are expensive in terms of hardware resources, labor, and service availability. The ability to relocate data dynamically among the physical storage resources, or to effectively reduce the amount of stored data, transparently to the attached host systems, is becoming increasingly important.
10.2 EasyTier
In today’s storage market, Flash drives and Flash arrays are emerging as an attractive alternative to hard disk drives (HDDs). Because of their low response times, high throughput, and IOPS-energy-efficient characteristics, Flash drives and Flash arrays have the potential to allow your storage infrastructure to achieve significant savings in operational costs.
However, the current acquisition cost per gibibyte (GiB) for Flash drives or Flash arrays is higher than for HDDs. Flash drive and Flash array performance depends on workload characteristics. Therefore, they should be used together with HDDs for optimal cost/performance.
Choosing the correct mix of drives and the correct data placement is critical to achieve optimal performance at low cost. Maximum value can be derived by placing “hot” data with high input/output (I/O) density and low response time requirements on Flash drives or Flash arrays, and targeting HDDs for “cooler” data that is accessed more sequentially and at lower rates.
EasyTier automates the placement of data among different storage tiers, and it can be enabled for internal and external storage. This IBM Spectrum Virtualize feature boosts your storage infrastructure to achieve optimal performance through a combined software, server, and storage solution.
Additionally, the no-charge feature called storage pool balancing, introduced in the IBM Spectrum Virtualize V7.3, automatically moves extents within the same storage tier, from overloaded to less loaded managed disks (MDisks). Storage pool balancing ensures that your data is optimally placed among all MDisks within a storage pool.
10.2.1 EasyTier concepts
IBM Spectrum Virtualize implements EasyTier enterprise storage functions, which were originally available on IBM System Storage DS8000® and IBM XIV enterprise class storage systems. It enables automated subvolume data placement throughout different (or within the same) storage tiers to intelligently align the system with current workload requirements, and to optimize the usage of Flash drives or flash arrays.
This function includes the ability to automatically and non-disruptively relocate data (at the extent level) from one tier to another tier (or even within the same tier), in either direction. This feature helps achieve the best available storage performance for your workload in your environment. EasyTier reduces the I/O latency for hot spots, but it does not replace storage cache.
Both EasyTier and storage cache address a similar access-latency problem. However, the two methods weight locality of reference, recency, and frequency differently in their algorithms. Because EasyTier monitors I/O performance from the device end (after cache), it can pick up the performance issues that cache cannot solve, and complement the overall storage system performance.
Figure 10-1 shows placement of the EasyTier engine within the IBM Spectrum Virtualize software stack.
Figure 10-1 EasyTier in the IBM Spectrum Virtualize software stack
In general, the storage environments’ I/O is monitored at a volume level, and the entire volume is always placed inside one appropriate storage tier. Determining the amount of I/O, moving part of the underlying volume to an appropriate storage tier, and reacting to workload changes are too complex for manual operation. This area is where the EasyTier feature can be used.
EasyTier is a performance optimization function because it automatically migrates (or moves) extents that belong to a volume between different storage tiers (Figure 10-2 on page 411) or rebalancing within the same storage tier (Figure 10-6 on page 415). Because this migration works at the extent level, it is often referred to as sub-logical unit number (LUN) migration. Movement of the extents is online and unnoticed from the host point of view. As a result of extent movement, the volume no longer has all its data in one tier, but rather in two or three tiers.
Figure 10-2 shows the basic EasyTier principle of operation.
Figure 10-2 EasyTier
You can enable EasyTier on a volume basis. It monitors the I/O activity and latency of the extents on all EasyTier enabled volumes over a 24-hour period. Based on the performance log, EasyTier creates an extent migration plan and dynamically moves (promotes) high activity or hot extents to a higher disk tier within the same storage pool.
It also moves (demotes) extents whose activity dropped off, or cooled, from a higher disk tier MDisk back to a lower tier MDisk. When EasyTier runs in a storage pool rebalance mode, it moves extents from busy MDisks to less busy MDisks of the same type.
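To check where a volume's extents currently reside, the detailed lsvdisk view reports the capacity that the volume occupies in each tier (the volume name here is illustrative; the same tier and tier_capacity fields appear in Example 10-2 later in this chapter):
lsvdisk volume01
The output contains a tier and tier_capacity pair for each of tier0_flash, tier1_flash, tier_enterprise, and tier_nearline, so you can observe extents moving between tiers over time.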
10.2.2 Flash drive arrays and flash MDisks
The Flash drives or flash arrays are treated no differently by the IBM SVC than normal HDDs regarding Redundant Array of Independent Disks (RAID) or MDisks. The individual Flash drives in the storage that is managed by the SVC are combined into an array, usually in RAID 10 or RAID 5 format. A LUN is created on the array, and then presented to the SVC as a normal MDisk.
As is the case for HDDs, the Flash drive RAID array format helps to protect against individual Flash drive failures. Depending on your requirements, you can achieve more high availability (HA) protection, beyond the RAID level, by using volume mirroring.
The internal storage configuration of flash arrays can differ depending on an array vendor. Regardless of the methods that are used to configure flash-based storage, the flash system maps a volume to a host, in this case to the SVC. From the SVC perspective, a volume that is presented from flash storage is also seen as a normal managed disk.
Starting with SVC 2145-DH8 nodes and IBM Spectrum Virtualize V7.3, up to two expansion drawers can be connected to one SVC I/O group. Each drawer can hold up to 24 drives, and only SSDs (flash drives) are supported. The SSDs are then gathered together to form RAID arrays in the same way that RAID arrays are formed in IBM Storwize systems.
After a Flash drive array is created, it appears as an MDisk with a tier of tier0_flash (or tier1_flash for a read-intensive Flash array), which differs from MDisks that are presented from external storage systems. Because IBM Spectrum Virtualize cannot determine the type of physical drives from which an external MDisk is formed, the default tier that the SVC assigns to each external MDisk is tier_enterprise. It is up to the user or administrator to change the tier of each external MDisk to tier0_flash, tier1_flash, tier_enterprise, or tier_nearline, as appropriate.
To change a tier of an MDisk in the CLI, use the chmdisk command, as shown in Example 10-1.
Example 10-1 Changing the MDisk tier
IBM_2145:ITSO_SVC2:superuser>lsmdisk -delim " "
id name status mode mdisk_grp_id mdisk_grp_name capacity ctrl_LUN_# controller_name UID tier encrypt
0 mdisk0 online unmanaged 100.0GB 0000000000000000 controller0 6005076400820008380000000000000000000000000000000000000000000000 tier_enterprise no
1 mdisk1 online unmanaged 100.0GB 0000000000000001 controller0 6005076400820008380000000000000100000000000000000000000000000000 tier_enterprise no
 
IBM_2145:ITSO_SVC2:superuser>chmdisk -tier nearline mdisk0
 
IBM_2145:ITSO_SVC2:superuser>lsmdisk -delim " "
id name status mode mdisk_grp_id mdisk_grp_name capacity ctrl_LUN_# controller_name UID tier encrypt
0 mdisk0 online unmanaged 100.0GB 0000000000000000 controller0 6005076400820008380000000000000000000000000000000000000000000000 tier_nearline no
1 mdisk1 online unmanaged 100.0GB 0000000000000001 controller0 6005076400820008380000000000000100000000000000000000000000000000 tier_enterprise no
It is also possible to change the MDisk tier from the graphical user interface (GUI) but this technique can only be used for external MDisks. To change the tier, complete these steps:
1. Click Pools → External Storage and click the expand sign (>) next to the controller that owns the MDisks for which you want to change the tier.
2. Right-click the target MDisk and select Modify Tier (Figure 10-3).
Figure 10-3 Change the MDisk tier
3. A new window opens with options to change the tier (Figure 10-4).
Figure 10-4 Selecting the MDisk tier
The tier change happens online and has no effect on host or volume availability.
4. If you do not see the Tier column, click the symbol at the end of the title row and select the Tier check box, as shown in Figure 10-5.
Figure 10-5 Customizing title row to show tier column
10.2.3 Disk tiers
The MDisks (LUNs) that are presented to the SVC cluster are likely to have different performance attributes because of the type of disk or RAID array on which they are located. The MDisks can be created on any of the following hardware:
15,000 RPM FC or SAS drives
10,000 RPM FC or SAS drives
Nearline SAS or SATA drives
Flash drives or flash storage systems such as the IBM FlashSystem 900
Read Intensive Flash drives
The SVC does not automatically detect the type of MDisks, except for MDisks that are formed of Flash drives from integrated expansion drawers. Instead, all external MDisks are initially put into the enterprise tier by default. Then, the administrator must manually change the tier of MDisks and add them to storage pools (see the command sketch after the following list). Depending on what type of disks are gathered to form a storage pool, the following types of storage pools are distinguished:
Single-tier
Multitier
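The following sketch (the MDisk names, pool name, and extent size are illustrative, not values from this environment) shows that manual workflow: set the correct tier on each external MDisk, create a pool, and add the MDisks to it:
chmdisk -tier tier0_flash mdisk2
mkmdiskgrp -name MultiTier_Pool -ext 1024
addmdisk -mdisk mdisk2:mdisk3 MultiTier_Pool
After the MDisks are added, EasyTier can start measuring and, for a multitier pool, migrating extents as described in the rest of this section.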
Single-tier storage pools
Figure 10-6 shows a scenario in which a single storage pool is populated with MDisks that are presented by an external storage controller. In this solution, the striped volumes can be measured by EasyTier, and can benefit from storage pool balancing mode, which moves extents between MDisks of the same type.
Figure 10-6 Single tier storage pool with striped volume
MDisks that are used in a single-tier storage pool should have the same hardware characteristics, for example, the same RAID type, RAID array size, disk type, disk RPM, and controller performance characteristics.
Multitier storage pools
A multitier storage pool has a mix of MDisks with more than one type of disk tier attribute, for example, a storage pool that contains a mix of enterprise and Flash drive MDisks, or of enterprise and NL-SAS MDisks.
Figure 10-7 shows a scenario in which a storage pool is populated with several different MDisk types:
One belonging to a Flash drive array
One belonging to SAS HDD array
One belonging to an NL-SAS HDD array
Although this example shows RAID 5 arrays, other RAID types can be used as well.
Figure 10-7 Multitier storage pool with striped volumes
Adding Flash drives to the pool also means that more space is now available for new volumes or volume expansion.
 
Note: Image mode and sequential volumes are not candidates for EasyTier automatic data placement because all extents for those types of volumes must be on one specific MDisk and cannot be moved.
The EasyTier setting can be changed on a storage pool and volume level. Depending on the EasyTier setting and the number of tiers in the storage pool, EasyTier services might function in a different way.
Table 10-1 shows possible combinations of EasyTier settings.
Table 10-1 EasyTier settings
Storage pool EasyTier setting | Number of tiers in the storage pool | Volume copy EasyTier setting | Volume copy EasyTier status
Off | One | Off | inactive (see note 2)
Off | One | On | inactive (see note 2)
Off | Two or three | Off | inactive (see note 2)
Off | Two or three | On | inactive (see note 2)
Measure | One | Off | measured (see note 3)
Measure | One | On | measured (see note 3)
Measure | Two or three | Off | measured (see note 3)
Measure | Two or three | On | measured (see note 3)
Auto | One | Off | measured (see note 3)
Auto | One | On | balanced (see note 4)
Auto | Two or three | Off | measured (see note 3)
Auto | Two or three | On | active (see note 5)
On | One | Off | measured (see note 3)
On | One | On | balanced (see note 4)
On | Two or three | Off | measured (see note 3)
On | Two or three | On | active (see note 5)
 
Table notes:
1. If the volume copy is in image or sequential mode or is being migrated, the volume copy EasyTier status is measured rather than active.
2. When the volume copy status is inactive, no EasyTier functions are enabled for that volume copy.
3. When the volume copy status is measured, the EasyTier function collects usage statistics for the volume, but automatic data placement is not active.
4. When the volume copy status is balanced, the EasyTier function enables performance-based pool balancing for that volume copy.
5. When the volume copy status is active, the EasyTier function operates in automatic data placement mode for that volume.
6. The default EasyTier setting for a storage pool is Auto, and the default EasyTier setting for a volume copy is On. Therefore, EasyTier functions, except pool performance balancing, are disabled for storage pools with a single tier. Automatic data placement mode is enabled by default for all striped volume copies in a storage pool with two or more tiers.
Table 10-2 shows the naming convention and all supported combinations of storage tiering that are used by EasyTier.
Table 10-2 EasyTier supported storage pools
Tier 0 | Tier 1 | Tier 2
Three Tier Pool:
tier0_flash | tier_enterprise | tier_nearline
tier0_flash | tier1_flash | tier_enterprise
tier0_flash | tier1_flash | tier_nearline
tier1_flash | tier_enterprise | tier_nearline
Two Tier Pool:
tier0_flash | tier1_flash | -
tier0_flash | tier_enterprise | -
tier0_flash | tier_nearline | -
- | tier1_flash | tier_enterprise
- | tier1_flash | tier_nearline
- | tier_enterprise | tier_nearline
Single Tier Pool:
tier0_flash | - | -
- | tier1_flash | -
- | tier_enterprise | -
- | - | tier_nearline
10.2.4 Read Intensive Flash drives and EasyTier
One of the reasons why flash technology is still relatively expensive compared to traditional HDDs is that the physical flash memory is over-provisioned to mitigate the write amplification issue (https://en.wikipedia.org/wiki/Write_amplification). Read-Intensive (RI) Flash drives are lower-cost Flash drives, with the cost reduction achieved by providing less of this redundant flash capacity.
Read Intensive Flash drive support for Spectrum Virtualize/Storwize systems was introduced with V7.7 and enhanced with V7.8, which introduced, among other things, EasyTier support for RI MDisks.
Even though EasyTier remains a three-tier storage architecture, V7.8 added a new user tier specifically for RI MDisks. From a user perspective, there are now four tiers:
T0 or tier0_flash that represents the enterprise flash technology
T1 or tier1_flash that represents the RI flash drive technology
T2 or tier_enterprise that represents the enterprise HDD technology
T3 or tier_nearline that represents the nearline HDD technology
These user tiers are mapped to EasyTier tiers depending on the pool configuration. Figure 10-8 shows the possible combinations for the pool configuration with four user tiers (the configurations containing the RI user tier is highlighted in orange).
Figure 10-8 EasyTier mapping policy
The table columns represent all possible pool configurations, and the rows report to which EasyTier tier each user tier is mapped. For example, consider a pool with all possible tiers configured, which corresponds to the T0+T1+T2+T3 configuration in the table. With this configuration, T1 and T2 are mapped to the same EasyTier tier (tier 2). Note that the tier1_flash user tier is only ever mapped to EasyTier tier 1 or tier 2.
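As a brief illustration (the MDisk name is a placeholder), an external MDisk that is built from RI Flash drives is assigned to the RI user tier with the same chmdisk command that is shown in Example 10-1:
chmdisk -tier tier1_flash mdisk3
EasyTier then treats that MDisk according to the tier1_flash mapping that is described above.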
For more information about planning and configuration considerations or best practices, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521, and Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072.
For more information about RI Flash drives, see Read Intensive Flash Drives, REDP-5380.
10.2.5 EasyTier process
The EasyTier function includes the following main processes:
I/O Monitoring
This process operates continuously, and monitors volumes for host I/O activity. It collects performance statistics for each extent, and derives averages for a rolling 24-hour period of I/O activity.
EasyTier makes allowances for large block I/Os. Therefore, it considers only I/Os of up to 64 kibibytes (KiB) as migration candidates.
This process is efficient and adds negligible processing resource requirements to the SVC nodes.
Data Placement Advisor
The Data Placement Advisor uses workload statistics to make a cost/benefit decision as to which extents are to be candidates for migration to a higher performance tier.
This process also identifies extents that must be migrated back to a lower tier.
Data Migration Planner (DMP)
By using the extents that were previously identified, the Data Migration Planner builds the extent migration plans for the storage pool. The Data Migration Planner builds two plans:
 – Automatic Data Relocation (ADR mode) plan to migrate extents across adjacent tiers
 – Rebalance (RB mode) plan to migrate extents within the same tier
Data Migrator
This process involves the actual movement or migration of the volume’s extents up to, or down from, the higher disk tier. The extent migration rate is capped so that a maximum of up to 30 megabytes per second (MBps) is migrated, which equates to approximately 3 terabytes (TB) per day that is migrated between disk tiers.
When enabled, EasyTier performs the following actions between the three tiers presented in Figure 10-9 on page 421:
Promote
Moves the relevant hot extents to a higher performing tier.
Swap
Exchanges a cold extent in an upper tier with a hot extent in a lower tier.
Warm demote
 – Prevents performance overload of a tier by demoting a warm extent to the lower tier.
 – Triggered when bandwidth or IOPS exceeds a predefined threshold.
Demote or cold demote
Coldest data is moved to a lower tier. Only supported between HDD tiers.
Expanded cold demote
Demotes appropriate sequential workloads to the lowest tier to better use nearline tier bandwidth.
Storage pool balancing
 – Redistributes extents within a tier to balance usage across MDisks for maximum performance.
 – Moves hot extents from high usage MDisks to low usage MDisks.
 – Exchanges extents between high usage MDisks and low usage MDisks.
EasyTier attempts to migrate the most active volume extents up to flash tier first.
When a new migration plan is created, the previous migration plan and any queued extents that are not yet relocated are abandoned.
 
Note: Extent promotion / demotion only occurs between adjacent tiers. In a three-tier storage pool, EasyTier does not move extents from a flash tier directly to nearline tier or vice versa without moving to the enterprise tier first.
EasyTier extent migration types are presented in Figure 10-9.
Figure 10-9 EasyTier extent migration types
10.2.6 EasyTier operating modes
EasyTier includes the following main operating modes:
Off
Evaluation or measurement only
Automatic data placement or extent migration
Storage pool balancing
EasyTier off mode
With EasyTier turned off, no statistics are recorded and no cross-tier extent migration occurs. Also, with EasyTier turned off, no storage pool balancing across MDisks in the same tier is performed, even in single tier pools.
Evaluation or measurement only mode
EasyTier evaluation or measurement only mode collects usage statistics for each extent in a single-tier storage pool where the EasyTier value is set to On for both the volume and the pool. This collection is typically done for a single-tier pool that contains only HDDs so that the benefits of adding Flash drives to the pool can be evaluated before any major hardware acquisition.
A dpa_heat.nodeid.yymmdd.hhmmss.data statistics summary file is created in the /dumps directory of the SVC nodes. This file can be offloaded from the SVC nodes with the PuTTY Secure Copy Client (PSCP) pscp -load command or by using the GUI, as described in IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521. A web browser is used to view the report that is created by the tool.
Automatic data placement or extent migration mode
In automatic data placement or extent migration operating mode, the storage pool parameter -easytier on or auto must be set, and the volumes in the pool must have -easytier on. The storage pool must also contain MDisks with different disk tiers, which makes it a multitiered storage pool.
Dynamic data movement is not apparent to the host server and application users of the data, other than providing improved performance. Extents are automatically migrated, as explained in “Implementation rules” on page 423. The statistic summary file is also created in this mode. This file can be offloaded for input to the advisor tool. The tool produces a report on the extents that are moved to a higher tier and a prediction of performance improvement that can be gained if more higher tier disks are available.
 
Options: The EasyTier function can be turned on or off at the storage pool level and at the volume level.
Storage pool balancing
Although storage pool balancing is associated with EasyTier, it operates independently of EasyTier and does not require an EasyTier license. This feature assesses the extents that are written in a pool, and balances them automatically across all MDisks within the pool. This process works along with EasyTier when multiple classes of disks exist in a single pool. In such cases, EasyTier moves extents between the different tiers, and storage pool balancing moves extents within the same tier, to better use MDisks.
The process automatically balances existing data when new MDisks are added into an existing pool, even if the pool contains only a single type of drive. However, the process does not migrate extents from existing MDisks to achieve an even extent distribution among all (old and new) MDisks in the storage pool. The EasyTier rebalance (RB) process within a tier bases its migration plan on performance, not on the capacity of the underlying MDisks.
 
Note: Storage pool balancing can be used to balance extents when mixing different size disks of the same performance tier. For example, when adding larger capacity drives to a pool with smaller capacity drives of the same class, storage pool balancing redistributes the extents to take advantage of the additional performance of the new MDisks.
10.2.7 Implementation considerations
EasyTier is a licensed feature, except for storage pool balancing, which is a no-charge feature that is enabled by default. EasyTier comes as part of the IBM Spectrum Virtualize code. For EasyTier to migrate extents between different tier disks, you must have disk storage available that offers different tiers, such as a mix of Flash drive and HDD. EasyTier performs storage pool balancing even if you have only a single-tier pool.
Implementation rules
Remember the following implementation and operational rules when you use the IBM System Storage EasyTier function on the SVC:
EasyTier automatic data placement is not supported on image mode or sequential volumes. I/O monitoring for such volumes is supported, but you cannot migrate extents on these volumes unless you convert image or sequential volume copies to striped volumes (see the command sketch after this list).
Automatic data placement and extent I/O activity monitors are supported on each copy of a mirrored volume. EasyTier works with each copy independently of the other copy.
 
Volume mirroring consideration: Volume mirroring can have different workload characteristics on each copy of the data because reads are normally directed to the primary copy and writes occur to both copies. Therefore, the number of extents that EasyTier migrates between the tiers might differ for each copy.
If possible, the SVC creates volumes or expands volumes by using extents from MDisks from the tier_enterprise tier. However, it uses extents from MDisks from the tier0_flash or tier1_flash tiers, if necessary.
When a volume is migrated out of a storage pool that is managed with EasyTier, EasyTier automatic data placement mode is no longer active on that volume. Automatic data placement is also turned off while a volume is being migrated, even when it is between pools that both have EasyTier automatic data placement enabled. Automatic data placement for the volume is reenabled when the migration is complete.
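As referenced above, the following is a hedged sketch of converting an image mode (or sequential) volume copy to a striped copy by using volume mirroring (the volume and pool names are illustrative, and the original copy is assumed to be copy 0):
addvdiskcopy -mdiskgrp Pool0_Site1 ImageVol01
lsvdisksyncprogress ImageVol01
rmvdiskcopy -copy 0 ImageVol01
Wait until lsvdisksyncprogress reports that the new striped copy is fully synchronized before removing the original copy. The volume is then a normal striped volume and becomes a candidate for EasyTier automatic data placement.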
Limitations
When you use EasyTier on the SVC, keep in mind the following limitations:
Removing an MDisk by using the -force parameter
When an MDisk is deleted from a storage pool with the -force parameter, extents in use are migrated to MDisks in the same tier as the MDisk that is being removed, if possible. If insufficient extents exist in that tier, extents from the other tier are used.
Migrating extents
When EasyTier automatic data placement is enabled for a volume, you cannot use the svctask migrateexts CLI command on that volume.
Migrating a volume to another storage pool
When the SVC migrates a volume to a new storage pool, EasyTier automatic data placement between the two tiers is temporarily suspended. After the volume is migrated to its new storage pool, EasyTier automatic data placement between the tier0_flash and the tier_enterprise resumes for the moved volume, if appropriate.
When the SVC migrates a volume from one storage pool to another, it attempts to migrate each extent to an extent in the new storage pool from the same tier as the original extent. In several cases, such as where a target tier is unavailable, another tier is used. For example, the tier0_flash tier might be unavailable in the new storage pool.
Migrating a volume to an image mode
EasyTier automatic data placement does not support image mode. When a volume with active EasyTier automatic data placement mode is migrated to image mode, EasyTier automatic data placement mode is no longer active on that volume.
Image mode and sequential volumes cannot be candidates for automatic data placement. However, EasyTier supports evaluation mode for image mode volumes.
10.2.8 Modifying the EasyTier setting
The EasyTier setting for storage pools and volumes can only be changed by using the command line. Use the chvdisk command to turn off or turn on EasyTier on selected volumes. Use the chmdiskgrp command to change the status of EasyTier on selected storage pools, as shown in Example 10-2.
Example 10-2 Changing the EasyTier setting
IBM_2145:ITSO SVC DH8:superuser>lsvdisk test
id 1
name test
IO_group_id 0
IO_group_name io_grp0
status online
mdisk_grp_id 0
mdisk_grp_name Pool0_Site1
capacity 1.00GB
type striped
formatted yes
formatting no
mdisk_id
mdisk_name
FC_id
FC_name
RC_id
RC_name
vdisk_UID 6005076801FF00840800000000000002
throttling 0
preferred_node_id 1
fast_write_state empty
cache readwrite
udid
fc_map_count 0
sync_rate 50
copy_count 1
se_copy_count 0
filesystem
mirror_write_priority latency
RC_change no
compressed_copy_count 0
access_IO_group_count 1
last_access_time
parent_mdisk_grp_id 0
parent_mdisk_grp_name Pool0_Site1
owner_type none
owner_id
owner_name
encrypt no
volume_id 1
volume_name test
function
 
copy_id 0
status online
sync yes
auto_delete no
primary yes
mdisk_grp_id 0
mdisk_grp_name Pool0_Site1
type striped
mdisk_id
mdisk_name
fast_write_state empty
used_capacity 1.00GB
real_capacity 1.00GB
free_capacity 0.00MB
overallocation 100
autoexpand
warning
grainsize
se_copy no
easy_tier off
easy_tier_status measured
tier tier0_flash
tier_capacity 0.00MB
tier tier1_flash
tier_capacity 0.00MB
tier tier_enterprise
tier_capacity 1.00GB
tier tier_nearline
tier_capacity 0.00MB
compressed_copy no
uncompressed_used_capacity 1.00GB
parent_mdisk_grp_id 0
parent_mdisk_grp_name Pool0_Site1
encrypt no
 
IBM_2145:ITSO SVC DH8:superuser>chvdisk -easytier on test
 
IBM_2145:ITSO SVC DH8:superuser>lsvdisk test
id 1
name test
IO_group_id 0
IO_group_name io_grp0
status online
mdisk_grp_id 0
mdisk_grp_name Pool0_Site1
capacity 1.00GB
type striped
formatted yes
formatting no
mdisk_id
mdisk_name
FC_id
FC_name
RC_id
RC_name
vdisk_UID 6005076801FF00840800000000000002
throttling 0
preferred_node_id 1
fast_write_state empty
cache readwrite
udid
fc_map_count 0
sync_rate 50
copy_count 1
se_copy_count 0
filesystem
mirror_write_priority latency
RC_change no
compressed_copy_count 0
access_IO_group_count 1
last_access_time
parent_mdisk_grp_id 0
parent_mdisk_grp_name Pool0_Site1
owner_type none
owner_id
owner_name
encrypt no
volume_id 1
volume_name test
function
 
copy_id 0
status online
sync yes
auto_delete no
primary yes
mdisk_grp_id 0
mdisk_grp_name Pool0_Site1
type striped
mdisk_id
mdisk_name
fast_write_state empty
used_capacity 1.00GB
real_capacity 1.00GB
free_capacity 0.00MB
overallocation 100
autoexpand
warning
grainsize
se_copy no
easy_tier on
easy_tier_status balanced
tier tier0_flash
tier_capacity 0.00MB
tier tier1_flash
tier_capacity 0.00MB
tier tier_enterprise
tier_capacity 1.00GB
tier tier_nearline
tier_capacity 0.00MB
compressed_copy no
uncompressed_used_capacity 1.00GB
parent_mdisk_grp_id 0
parent_mdisk_grp_name Pool0_Site1
encrypt no
 
IBM_2145:ITSO SVC DH8:superuser>lsmdiskgrp Pool0_Site1
id 0
name Pool0_Site1
status online
mdisk_count 4
vdisk_count 12
capacity 1.95TB
extent_size 1024
free_capacity 1.93TB
virtual_capacity 22.00GB
used_capacity 22.00GB
real_capacity 22.00GB
overallocation 1
warning 80
easy_tier auto
easy_tier_status balanced
tier tier0_flash
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier1_flash
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_enterprise
tier_mdisk_count 4
tier_capacity 1.95TB
tier_free_capacity 1.93TB
tier tier_nearline
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
compression_active no
compression_virtual_capacity 0.00MB
compression_compressed_capacity 0.00MB
compression_uncompressed_capacity 0.00MB
site_id 1
site_name ITSO_DC1
parent_mdisk_grp_id 0
parent_mdisk_grp_name Pool0_Site1
child_mdisk_grp_count 0
child_mdisk_grp_capacity 0.00MB
type parent
encrypt no
owner_type none
owner_id
owner_name
data_reduction no
used_capacity_before_reduction 0.00MB
used_capacity_after_reduction 0.00MB
deduplication_capacity_saving 0.00MB
reclaimable_capacity 0.00MB
 
IBM_2145:ITSO SVC DH8:superuser>chmdiskgrp -easytier off Pool0_Site1
 
IBM_2145:ITSO SVC DH8:superuser>lsmdiskgrp Pool0_Site1
id 0
name Pool0_Site1
status online
mdisk_count 4
vdisk_count 12
capacity 1.95TB
extent_size 1024
free_capacity 1.93TB
virtual_capacity 22.00GB
used_capacity 22.00GB
real_capacity 22.00GB
overallocation 1
warning 80
easy_tier off
easy_tier_status inactive
tier tier0_flash
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier1_flash
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_enterprise
tier_mdisk_count 4
tier_capacity 1.95TB
tier_free_capacity 1.93TB
tier tier_nearline
tier_mdisk_count 0
tier_capacity 0.00MB
tier_free_capacity 0.00MB
compression_active no
compression_virtual_capacity 0.00MB
compression_compressed_capacity 0.00MB
compression_uncompressed_capacity 0.00MB
site_id 1
site_name ITSO_DC1
parent_mdisk_grp_id 0
parent_mdisk_grp_name Pool0_Site1
child_mdisk_grp_count 0
child_mdisk_grp_capacity 0.00MB
type parent
encrypt no
owner_type none
owner_id
owner_name
data_reduction no
used_capacity_before_reduction 0.00MB
used_capacity_after_reduction 0.00MB
deduplication_capacity_saving 0.00MB
reclaimable_capacity 0.00MB
Tuning EasyTier
It is also possible to change more advanced parameters of EasyTier. Use these parameters with caution because changing the default values can affect system performance.
EasyTier acceleration
EasyTier acceleration is a system-wide setting that is disabled by default. Turning on this setting makes EasyTier move extents up to four times faster than the default setting. In accelerate mode, EasyTier can move up to 48 GiB per 5 minutes, whereas in normal mode it moves up to 12 GiB. Enabling EasyTier acceleration is advised only during periods of low system activity. The following are the two most probable use cases for acceleration:
When adding new capacity to the pool, accelerating EasyTier can quickly spread existing volumes onto the new MDisks.
Migrating the volumes between the storage pools when the target storage pool has more tiers than the source storage pool, so EasyTier can quickly promote or demote extents in the target pool.
This setting can be changed online, without any effect on host or data availability. To turn on or off EasyTier acceleration mode, use the chsystem command, as shown in Example 10-3.
Example 10-3 The chsystem command
IBM_2145:ITSO SVC DH8:superuser>lssystem
id 000002007FC02102
name ITSO SVC DH8
location local
partnership
total_mdisk_capacity 11.7TB
space_in_mdisk_grps 3.9TB
space_allocated_to_vdisks 522.00GB
total_free_space 11.2TB
total_vdiskcopy_capacity 522.00GB
total_used_capacity 522.00GB
total_overallocation 4
total_vdisk_capacity 522.00GB
total_allocated_extent_capacity 525.00GB
statistics_status on
statistics_frequency 15
cluster_locale en_US
time_zone 520 US/Pacific
code_level 8.1.0.0 (build 137.4.1709191910000)
console_IP 10.18.228.64:443
id_alias 000002007FC02102
gm_link_tolerance 300
gm_inter_cluster_delay_simulation 0
gm_intra_cluster_delay_simulation 0
gm_max_host_delay 5
email_reply [email protected]
email_contact no
email_contact_primary 1234567
email_contact_alternate
email_contact_location ff
email_contact2
email_contact2_primary
email_contact2_alternate
email_state stopped
inventory_mail_interval 7
cluster_ntp_IP_address
cluster_isns_IP_address
iscsi_auth_method none
iscsi_chap_secret 1010
auth_service_configured no
auth_service_enabled no
auth_service_url
auth_service_user_name
auth_service_pwd_set no
auth_service_cert_set no
auth_service_type tip
relationship_bandwidth_limit 25
tier tier0_flash
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier1_flash
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_enterprise
tier_capacity 3.90TB
tier_free_capacity 3.39TB
tier tier_nearline
tier_capacity 0.00MB
tier_free_capacity 0.00MB
easy_tier_acceleration off
has_nas_key no
layer replication
rc_buffer_size 48
compression_active no
compression_virtual_capacity 0.00MB
compression_compressed_capacity 0.00MB
compression_uncompressed_capacity 0.00MB
cache_prefetch on
email_organization ff
email_machine_address ff
email_machine_city ff
email_machine_state XX
email_machine_zip 12345
email_machine_country US
total_drive_raw_capacity 0
compression_destage_mode off
local_fc_port_mask 0000000000000000000000000000000000000000000000000000000000000111
partner_fc_port_mask 0000000000000000000000000000000000000000000000000000000000001000
high_temp_mode off
topology standard
topology_status
rc_auth_method chap
vdisk_protection_time 15
vdisk_protection_enabled no
product_name IBM SAN Volume Controller
odx off
max_replication_delay 0
partnership_exclusion_threshold 315
gen1_compatibility_mode_enabled
ibm_customer
ibm_component
ibm_country
tier0_flash_compressed_data_used 0.00MB
tier1_flash_compressed_data_used 0.00MB
tier_enterprise_compressed_data_used 0.00MB
tier_nearline_compressed_data_used 0.00MB
total_reclaimable_capacity 0.00MB
unmap on
used_capacity_before_reduction 0.00MB
used_capacity_after_reduction 0.00MB
deduplication_capacity_saving 0.00MB
 
IBM_2145:ITSO SVC DH8:superuser>chsystem -easytieracceleration on
 
IBM_2145:ITSO SVC DH8:superuser>lssystem
id 000002007FC02102
name ITSO SVC DH8
location local
partnership
total_mdisk_capacity 11.7TB
space_in_mdisk_grps 3.9TB
space_allocated_to_vdisks 522.00GB
total_free_space 11.2TB
total_vdiskcopy_capacity 522.00GB
total_used_capacity 522.00GB
total_overallocation 4
total_vdisk_capacity 522.00GB
total_allocated_extent_capacity 525.00GB
statistics_status on
statistics_frequency 15
cluster_locale en_US
time_zone 520 US/Pacific
code_level 8.1.0.0 (build 137.4.1709191910000)
console_IP 10.18.228.64:443
id_alias 000002007FC02102
gm_link_tolerance 300
gm_inter_cluster_delay_simulation 0
gm_intra_cluster_delay_simulation 0
gm_max_host_delay 5
email_reply [email protected]
email_contact no
email_contact_primary 1234567
email_contact_alternate
email_contact_location ff
email_contact2
email_contact2_primary
email_contact2_alternate
email_state stopped
inventory_mail_interval 7
cluster_ntp_IP_address
cluster_isns_IP_address
iscsi_auth_method none
iscsi_chap_secret 1010
auth_service_configured no
auth_service_enabled no
auth_service_url
auth_service_user_name
auth_service_pwd_set no
auth_service_cert_set no
auth_service_type tip
relationship_bandwidth_limit 25
tier tier0_flash
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier1_flash
tier_capacity 0.00MB
tier_free_capacity 0.00MB
tier tier_enterprise
tier_capacity 3.90TB
tier_free_capacity 3.39TB
tier tier_nearline
tier_capacity 0.00MB
tier_free_capacity 0.00MB
easy_tier_acceleration on
has_nas_key no
layer replication
rc_buffer_size 48
compression_active no
compression_virtual_capacity 0.00MB
compression_compressed_capacity 0.00MB
compression_uncompressed_capacity 0.00MB
cache_prefetch on
email_organization ff
email_machine_address ff
email_machine_city ff
email_machine_state XX
email_machine_zip 12345
email_machine_country US
total_drive_raw_capacity 0
compression_destage_mode off
local_fc_port_mask 0000000000000000000000000000000000000000000000000000000000000111
partner_fc_port_mask 0000000000000000000000000000000000000000000000000000000000001000
high_temp_mode off
topology standard
topology_status
rc_auth_method chap
vdisk_protection_time 15
vdisk_protection_enabled no
product_name IBM SAN Volume Controller
odx off
max_replication_delay 0
partnership_exclusion_threshold 315
gen1_compatibility_mode_enabled
ibm_customer
ibm_component
ibm_country
tier0_flash_compressed_data_used 0.00MB
tier1_flash_compressed_data_used 0.00MB
tier_enterprise_compressed_data_used 0.00MB
tier_nearline_compressed_data_used 0.00MB
total_reclaimable_capacity 0.00MB
unmap on
used_capacity_before_reduction 0.00MB
used_capacity_after_reduction 0.00MB
deduplication_capacity_saving 0.00MB
MDisk EasyTier load
The second setting is called MDisk EasyTier load. This setting is set on a per MDisk basis, and indicates how much load EasyTier can put on a particular MDisk. The following values can be set to each MDisk:
Default
Low
Medium
High
Very high
The system uses the default setting based on the discovered storage system from which the MDisk is presented. Change the default setting to any other value only when you are certain that a particular MDisk is underutilized and can handle more load, or that the MDisk is overutilized and the load should be lowered. Change this setting to very high only for SSDs and flash MDisks.
This setting can be changed online, without any effect on the hosts or data availability. To change this setting, use the chmdisk command, as shown in Example 10-4.
Example 10-4 The chmdisk command
IBM_2145:ITSO SVC DH8:superuser>lsmdisk mdisk0
id 0
name mdisk0
status online
mode managed
mdisk_grp_id 0
mdisk_grp_name Pool0_Site1
capacity 500.0GB
quorum_index
block_size 512
controller_name controller0
ctrl_type 4
ctrl_WWNN 50050768020000EF
controller_id 0
path_count 4
max_path_count 4
ctrl_LUN_# 0000000000000000
UID 60050768028a8002680000000000000000000000000000000000000000000000
preferred_WWPN 50050768021000F0
active_WWPN many
fast_write_state empty
raid_status
raid_level
redundancy
strip_size
spare_goal
spare_protection_min
balanced
tier enterprise
slow_write_priority
fabric_type fc
site_id 1
site_name ITSO_DC1
easy_tier_load high
encrypt no
distributed no
drive_class_id
drive_count 0
stripe_width 0
rebuild_areas_total
rebuild_areas_available
rebuild_areas_goal
dedupe no
preferred_iscsi_port_id
active_iscsi_port_id
replacement_date
 
IBM_2145:ITSO SVC DH8:superuser>chmdisk -easytierload medium mdisk0
 
IBM_2145:ITSO SVC DH8:superuser>lsmdisk mdisk0
id 0
name mdisk0
status online
mode managed
mdisk_grp_id 0
mdisk_grp_name Pool0_Site1
capacity 500.0GB
quorum_index
block_size 512
controller_name controller0
ctrl_type 4
ctrl_WWNN 50050768020000EF
controller_id 0
path_count 4
max_path_count 4
ctrl_LUN_# 0000000000000000
UID 60050768028a8002680000000000000000000000000000000000000000000000
preferred_WWPN 50050768021000F0
active_WWPN many
fast_write_state empty
raid_status
raid_level
redundancy
strip_size
spare_goal
spare_protection_min
balanced
tier nearline
slow_write_priority
fabric_type fc
site_id 1
site_name ITSO_DC1
easy_tier_load medium
encrypt no
distributed no
drive_class_id
drive_count 0
stripe_width 0
rebuild_areas_total
rebuild_areas_available
rebuild_areas_goal
dedupe no
preferred_iscsi_port_id
active_iscsi_port_id
replacement_date
10.2.9 Monitoring tools
The IBM Storage Tier Advisor Tool (STAT) is a Microsoft Windows application that analyzes heat data files produced by EasyTier. STAT creates a graphical display of the amount of “hot” data per volume, and predicts, by storage pool, how adding more Flash drive, enterprise drive, or nearline drive capacity might improve system performance.
IBM STAT can be downloaded from:
Heat data files are produced approximately once a day (that is, every 24 hours) when EasyTier is active on one or more storage pools. They summarize the activity per volume since the prior heat data file was produced. On the SVC, the heat data file is in the /dumps/easytier directory on the configuration node and is named dpa_heat.<node_name>.<time_stamp>.data. Existing heat data files are erased after seven days.
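As an alternative to the GUI procedure that follows, the heat data file can be offloaded with pscp. For illustration only (the cluster IP address and the local target directory are placeholders), such an offload might look like this:
pscp -unsafe superuser@<cluster_ip>:/dumps/easytier/dpa_heat.*.data C:\STAT\input_files\
The -unsafe option allows the server-side wildcard; alternatively, a saved PuTTY session can be referenced with the -load option, as mentioned in 10.2.6, “EasyTier operating modes”.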
To download a heat data file, open Settings → Support → Support Package and click the twistie to open the Manual Upload Instructions, then click the Download Support Package button, as shown in Figure 10-10.
Figure 10-10 Download EasyTier heat file: Download Support Package
From the Download New Support Package or Log File window, click Download Existing Package, as shown in Figure 10-11.
Figure 10-11 Downloading EasyTier heat data file: Download Existing Package
You can filter for heat files and select the most recent heat data file from the window shown in Figure 10-12. Click Download and save the file wherever you want. In our example, we save the file on our workstation to the STAT\input_files directory.
Figure 10-12 Downloading EasyTier heat data file: Filter, select, and download
STAT must be started from a Windows command prompt with the file specified as a parameter, as shown in Example 10-5.
Example 10-5 Running STAT in Windows command prompt
C:\Program Files (x86)\IBM\STAT>stat input_files\dpa_heat.KD8P1BP.171018.095715.data
You can also specify the output directory if you want. STAT creates a set of Hypertext Markup Language (HTML) files, and the user can then open the resulting index.html file in the STAT output directory in a browser to view the results. Additionally, three comma-separated values (CSV) files are created and placed in the Data_files directory.
Figure 10-13 shows the csv files highlighted in the Data_files directory after running the STAT tool on the SVC heatmap.
Figure 10-13 CSV files created by STAT for EasyTier
In addition to the STAT tool, another utility is available: a Microsoft Excel file that creates additional graphical reports of the work that EasyTier performs. The IBM STAT Charting Utility takes the three CSV output files described previously and turns them into graphs for simple reporting.
The STAT Charting Utility can be downloaded from the IBM Support website:
The graphs display the following information:
Workload Categorization report
Workload visuals help you compare activity across tiers within and across pools to help determine the optimal drive mix for the current workloads. The output is illustrated in Figure 10-14.
Figure 10-14 STAT Charting Utility Workload Categorization report
Daily Movement report
A summary report of EasyTier data migration activity over the previous 24 hours, shown in 5-minute intervals, helps visualize migration types and patterns for current workloads. The output is illustrated in Figure 10-15.
Figure 10-15 STAT Charting Utility Daily Summary report
Workload Skew report
This report shows the skew of all workloads across the system in a graph to help you visualize and accurately tier configurations when you add capacity or a new system. The output is illustrated in Figure 10-16.
Figure 10-16 STAT Charting Utility Workload Skew report
10.2.10 More information
For more information about planning and configuration considerations, best practices, and monitoring and measurement tools, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521, and Implementing IBM Easy Tier with IBM Real-time Compression, TIPS1072.
10.3 Thin provisioning
In a shared storage environment, thin provisioning is a method for optimizing the usage of available storage. It relies on the allocation of blocks of data on demand versus the traditional method of allocating all of the blocks up front. This method eliminates almost all white space, which helps avoid the poor usage rates (often as low as 10%) that occur in the traditional storage allocation method where large pools of storage capacity are allocated to individual servers but remain unused (not written to).
Thin provisioning presents more storage space to the hosts or servers that are connected to the storage system than is available on the storage system. The IBM SVC has this capability for FC and Internet Small Computer System Interface (iSCSI) provisioned volumes.
An example of thin provisioning is when a storage system contains 5000 GiB of usable storage capacity, but the storage administrator mapped volumes of 500 GiB each to 15 hosts. In this example, the storage administrator makes 7500 GiB of storage space visible to the hosts, even though the storage system has only 5000 GiB of usable space, as shown in Figure 10-17.
In this case, all 15 hosts cannot immediately use all 500 GiB that are provisioned to them. The storage administrator must monitor the system and add storage, as needed.
Figure 10-17 Concept of thin provisioning
You can imagine thin provisioning as the same process that airlines use when they sell more tickets for a flight than there are physical seats, assuming that some passengers do not appear at check-in. They do not assign actual seats at the time of sale, which avoids each client having a claim on a specific seat number. The same concept applies to thin provisioning (the airline), the SVC (the plane), and its volumes (the seats). The storage administrator (the airline ticketing system) must closely monitor the allocation process and set proper thresholds.
10.3.1 Configuring a thin-provisioned volume
Volumes can be configured as thin-provisioned or fully allocated. Thin-provisioned volumes are created with real and virtual capacities. You can still create volumes by using a striped, sequential, or image mode virtualization policy, as you can with any other volume.
Real capacity defines how much disk space is allocated to a volume. Virtual capacity is the capacity of the volume that is reported to other SVC components (such as FlashCopy or remote copy) and to the hosts. For example, you can create a volume with a real capacity of only 100 GiB but a virtual capacity of 1 tebibyte (TiB). The actual space that is used by the volume on the SVC is 100 GiB, but hosts see a 1 TiB volume.
A directory maps the virtual address space to the real address space. The directory and the user data share the real capacity.
Thin-provisioned volumes are available in two operating modes: Autoexpand and non-autoexpand. You can switch the mode at any time. If you select the autoexpand feature, the SVC automatically adds a fixed amount of more real capacity to the thin volume as required. Therefore, the autoexpand feature attempts to maintain a fixed amount of unused real capacity for the volume. This amount is known as the contingency capacity.
The contingency capacity is initially set to the real capacity that is assigned when the volume is created. If the user modifies the real capacity, the contingency capacity is reset to be the difference between the used capacity and real capacity.
A volume that is created without the autoexpand feature, and therefore has a zero contingency capacity, goes offline when the real capacity is used up (reached) and the volume must expand.
 
Warning threshold: Enable the warning threshold by using email, such as Simple Mail Transfer Protocol (SMTP), or a Simple Network Management Protocol (SNMP) trap, when you work with thin-provisioned volumes. You can enable the warning threshold on the volume, and on the storage pool side, especially when you do not use the autoexpand mode. Otherwise, the thin volume goes offline if it runs out of space.
Autoexpand mode does not cause real capacity to grow much beyond the virtual capacity. The real capacity can be manually expanded to more than the maximum that is required by the current virtual capacity, and the contingency capacity is recalculated.
A thin-provisioned volume can be converted non-disruptively to a fully allocated volume, or vice versa, by using the volume mirroring function. For example, you can add a thin-provisioned copy to a fully allocated primary volume and then remove the fully allocated copy from the volume after they are synchronized.
The fully allocated to thin-provisioned migration procedure uses a zero-detection algorithm so that grains that contain all zeros do not cause any real capacity to be used. Usually, if the SVC is to detect zeros on the volume, you must use software on the host side to write zeros to all unused space on the disk or file system.
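A hedged sketch of this conversion follows (the volume and pool names are illustrative, and the fully allocated copy is assumed to be copy 0):
addvdiskcopy -mdiskgrp Pool0_Site1 -rsize 2% -autoexpand FullVol01
lsvdisksyncprogress FullVol01
rmvdiskcopy -copy 0 FullVol01
The -rsize and -autoexpand parameters make the new copy thin-provisioned. Wait until lsvdisksyncprogress reports that the copies are synchronized before removing the fully allocated copy.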
 
Tip: Consider the use of thin-provisioned volumes as targets in FlashCopy mappings.
Space allocation
When a thin-provisioned volume is created, a small amount of the real capacity is used for initial metadata. Write I/Os to the grains of the thin volume (that were not previously written to) cause grains of the real capacity to be used to store metadata and user data. Write I/Os to the grains (that were previously written to) update the grain where data was previously written.
The grain is defined when the volume is created, and can be 32 KiB, 64 KiB, 128 KiB, or 256 KiB.
Smaller granularities can save more space, but they have larger directories. When you use thin-provisioning with FlashCopy, specify the same grain size for the thin-provisioned volume and FlashCopy.
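The following sketch (the pool, I/O group, and volume names are illustrative) creates a thin-provisioned volume along the lines described above: 1 TiB of virtual capacity, 10% (about 100 GiB) of initial real capacity, autoexpand enabled, a warning threshold at 80%, and a 256 KiB grain size:
mkvdisk -mdiskgrp Pool0_Site1 -iogrp 0 -name thin_vol01 -size 1 -unit tb -rsize 10% -autoexpand -warning 80% -grainsize 256
A fully allocated volume is created by omitting the -rsize, -autoexpand, -warning, and -grainsize parameters.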
For information about creating a thin-provisioned volume, see Chapter 7, “Volumes” on page 251.
10.3.2 Performance considerations
Thin-provisioned volumes save capacity only if the host server does not write to the whole volume. Whether a thin-provisioned volume works well depends partly on how the file system allocates space.
Some file systems, such as New Technology File System (NTFS), write to the whole volume before overwriting deleted files. Other file systems reuse space in preference to allocating new space. File system problems can be moderated by tools, such as defrag, or by managing storage by using host Logical Volume Managers (LVMs).
The thin-provisioned volume also depends on how applications use the file system. For example, some applications delete log files only when the file system is nearly full.
 
Important: Do not use defragmentation applications on thin-provisioned volumes. The defragmentation process can write data to different areas of a volume, which can cause a thin-provisioned volume to grow up to its virtual size.
There is no performance recommendation for thin-provisioned volumes. As explained previously, the performance of thin-provisioned volumes depends on what is used in the particular environment. For the best performance, use fully allocated volumes rather than thin-provisioned volumes.
Starting with V7.3, the cache subsystem architecture was redesigned. Now, thin-provisioned volumes can benefit from lower cache functions (such as coalescing writes or prefetching), which greatly improve performance.
10.3.3 Limitations of virtual capacity
A few factors (extent and grain size) limit the virtual capacity of thin-provisioned volumes beyond the factors that limit the capacity of regular volumes. Table 10-3 shows the maximum thin-provisioned volume virtual capacities for an extent size.
Table 10-3 Maximum thin-provisioned volume virtual capacities for an extent size
Extent size (in MiB) | Maximum volume real capacity (in GiB) | Maximum thin-provisioned volume virtual capacity (in GiB)
16 | 2,048 | 2,000
32 | 4,096 | 4,000
64 | 8,192 | 8,000
128 | 16,384 | 16,000
256 | 32,768 | 32,000
512 | 65,536 | 65,000
1,024 | 131,072 | 130,000
2,048 | 262,144 | 260,000
4,096 | 262,144 | 262,144
8,192 | 262,144 | 262,144
Table 10-4 shows the maximum thin-provisioned volume virtual capacities for a grain size.
Table 10-4 Maximum thin-provisioned volume virtual capacities for a grain size
Grain size (KiB)   Maximum thin-provisioned volume virtual capacity (GiB)
32                 260,000
64                 520,000
128                1,040,000
256                2,080,000
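As Table 10-4 shows, the maximum virtual capacity scales linearly with the grain size: each doubling of the grain size (for example, from 32 KiB to 64 KiB) doubles the maximum virtual capacity (from 260,000 GiB to 520,000 GiB).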
For more information and detailed performance considerations for configuring thin provisioning, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521. You can also go to IBM Knowledge Center:
10.4 Unmap
There is an industry trend toward host operating systems having more control over the storage that they use. VMware's VAAI/VASA/VVols and Microsoft's ODX are examples of such technologies. These technologies allow host operating systems to manage data on the controller (for example, to provision storage or move data around) without a storage administrator needing to do anything on the storage controller. This change also means that a user of the host operating system does not need to know anything about the underlying technologies.
10.4.1 SCSI unmap command
Unmap is a set of SCSI primitives that allow hosts to indicate to a SCSI target that space allocated to a range of blocks on a target storage volume is no longer required. This command allows the storage controller to take measures and optimize the system so that the space can be reused for other purposes. The most common use case, for example, is a host application such as VMware freeing storage within a file system. The storage controller can then optimize the space, such as reorganizing the data on the volume so that space is better used.
When a host allocates storage, the data is placed in a volume. Traditionally, freeing the allocated space back to the storage pools requires human intervention on the storage controller. The SCSI unmap feature allows host operating systems to unprovision storage on the storage controller, which means that the resources can be freed up automatically in the storage pools and used for other purposes.
A SCSI unmappable volume is a volume for which storage unprovisioning and space reclamation can be triggered by the host operating system. With the release of V8.1 code, the SCSI unmap command is passed through to back-end storage controllers that support the function.
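For example, on a Linux host that is attached to an unmap-capable volume, space that is freed in the file system can be returned to the storage pool with standard operating system tools (the device and mount point names are hypothetical):
# Reclaim free space on demand
fstrim -v /mnt/data
# Alternatively, mount the file system with the discard option so that unmap is issued as files are deleted
mount -o discard /dev/mapper/vol01 /mnt/data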
 
Note: Some host types respond to this by issuing WRITE SAME with the UNMAP bit set, which generates large amounts of I/O. Offload throttling must be enabled before upgrading to V8.1 to prevent this extra workload from overloading MDisks. For more information, see:
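As a sketch only (verify the exact throttle syntax for your code level), an offload throttle that limits the bandwidth available to offloaded commands can be created from the CLI:
# Limit offloaded I/O (for example, WRITE SAME and XCOPY) to 100 MBps
svctask mkthrottle -type offload -bandwidth 100
# Display the configured throttles
svcinfo lsthrottle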
10.5 Real-time Compression
The IBM Real-time Compression Software that is embedded in IBM Spectrum Virtualize addresses the requirements of primary storage data reduction, including performance. It does so by using a purpose-built technology called Real-time Compression that uses the Random Access Compression Engine (RACE). It offers the following benefits:
Compression for active primary data
IBM Real-time Compression can be used with active primary data. Therefore, it supports workloads that are not candidates for compression in other solutions. The solution supports compression of existing data online. Storage administrators can regain free disk space in an existing storage system without requiring administrators and users to clean up or archive data.
This configuration significantly enhances the value of existing storage assets, and the benefits to the business are immediate. The capital expense of upgrading or expanding the storage system is delayed.
Compression for replicated or mirrored data
Remote volume copies can be compressed in addition to the volumes at the primary storage tier. This process reduces storage requirements in Metro Mirror and Global Mirror destination volumes, as well.
No changes to the existing environment required
IBM Real-time Compression is part of the storage system. It was designed with the goal of transparency so that it can be implemented without changes to applications, hosts, networks, fabrics, or external storage systems. The solution is not apparent to hosts, so users and applications continue to work as-is. Compression occurs within the SVC system.
Overall savings in operational expenses
More data is stored in the same rack space, so fewer storage expansion enclosures are required to store a data set. This reduced rack space has the following benefits:
 – Reduced power and cooling requirements. More data is stored in a system, which requires less power and cooling per gigabyte of used capacity.
 – Reduced software licensing for additional functions in the system. More data stored per enclosure reduces the overall spending on licensing.
 
Tip: Implementing compression in IBM Spectrum Virtualize provides the same benefits to internal Flash drives and externally virtualized storage systems.
Disk space savings are immediate
The space reduction occurs when the host writes the data. This process is unlike other compression solutions, in which some or all of the reduction is realized only after a post-process compression batch job is run.
 
Demonstration: The IBM Client Demonstration Center shows how easy it is to reduce your data footprint in the demo Data Reduction: reduce easily data footprint on your existing “IBM Spectrum Virtualize based” storage with IBM Real-Time Compression, which is available at:
10.5.1 Common use cases
This section addresses the most common use cases for implementing compression:
General-purpose volumes
Databases
Virtualized infrastructures
Log server data stores
General-purpose volumes
Most general-purpose volumes are used for highly compressible data types, such as home directories, CAD/CAM, oil and gas geo-seismic data, and log data. Storing such types of data in compressed volumes provides immediate capacity reduction to the overall used space. More space can be provided to users without any change to the environment.
Many file types can be stored on general-purpose servers. For practical guidance, the estimated compression ratios given here are based on actual field experience. Expected compression ratios are 50% - 60%.
File systems that contain audio, video files, and compressed files are not good candidates for compression. The overall capacity savings on these file types are minimal.
Databases
Database information is stored in table space files. High compression ratios are common in database volumes. Examples of databases that can greatly benefit from RtC are IBM DB2, Oracle, and Microsoft SQL Server. Expected compression ratios are 50% - 80%.
 
Important: Certain databases offer optional built-in compression. Do not compress already compressed database files.
Virtualized infrastructures
The proliferation of open systems virtualization in the market has increased the use of storage space, with more virtual server images and backups kept online. The use of compression reduces the storage requirements at the source.
Examples of virtualization solutions that can greatly benefit from RtC are VMware, Microsoft Hyper-V, and kernel-based virtual machine (KVM). Expected compression ratios are 45% - 75%.
 
Tip: Virtual machines (VMs) with file systems that contain compressed files are not good compression candidates, as described in “Databases”.
Log server data stores
Logs are a critical part for any information technology (IT) department in any organization. Log aggregates or syslog servers are a central point for the administrators, and immediate access and a smooth work process are necessary. Log server data stores are good candidates for Real-time Compression. Expected compression ratios are up to 90%.
10.5.2 Real-time Compression concepts
The RACE technology is based on over 50 patents that are not primarily about compression. Instead, they define how to make industry-standard Lempel-Ziv (LZ) compression of primary storage operate in real time and allow random access. The primary intellectual property behind this technology is the RACE component.
At a high level, the IBM RACE component compresses data that is written into the storage system dynamically. This compression occurs transparently, so Fibre Channel and iSCSI connected hosts are not aware of the compression. RACE is an inline compression technology, which means that each host write is compressed as it passes through IBM Spectrum Virtualize to the disks. This technology has a clear benefit over other compression technologies that are post-processing based.
These technologies do not provide immediate capacity savings. Therefore, they are not a good fit for primary storage workloads, such as databases and active data set applications.
RACE is based on the Lempel-Ziv lossless data compression algorithm and operates using a real-time method. When a host sends a write request, the request is acknowledged by the write cache of the system, and then staged to the storage pool. As part of its staging, the write request passes through the compression engine and is then stored in compressed format onto the storage pool. Therefore, writes are acknowledged immediately after they are received by the write cache with compression occurring as part of the staging to internal or external physical storage.
Capacity is saved when the data is written by the host because the host writes are smaller when they are written to the storage pool. IBM RtC is a self-tuning solution, similar to the SVC system itself: it adapts to the workload that runs on the system at any particular moment.
10.5.3 Random Access Compression Engine
To understand why RACE is unique, you need to review the traditional compression techniques. This description is not about the compression algorithm itself (how the data structure is reduced in size mathematically). Rather, the description is about how the data is laid out within the resulting compressed output.
Compression utilities
Compression is probably most known to users because of the widespread use of compression utilities. At a high level, these utilities take a file as their input, and parse the data by using a sliding window technique. Repetitions of data are detected within the sliding window history, most often 32 KiB. Repetitions outside of the window cannot be referenced. Therefore, the file cannot be reduced in size unless data is repeated when the window “slides” to the next 32 KiB slot.
Figure 10-18 shows compression that uses a sliding window, where the first two repetitions of the string ABCD fall within the same compression window, and can therefore be compressed by using the same dictionary. The third repetition of the string falls outside of this window, and therefore cannot be compressed by using the same compression dictionary as the first two repetitions, reducing the overall achieved compression ratio.
Figure 10-18 Compression that uses a sliding window
Traditional data compression in storage systems
The traditional approach that is taken to implement data compression in storage systems is an extension of how compression works in the compression utilities previously mentioned. Similar to compression utilities, the incoming data is broken into fixed-size chunks, and then each chunk is compressed and decompressed independently.
However, drawbacks exist to this approach. An update to a chunk requires a read of the chunk followed by a recompression of the chunk to include the update. The larger the chunk size chosen, the heavier the I/O penalty to recompress the chunk. If a small chunk size is chosen, the compression ratio is reduced because the repetition detection potential is reduced.
Figure 10-19 shows an example of how the data is broken into fixed-size chunks (in the upper-left corner of the figure). It also shows how each chunk gets compressed independently into variable length compressed chunks (in the upper-right side of the figure). The resulting compressed chunks are stored sequentially in the compressed output.
Although this approach is an evolution from compression utilities, it is limited to low-performance use cases mainly because this approach does not provide real random access to the data.
Figure 10-19 Traditional data compression in storage systems
Random Access Compression Engine
The IBM patented Random Access Compression Engine implements an inverted approach when compared to traditional approaches to compression. RACE uses variable-size chunks for the input, and produces fixed-size chunks for the output.
This method enables an efficient and consistent method to index the compressed data because the data is stored in fixed-size containers (Figure 10-20).
Figure 10-20 Random Access Compression
Location-based compression
Both compression utilities and traditional storage systems compression compress data by finding repetitions of bytes within the chunk that is being compressed. The compression ratio of the chunk depends on how many repetitions can be detected within it. The number of repetitions is affected by how much the bytes that are stored in the chunk are related to each other. Furthermore, the relationship between bytes is driven by the format of the object. For example, an office document might contain textual information and an embedded drawing.
Because the chunking of the file is arbitrary, it has no notion of how the data is laid out within the document. Therefore, a compressed chunk can be a mixture of the textual information and part of the drawing. This process yields a lower compression ratio because the different data types mixed together cause a suboptimal dictionary of repetitions. That is, fewer repetitions can be detected because a repetition of bytes in a text object is unlikely to be found in a drawing.
This traditional approach to data compression is also called location-based compression. The data repetition detection is based on the location of data within the same chunk.
This challenge is addressed by the predecide mechanism that was introduced in V7.1.
Predecide mechanism
Certain data chunks have a higher compression ratio than others. Compressing some of the chunks saves little space but still requires resources, such as processor (CPU) and memory. To avoid spending resources on uncompressible data and to provide the ability to use a different, more effective (in this particular case) compression algorithm, IBM invented a predecide mechanism that was introduced in V7.1.
Chunks that fall below a certain compression ratio are skipped by the compression engine, which saves CPU time and memory processing. Such chunks are not compressed with the main compression algorithm, but they can still be compressed with another algorithm. They are marked and processed accordingly. The results can vary because predecide does not check the entire block, but only a sample of it.
Figure 10-21 shows how the detection mechanism works.
Figure 10-21 Detection mechanism
Temporal compression
RACE offers a technology leap beyond location-based compression, called temporal compression. When host writes arrive at RACE, they are compressed and fill fixed-size chunks that are also called compressed blocks. Multiple compressed writes can be aggregated into a single compressed block. A dictionary of the detected repetitions is stored within the compressed block.
When applications write new data or update existing data, the data is typically sent from the host to the storage system as a series of writes. Because these writes are likely to originate from the same application and be from the same data type, more repetitions are usually detected by the compression algorithm. This type of data compression is called temporal compression because the data repetition detection is based on the time that the data was written into the same compressed block.
Temporal compression adds the time dimension that is not available to other compression algorithms. It offers a higher compression ratio because the compressed data in a block represents a more homogeneous set of input data.
Figure 10-22 shows how three writes that are sent one after the other by a host end up in different chunks. They are compressed into different chunks because their locations in the volume are not adjacent. This approach yields a lower compression ratio because the same data must be compressed by using three separate dictionaries.
Figure 10-22 Location-based compression
When the same three writes are sent through RACE, as shown on Figure 10-23, the writes are compressed together by using a single dictionary. This approach yields a higher compression ratio than location-based compression.
Figure 10-23 Temporal compression
10.5.4 Dual RACE instances
In V7.4, the compression code was enhanced by the addition of a second RACE instance per SVC node. This feature takes advantage of multi-core processor architecture, and uses the compression accelerator cards more effectively. The second RACE instance works in parallel with the first instance, as shown in Figure 10-24.
Figure 10-24 Dual RACE architecture
With dual RACE enhancement, the compression performance can be boosted up to two times for compressed workloads when compared to previous versions.
To take advantage of dual RACE, several software and hardware requirements must be met:
The software must be at or above V7.4.
Only 2145-DH8 and 2145-SV1 nodes are supported.
A second eight-core processor must be installed per SVC 2145-DH8 node.
An additional 32 gigabytes (GB) must be installed per SVC 2145-DH8 node.
At least one compression accelerator card must be installed per SVC 2145-DH8 node. The second acceleration card is not required.
For 2145-SV1 nodes, the optional cache upgrade feature expands the total system memory to 256 GB. Compression workloads can also benefit from the hardware-assisted acceleration offered by the addition of up to two compression accelerator cards.
 
Tip: Use two compression accelerator cards for the best performance.
When using the dual RACE feature, the acceleration cards are shared between RACE instances, which means that the acceleration cards are used simultaneously by both RACE instances. The rest of the resources, such as CPU cores and random access memory (RAM), are evenly divided between the RACE components.
You do not need to manually enable dual RACE. Dual RACE runs automatically when all minimal software and hardware requirements are met. If the SVC is compression capable but the minimal requirements for dual RACE are not met, only one RACE instance is used (as with earlier versions of code).
10.5.5 Random Access Compression Engine in IBM Spectrum Virtualize stack
It is important to understand where the RACE technology is implemented in the IBM Spectrum Virtualize software stack. This location determines how it applies to other SVC components.
RACE technology is implemented into the IBM Spectrum Virtualize thin provisioning layer, and is an organic part of the stack. The IBM Spectrum Virtualize software stack is shown in Figure 10-25. Compression is transparently integrated with existing system management design. All of the IBM Spectrum Virtualize advanced features are supported on compressed volumes. You can create, delete, migrate, map (assign), and unmap (unassign) a compressed volume as though it were a fully allocated volume.
In addition, you can use Real-time Compression with EasyTier on the same volumes. This compression method provides nondisruptive conversion between compressed and decompressed volumes. This conversion provides a uniform user experience, and eliminates the need for special procedures when dealing with compressed volumes.
Figure 10-25 RACE integration within IBM Spectrum Virtualize stack
10.5.6 Data write flow
When a host sends a write request to the SVC, it reaches the upper-cache layer, and the host is immediately sent an acknowledgment of its I/Os. When the upper cache layer destages, the I/Os are passed through the thin-provisioning layer to RACE, where the host write or writes are compressed. The metadata that holds the index of the compressed volume is updated, if needed, and compressed as well.
10.5.7 Data read flow
When a host sends a read request to the SVC for compressed data, the request is forwarded directly to the Real-time Compression component:
If the Real-time Compression component contains the requested data, the SVC cache replies to the host with the requested data without having to read the data from the lower-level cache or disk.
If the Real-time Compression component does not contain the requested data, the request is forwarded to the SVC lower-level cache.
If the lower-level cache contains the requested data, it is sent up the stack and returned to the host without accessing the storage.
If the lower-level cache does not contain the requested data, it sends a read request to the storage for the requested data.
10.5.8 Compression of existing data
In addition to compressing data in real time, you can also compress existing data sets (convert volume to compressed). To do so, you must change the capacity savings settings of the volume by completing these steps:
1. Right-click a volume and select Modify Capacity Settings, as shown in Figure 10-26.
Figure 10-26 Modifying Capacity Settings
2. In the menu, select Compression as the Capacity Savings option, as shown in Figure 10-27.
Figure 10-27 Selecting Capacity Setting
Selecting Compression adds a compressed copy of the volume, which is synchronized in the background. After the copies are fully synchronized, the original volume copy is deleted automatically.
As a result, you have compressed data on the existing volume. This process is nondisruptive, so the data remains online and accessible by applications and users.
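The same conversion can be driven from the CLI by adding a compressed volume copy and removing the original copy after synchronization (a sketch with hypothetical pool and volume names):
# Add a compressed copy of the volume
svctask addvdiskcopy -mdiskgrp Pool0 -rsize 2% -autoexpand -compressed vol01
# Monitor synchronization of the new copy
svcinfo lsvdisksyncprogress vol01
# When the copies are synchronized, remove the original (uncompressed) copy
svctask rmvdiskcopy -copy 0 vol01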
This capability enables clients to regain space from the storage pool, which can then be reused for other applications.
With the virtualization of external storage systems, the ability to compress already stored data significantly enhances and accelerates the benefit to users, enabling them to see a tremendous return on their SVC investment. On the initial purchase of an SVC with Real-time Compression, clients can defer the purchase of new storage. When new storage does need to be acquired, IT can purchase less capacity than would have been required without compression.
10.5.9 Data reduction: Pattern removal, data deduplication, and compression
Deduplication can be implemented on the SVC by attaching an IBM FlashSystem A9000 as external storage. Although the SVC does not currently provide deduplication natively, creating a storage pool with managed disks from the IBM FlashSystem A9000 easily provides this capability. This section describes data reduction on the IBM FlashSystem A9000 and how these managed disks are used on the SVC.
IBM FlashSystem A9000 and IBM FlashSystem A9000R
IBM FlashSystem A9000 and IBM FlashSystem A9000R use industry-leading data reduction technology that combines inline, real-time pattern matching and removal, data deduplication, and compression. Compression also uses hardware cards inside each grid controller. Compression can easily provide a 2:1 data reduction saving rate on its own, effectively doubling the system storage capacity. Combined with pattern removal and data deduplication services, IBM FlashSystem A9000 and IBM FlashSystem A9000R can easily yield an effective data capacity of five times the original usable physical capacity.
Using IBM FlashSystem A9000 or IBM FlashSystem A9000R with the SVC
The SVC uses the IBM FlashSystem A9000 as external storage. This goal is accomplished by the steps described in the following sections.
IBM FlashSystem A9000 tasks
Perform these tasks:
1. Zone the IBM FlashSystem A9000 or IBM FlashSystem A9000R with the SVC.
2. Define the SVC as a host on the IBM FlashSystem A9000.
3. Map volumes on the IBM FlashSystem A9000 to this SVC host.
SVC tasks
Perform these tasks (a sample CLI sequence follows the list):
1. Discover storage on the SVC. The volumes appear as managed disks.
2. Assign these managed disks to a pool containing only storage from this IBM FlashSystem A9000 device.
3. Allocate volumes from this pool to SVC hosts that are good deduplication targets.
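A minimal CLI sketch of these SVC tasks follows (the MDisk, pool, and volume names are hypothetical):
# Discover the newly mapped IBM FlashSystem A9000 volumes as MDisks
svctask detectmdisk
svcinfo lsmdisk
# Create a pool that contains only MDisks from this IBM FlashSystem A9000
svctask mkmdiskgrp -name A9000_Pool -ext 1024
svctask addmdisk -mdisk mdisk10:mdisk11 -mdiskgrp A9000_Pool
# Allocate volumes from this pool to hosts that are good deduplication targets
svctask mkvdisk -mdiskgrp A9000_Pool -iogrp 0 -size 500 -unit gb -name dedup_vol01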
With SVC, using IBM FlashSystem A9000 deduplication technology is simple. Figure 10-28 shows that the Deduplication attribute of the managed disk is Active.
Figure 10-28 SVC managed disks from IBM FlashSystem A9000R
Deduplication status is important because it also allows SVC to recognize and enforce restrictions:
Storage pools with deduplication MDisks should only contain MDisks from the same IBM FlashSystem A9000 or IBM FlashSystem A9000R storage controller.
Deduplication MDisks cannot be mixed in an Easy Tier enabled storage pool.
 
Note: Currently, the SVC does allow you to create compressed volumes in a storage pool with deduplication. This capability provides no benefit because the IBM FlashSystem A9000 cannot deduplicate or compress data that is already compressed. A good practice is to allow the IBM FlashSystem A9000 to perform the deduplication and compression.
 
10.5.10 Creating compressed volumes
Licensing is required to use compression on the SVC. With the SVC, RtC is licensed by capacity, per terabyte of virtual data.
For information about creating a compressed volume, see Chapter 7, “Volumes” on page 251.
10.5.11 Comprestimator
Comprestimator, a utility that estimates the expected compression ratio of existing volumes, has been built into IBM Spectrum Virtualize since V7.6.
The built-in Comprestimator is a command-line function that analyzes an existing volume and provides output showing an estimate of expected compression rate.
Comprestimator uses advanced mathematical and statistical algorithms to sample and analyze online volumes quickly and efficiently. The utility also displays its accuracy level by showing the maximum error range of the results, based on the formulas that it uses.
The following list describes the available commands:
analyzevdisk provides an option to analyze a single volume.
Usage: analyzevdisk <volume ID>
Example: analyzevdisk 0
This command can be canceled by running the analyzevdisk <volume ID> -cancel command.
The lsvdiskanalysis command lists the analysis status of one or more volumes; some might already have been analyzed and others might not. The command can be run either for all volumes on the system or per volume, similar to lsvdisk, as shown in Example 10-6.
Example 10-6 An example of the command that ran over one volume ID 0
IBM_2145:ITSO_SVC_DH8:superuser>lsvdiskanalysis 0
id 0
name SQL_Data0
state estimated
started_time 151012104343
analysis_time 151012104353
capacity 300.00GB
thin_size 290.85GB
thin_savings 9.15GB
thin_savings_ratio 3.05
compressed_size 141.58GB
compression_savings 149.26GB
compression_savings_ratio 51.32
total_savings 158.42GB
total_savings_ratio 52.80
accuracy 4.97
In this command, state can have one of the following values:
 – idle. Was never estimated and not currently scheduled.
 – scheduled. Volume is queued for estimation, and will be processed based on lowest volume ID first.
 – active. Volume is being analyzed.
 – canceling. Volume was requested to end an active analysis, and the analysis is not yet canceled.
 – estimated. Volume was analyzed and results show the expected savings of thin provisioning and compression.
 – sparse. Volume was analyzed but Comprestimator could not find enough nonzero samples to establish a good estimation.
The compression_savings_ratio is the estimated amount of space that can be saved on the storage for this specific volume, expressed as a percentage.
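For example, in Example 10-6, the compression_savings_ratio of 51.32 corresponds to 149.26 GB saved out of the 290.85 GB thin-provisioned size (149.26 / 290.85 ≈ 51.3%), and the total_savings_ratio of 52.80 corresponds to 158.42 GB saved out of the 300 GB volume capacity (158.42 / 300.00 ≈ 52.8%).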
The analyzevdiskbysystem command provides an option to run Comprestimator on all volumes within the system. The analysis process is nondisruptive and should not affect the system significantly. Analysis speed can vary depending on how full the volume is, but it should not take more than a few minutes per volume.
This process can be canceled by running the analyzevdiskbysystem -cancel command.
The lsvdiskanalysisprogress command shows the progress of the Comprestimator analysis, as shown in Example 10-7.
Example 10-7 Comprestimator progress
id vdisk_count pending_analysis estimated_completion_time
0 45 12 151012154400
 