Introduction to using IBM Storwize with PowerHA solutions on IBM i
This chapter discusses the various IBM PowerHA SystemMirror for i solution options that are available by using the IBM Storwize family of products.
The following topics are described in this chapter:
1.1, “Value proposition”
1.2, “Prerequisites”
1.3, “LUN-level switching”
1.4, “Storwize family Copy Services”
For more information about Storwize replication options, see IBM System Storage SAN Volume Controller and Storwize V7000 Replication Family Services, SG24-7574.
1.1 Value proposition
Increasing business demands for application availability are driving clients of all sizes to look for a solution that can help eliminate planned and unplanned downtime for their IT services.
An unplanned outage can have severe implications if the duration of the outage or recovery time exceeds business expectations. These implications include unexpected loss of reputation, client loyalty, and revenue. Companies that did not effectively plan for the risk of an unplanned outage, never fully completed their installation of a high-availability (HA) solution, or did not have a tested recovery plan in place are especially exposed to negative business effects.
The IBM PowerHA SystemMirror for i solution offers a complete end-to-end integrated clustering solution for HA and disaster recovery (DR). PowerHA provides a data and application resiliency solution that is an integrated extension of the IBM i operating system and storage management architecture. It also features the design objective of providing application HA through planned and unplanned outages.
Built with IBM Spectrum™ Virtualize software, the IBM Storwize family provides hybrid solutions with common functions, management, and flexibility. It includes built-in functions such as IBM Real-time Compression™, IBM Easy Tier® technology for optimizing flash and hard disk drives (HDDs), and Remote Copy and IBM FlashCopy® functions that deliver extraordinary levels of efficiency and high performance. Available in a wide range of storage systems, the Storwize family delivers sophisticated capabilities that are easy to deploy and that help control costs for growing businesses.
Storwize family systems come in a range of offerings to meet the differing needs of organizations, but they are all built on a common platform. Shared technologies and common management features mean that the Storwize family is the correct system to choose today for use tomorrow, as your storage requirements grow.
Designed to deliver the benefits of enterprise-class storage virtualization to large and small organizations alike, IBM System Storage® SAN Volume Controller provides a single point of control to support improved application availability and greater resource utilization.
Beginning with entry-level storage systems and extending through midrange block and unified storage systems, virtualization and enterprise-level systems, the Storwize family delivers innovative built-in capabilities that are ready to use from day one:
Entry storage system
IBM Storwize V3700 is an easy-to-use, efficient, and affordable entry storage system that is designed to address the growing data requirements and infrastructure consolidation needs of small and midsize businesses with sophisticated capabilities that are unusual for a system of this class.
Midrange, highly flexible storage system
IBM Storwize V5000 is a highly flexible, easy to use, and virtualized storage system that enables midsize organizations to overcome their storage challenges with advanced functions.
Midrange, block and unified storage systems
IBM Storwize V7000 and IBM Storwize V7000 Unified are highly scalable midrange, virtualized storage systems that are designed to consolidate workloads into a single system for simplicity of management, reduced cost, superb performance, and HA.
Storage virtualization system
SAN Volume Controller is a leading-edge storage virtualization system that enhances existing storage to help improve productivity and availability while it helps reduce cost.
For more information about the IBM Storwize family, see the IBM Storwize family website:
1.2 Prerequisites
This section describes the prerequisites that are needed to implement an IBM PowerHA SystemMirror for i solution together with IBM Storwize. Before installing the IBM PowerHA SystemMirror for i licensed product (5770-HAS), check whether the following requirements are in place:
IBM i 7.2 is installed on all system nodes (servers or LPARs) that are part of your HA or disaster-recovery solution.
HA Switchable Resources (option 41 of 5770-SS1) is installed on all system nodes (servers or LPARs) that are part of your HA or disaster-recovery solution.
 
Note: Licensing for HA Switchable Resources is included with 5770-HAS PowerHA SystemMirror for i 7.2. You do not have to order it separately, but it must be installed separately.
Portable Application Solution Environment (5770-SS1 option 33) is installed on all system nodes.
IBM Portable Utilities for i and OpenSSH, OpenSSL, zlib (5733-SC1 base and option 1) is installed on all system nodes.
There are three editions of PowerHA SystemMirror for i:
Express Edition. The Express Edition is intended to be the foundation for a class of High Availability / Disaster Recovery (HA/DR) offerings that are based on restarting (IPL) the logical partition (LPAR) onto a backup system for HA operations.
Standard Edition. The Standard Edition, which is 5770-HAS option 2, is targeted at a data center HA solution.
Enterprise Edition. The Enterprise Edition, which is 5770-HAS option 1, adds support for a multi-site HA and DR solution. Standard Edition must also be installed along with this option.
Table 1-1 provides an overview of the functions that are available with the different editions of IBM PowerHA SystemMirror for i.
Table 1-1 IBM PowerHA SystemMirror for i editions
IBM i HA/DR clustering functions, by Express Edition, Standard Edition, and Enterprise Edition:
Centralized cluster management
Cluster resource management
Centralized cluster configuration
Automated cluster validation
Cluster admin domain
Cluster device domain
Integrated heartbeat
Application monitoring
IBM i event/error management
Automated planned failover
Managed unplanned failover
Centralized FlashCopy
LUN-level switching
Geomirror Sync mode
Geomirror Async mode
Multi-Site HA/DR management
Metro Mirror
Global Mirror
IBM HyperSwap®
1.2.1 Preparing for an SSH connection between IBM i and Storwize system
Communication between PowerHA and the Storwize system is done by using SSH. You must create SSH key pairs and attach the SSH public key to a user on the Storwize system. The corresponding private key file is specified in the creation of the auxiliary storage pool (ASP) copy descriptions. The private key must be distributed to all nodes in the cluster and stored in the same directory on all nodes. Generation of the SSH key pair is done on IBM i from QSHELL.
Example 1-1 on page 5 shows the QSHELL commands that you use to generate the SSH keys and lists the files that are created. The directory that is used in this example to store the SSH keys does not exist on the system by default; it was created before the keys were generated.
Example 1-1 Generate SSH keys on IBM i
> cd /QIBM/UserData/HASM/hads/.ssh/
$
> ssh-keygen -t rsa -f id_rsa -N ''
Generating public/private rsa key pair.
Your identification has been saved in id_rsa.
Your public key has been saved in id_rsa.pub.
The key fingerprint is:
76:49:e1:9e:04:0a:c5:e2:68:3a:d0:6b:0b:4b:e2:2e [email protected]
$
> ls -la
total 64
drwxrwsrwx 2 powerha 0 8192 Sep 21 17:02 .
drwx--Sr-x 3 qsys 0 8192 Sep 21 15:19 ..
-rw------- 1 powerha 0 1679 Sep 21 17:02 id_rsa
-rw-r--r-- 1 powerha 0 414 Sep 21 17:02 id_rsa.pub
$
 
Note: The SSH key pair that is generated here and used by PowerHA is in OpenSSH key format. It cannot be used directly by PuTTY, which expects keys in its own PuTTY private key (.ppk) format.
After generating the SSH key pair, you must import the id_rsa.pub file as a key into a user on the Storwize system. This user must have a minimum role of Copy Operator to perform the functions that are used by PowerHA. If you want to use LUN-level switching, make sure that the user has a minimum role of Administrator because PowerHA must change host attachments when switching the independent auxiliary storage pools (IASP) to the secondary system.
Figure 1-1 shows transferring the id_rsa.pub file to the PC and importing it to the Storwize user. Make sure to distribute the id_rsa private key file to the same directory on all nodes in your cluster. The user profile QHAUSRPRF must have at least *R data authority to the key file on each system.
Figure 1-1 Import the SSH public key to a Storwize user
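If you prefer the command line over the GUI that is shown in Figure 1-1, the public key can also be attached to a Storwize user from the Storwize CLI. The following lines are only a sketch: the user name powerha and the IP address placeholder are hypothetical, and the sketch assumes that the id_rsa.pub file was first copied to the configuration node (for example, into /tmp):
# Copy the public key from IBM i to the Storwize configuration node (run from QSHELL)
scp /QIBM/UserData/HASM/hads/.ssh/id_rsa.pub superuser@<storwize_ip>:/tmp/id_rsa.pub
# On the Storwize CLI, create a user in the CopyOperator group and attach the key
# (use -usergrp Administrator instead if you plan to use LUN-level switching)
mkuser -name powerha -usergrp CopyOperator -keyfile /tmp/id_rsa.pub
# Alternatively, attach the key to an existing user
chuser -keyfile /tmp/id_rsa.pub powerha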
1.2.2 Initializing IBM i disk units on the backup nodes
Before setting up Copy Services on Storwize, you must initialize and format the read-protected DPHxxx disk units on the IBM i backup nodes so that they become usable for the IASP.
This task can be done in System Service Tools (SST) by choosing option 3 (Work with disk units), then selecting option 3 (Work with disk unit recovery), then selecting option 2 (Disk unit problem recovery procedure), and then selecting option 1 (Initialize and format disk unit). Failing to do so can result in IASP disk units not showing up after a switchover or failover to the secondary system.
1.3 LUN-level switching
LUN-level switching, as shown in Figure 1-2 on page 7, supports an automated IBM i HA solution by using a single copy of an IASP group that can be switched between two IBM i cluster nodes for a local HA solution.
LUN-level switching is supported for NPIV attachment and for native attachment of the IBM Storwize family. It is also supported for a SAN Volume Controller split cluster environment, which can make this an attractive solution, especially in heterogeneous environments where the SAN Volume Controller split cluster is used as the basis for a cross-platform, two-site HA solution.
Figure 1-2 LUN-level switching
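Conceptually, a LUN-level switch moves the host mappings of the IASP volumes from the production IBM i host object to the backup host object on the Storwize system. PowerHA performs this automatically during a switchover or failover, based on the recovery domain information in the ASP copy description; the following Storwize CLI sketch, with hypothetical host and volume names, only illustrates the underlying mechanics:
# Remove the IASP volume mapping from the production IBM i host object
rmvdiskhostmap -host IBMI_PROD IASP1_VOL001
# Map the same volume to the backup IBM i host object
mkvdiskhostmap -host IBMI_BACKUP IASP1_VOL001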
1.3.1 SAN Volume Controller split cluster
In a standard SAN Volume Controller configuration, all nodes are physically located within the same rack. Beginning with Version 5.1, support was provided for split-cluster configurations where the nodes within an I/O group can be physically separated from one another by up to 10 km. This capability allows nodes to be placed in separate failure domains, which provides protection against failures that affect a single failure domain.
The initial support for split cluster that was delivered in Version 5.1 contained the restriction that all communication between the SAN Volume Controller node ports cannot traverse Inter-Switch Links (ISLs). This limited the maximum supported distance between failure domains. Starting with SAN Volume Controller Version 6.3, the ISL restriction was removed, which allowed the distance between failure domains to be extended to 300 km. Additionally, in SAN Volume Controller Version 6.3, the maximum supported distance for non-ISL configurations was extended to 40 km.
 
Important: Make sure that you understand the influence of latency on your applications when using larger distances between the cluster nodes.
The SAN Volume Controller split-cluster configuration provides a continuous availability platform where host access is maintained if any single failure domain is lost. This availability is accomplished through the inherent active architecture of the SAN Volume Controller with the use of volume mirroring. During a failure, the SAN Volume Controller nodes and associated mirror copy of the data remain online and available to service all host I/O.
The split-cluster configuration uses the SAN Volume Controller volume mirroring function. Volume mirroring allows the creation of one volume with two copies of MDisk extents. The two data copies, if placed in different MDisk groups, allow volume mirroring to eliminate the effect on volume availability if one or more MDisks fail. The resynchronization between both copies is incremental and is started by the SAN Volume Controller automatically. A mirrored volume has the same functions and behavior as a standard volume. In the SAN Volume Controller software stack, volume mirroring is below the cache and Copy Services. Therefore, FlashCopy, Metro Mirror, and Global Mirror have no awareness that a volume is mirrored. All operations that can be run on non-mirrored volumes can also be run on mirrored volumes.
LUN-level switching is commonly used with a SAN Volume Controller split cluster setup. This setup provides storage HA through the SAN Volume Controller setup and server HA through PowerHA. Another common setup is a three-node cluster where local HA is achieved by LUN-level switching and DR is possible on the third node in the cluster by using Metro or Global Mirror. In this three-node setup, you often use a SAN Volume Controller split cluster on the primary site and a Storwize V7000 or V3700 at the secondary site. Within the Storwize family, remote copy operations, such as Global Mirror and Metro Mirror, are possible between the different members of the family.
 
Important: Although SAN Volume Controller split cluster is a supported environment with PowerHA, there is no automatic change of the SAN Volume Controller preferred node when a failover or switchover is done. Because of the latency that is involved when you are reading from the remote SAN Volume Controller node, make sure that the distance between the SAN Volume Controller nodes is close enough to meet your disk response time expectations. In addition, you must have separate Fibre Channel adapters for the system ASP and the IASP. Also, ensure that you set the preferred node correctly when you are creating the vDisks on the SAN Volume Controller because changing this setting later requires the attached IBM i LPAR to be powered down.
1.4 Storwize family Copy Services
The Storwize family offers a common platform and single point of control for regular provisioning and management of heterogeneous storage, and for advanced functions, such as Copy Services, that are enabled by the Storwize family virtualization layer even between storage systems of different architectures or from different vendors.
The following Copy Services functions are available for the Storwize family:
Metro Mirror for synchronous remote replication (see 1.4.1, “Metro Mirror” on page 9)
Global Mirror for asynchronous remote replication (see 1.4.2, “Global Mirror” on page 13)
FlashCopy for point-in-time volume copies (see 1.4.3, “FlashCopy” on page 15)
For more information about the Storwize family, see these publications:
Implementing the IBM System Storage SAN Volume Controller V7.4, SG24-7933
IBM SAN Volume Controller 2145-DH8 Introduction and Implementation, SG24-8229
Implementing the IBM Storwize V7000 and IBM Spectrum Virtualize V7.6, SG24-7938
Implementing the IBM Storwize V5000, SG24-8162
Implementing the IBM Storwize V3700, SG24-8107
1.4.1 Metro Mirror
Metro Mirror is a synchronous remote copy relationship between two Storwize volumes (VDisks) of equal (virtual) size. When a remote copy relationship (either Metro Mirror or Global Mirror) is established, the preferred primary volume is designated as the master volume and the preferred secondary volume as the auxiliary volume. While the secondary volume is available, every host write I/O that is sent to the Metro Mirror primary volume is acknowledged back to the host only after it is committed to the write cache of both the primary and the secondary Storwize systems (Figure 1-3).
Figure 1-3 Metro Mirror Storwize family write I/O processing
The role of a master or auxiliary volume is either primary or secondary, depending on the direction or failover state of the current remote copy relationship. Up to 2048 remote copy relationships are supported in a two-node Storwize cluster.
With the Storwize family, establishing a Copy Services relationship is done in two phases: the relationship is created first and then started in a second step. This is different from, for example, IBM System Storage DS8000 Copy Services, where establishing a Copy Services relationship is done in a single step that creates the out-of-sync bitmaps and starts the relationship automatically at its creation.
When creating the Metro Mirror relationship, the user can specify whether the auxiliary volume is already in sync with the master volume, in which case the background copy process is skipped. The in-sync option (path 1a in Figure 1-4 on page 10) is intended to be used when the volumes were created with the format option. It should not be used for IBM i volumes because IBM i specially formats the volumes itself when they are configured (that is, added to an IBM i ASP). Hence, using the Storwize format option at volume creation and the in-sync option when creating a Metro Mirror relationship does not make sense for IBM i.
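For illustration, the following Storwize CLI sketch creates and starts a Metro Mirror relationship for a single IASP volume. The volume, consistency group, and remote cluster names are hypothetical; in a PowerHA environment, the relationships are placed into a consistency group per IASP (see “Consistency groups” on page 12):
# Create the relationship (two-phase approach: create first, start later)
mkrcrelationship -master IASP1_VOL001 -aux IASP1_VOL001_MM -cluster REMOTE_SVC -consistgrp CG_IASP1
# Start the consistency group; the initial background copy begins
startrcconsistgrp CG_IASP1
# The -sync flag on mkrcrelationship would skip the background copy; as described above,
# it is not appropriate for IBM i volumes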
Bandwidth thresholds
At initial synchronization, data is copied in data chunks, called grains, of 256 KB by a background copy process from the primary to the secondary remote copy volume, with a default bandwidth limit of 50 MBps between both Storwize clusters. This partnership bandwidth limit is divided evenly among the nodes in the cluster and is an attribute of the remote copy partnership between both Storwize systems. The partnership bandwidth should be set lower than the available replication link bandwidth between both systems when host write I/O updates are still expected during synchronization, and can be set equal to the available link bandwidth when no relevant host write updates are expected during synchronization.
Also, this overall cluster partnership bandwidth should be chosen deliberately to not exceed the capabilities of the primary and auxiliary storage systems to prevent performance impacts for the foreground host I/O.
Additionally, there is also a relationship bandwidth limit for the maximum background copy rate for each remote copy relationship that defaults to 25 MBps and is an attribute of the Storwize cluster configuration.
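As a sketch of where these two limits are configured, assuming a hypothetical remote system named REMOTE_SVC (the partnership parameter names vary by Storwize code level):
# Display the partnership and its bandwidth attributes
lspartnership
# Set the partnership background copy bandwidth; depending on the code level, this is either
chpartnership -bandwidth 100 REMOTE_SVC
# or, on newer levels, the link bandwidth in Mbps plus a background copy percentage
chpartnership -linkbandwidthmbits 1000 -backgroundcopyrate 50 REMOTE_SVC
# Set the per-relationship background copy limit (cluster-wide attribute, in MBps)
chsystem -relationshipbandwidthlimit 25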
Remote copy relationship states
A Metro Mirror or Global Mirror remote copy volume relationship can be in one of these states:
Consistent stopped
Inconsistent stopped
Consistent synchronized
Inconsistent copying
Idling
Figure 1-4 shows an overview of these states as they apply to a connected remote copy relationship and the conditions that cause a state transition.
Figure 1-4 SAN Volume Controller/V7000 remote copy volume states and transitions
The remote copy states can be described as follows:
Inconsistent stopped: State after creating a remote copy relationship (without using the in-sync option) or after a failure condition occurred while the relationship was in the inconsistent copying state. The secondary volume data is not consistent with the primary volume data and, because of the risk of (undetected) data inconsistency, should not be accessed by an application server.
Inconsistent copying: State after starting an inconsistent stopped or idling relationship with changes to be synchronized, with a background copy process running to copy data from the primary to the secondary volume. The primary is accessible for read and write I/O, but the secondary is offline (that is, not accessible for either read or write I/O) while the background copy is running and the relationship is not consistent.
Consistent synchronized: State of an inconsistent copying relationship after completion of the background copy process, or of a restarted consistent stopped relationship. The primary volume is accessible for read and write I/O, but the secondary volume is accessible only for read I/O. A switch of the remote copy direction does not change this state.
Stopping the relationship takes it to the consistent stopped state.
Stopping the relationship with the -access parameter takes it to the idling state.
Switching the relationship leaves it in the consistent synchronized state but reverses the primary and secondary roles.
Consistent stopped: State after stopping a consistent synchronized relationship or after it encountered an error that forced a consistency freeze. The secondary contains a consistent image, but it might be out-of-date regarding the primary, which might have received write updates from the host after the relationship entered this state. Restarting a non-synchronized relationship that has had changes requires the -force CLI command parameter.
Idling: State after stopping a consistent synchronized relationship with write access enabled for the secondary volume (the -access parameter). Both the master and auxiliary volumes operate in the primary role, so both are accessible for read and write I/O.
Changes are tracked for both the master and auxiliary volumes so that when the remote copy relationship is started again in the wanted direction, which is specified by the required -primary CLI command parameter, only a partial synchronization for the changed grains is needed. Restarting a non-synchronized relationship that has had changes requires the -force CLI command parameter.
In addition to these states that are valid for a connected remote copy relationship (that is, one where the primary system can communicate with the secondary system), there is also a disconnected state of remote copy relationships where the primary system can no longer communicate with the secondary system. When the clusters can communicate again, the relationships automatically become connected again.
If the relationship or consistency group becomes disconnected, the secondary side transitions to the consistent disconnected or inconsistent disconnected state, and the primary side transitions to the idling disconnected state.
The Storwize system logs informational events like remote copy relationship changes, loss of synchronization, or remote cluster communication errors in an error log for which SNMP traps, email notification, or syslog server messages can be configured to trigger either automation or alert the user for manual intervention.
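The current state of a relationship or consistency group can be queried at any time from the Storwize CLI, for example (the relationship and group names are hypothetical):
# Show the state of all remote copy relationships
lsrcrelationship
# Show details, including state and synchronization progress, for one relationship or group
lsrcrelationship IASP1_VOL001_RCREL
lsrcconsistgrp CG_IASP1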
Consistency groups
Consistency is a concept of the storage system ensuring that write I/Os are processed in the same order in which the write updates are received from the host, and maintaining this order even with Copy Services relationships. It applies to a single relationship, but can also be applied to a set of relationships spanning multiple volumes by using consistency groups.
For Metro Mirror and Global Mirror remote copy relationships, maintaining the correct write order processing requires that, if an error event causes loss of replication for only a subset of an application server's remote copy relationships, remote write update processing for the non-affected remote copy relationships is automatically stopped as well, to ensure application server data consistency at the secondary site.
For FlashCopy point-in-time volume copies, maintaining consistency and correct write order processing requires that write I/O to all volumes of an application server in a consistency group is temporarily put on hold until all FlashCopy volume relationships for the consistency group are started. The storage system depends on the concept of dependent writes being implemented in the application logic to ensure consistency across multiple volumes in a consistency group (for example, that a journal is updated with the intended database update before the database itself is updated). This application logic write dependency ensures that when a SCSI queue full status is set as part of the consistency group formation for a volume, further dependent application writes are put on hold by the application so that the storage system can proceed setting SCSI queue full status for all remaining volumes and ensure dependent write data consistency for all volumes in the consistency group. This write dependency concept still applies for IBM i with its single-level storage architecture, as IBM i SLIC storage management holds off all I/O to a disk unit in a SCSI queue full condition, but does not stop the I/O to other disk units that are still available for I/O operations.
Up to 127 FlashCopy consistency groups with up to 512 FlashCopy volume relationships in a consistency group are supported in a Storwize system. For Metro Mirror and Global Mirror, up to 256 remote mirror consistency groups are supported, with no limit imposed on the number of either Metro Mirror or Global Mirror remote copy relationships other than the limit of 2048 volumes that are supported per I/O node pair.
PowerHA SystemMirror for i inherently uses consistency groups for Storwize FlashCopy relationships and requires them to be configured for Metro Mirror or Global Mirror relationships. Due to the IBM i single-level storage architecture, which stripes the data across all disk units of an ASP, consistency groups should be defined on an IASP group level.
 
Note: Stand-alone volume copy relationships and consistency groups share a common configuration and state model. All volume copy relationships in a consistency group that is not empty have the same state as the consistency group.
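The following sketch shows how a remote copy consistency group is created and how an existing stand-alone relationship is added to it; the group, relationship, and remote cluster names are hypothetical:
# Create an empty consistency group that is associated with the remote cluster
mkrcconsistgrp -name CG_IASP1 -cluster REMOTE_SVC
# Add an existing stand-alone relationship to the consistency group
chrcrelationship -consistgrp CG_IASP1 IASP1_VOL001_RCREL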
1.4.2 Global Mirror
Global Mirror is an asynchronous remote copy relationship between two Storwize volumes (VDisks) of equal (virtual) size. When a remote copy relationship (either Metro Mirror or Global Mirror) is established, the preferred primary volume is designated as the master volume and the preferred secondary volume as the auxiliary volume. Every host write I/O sent to the Global Mirror primary volume is acknowledged back to the host after it is committed to the write cache of both nodes for the corresponding I/O group of the primary Storwize system. Later (that is, asynchronously), this write update is sent by the primary Storwize to the secondary Storwize system (Figure 1-5). Global Mirror provides the capability to perform remote copy over long distances, up to the maximum supported round-trip latency of 80 ms, exceeding the performance-related limitations of synchronous remote copy without host write I/O performance impacts caused by remote replication delays.
Figure 1-5 Global Mirror Storwize family write I/O processing
Though the data is sent asynchronously from the primary to the secondary Storwize system, the write ordering is maintained by sequence numbers that are assigned to acknowledged host write I/Os, with the secondary applying the writes in order of their sequence numbers. Consistency of the data at the remote site is always maintained. However, during a failure condition, the data at the remote site might be missing recent updates that were not yet sent or that were in-flight when a replication failure occurred, so using journaling to allow for proper crash-consistent data recovery is of key importance.
Global Mirror volume relationship states and transitions are identical to those for Metro Mirror, as described in “Remote copy relationship states” on page 10.
A log file is used by the Storwize system for Global Mirror to maintain write ordering and help prevent host write I/O performance impacts when the host writes to a disk sector that is either in the process of being transmitted or, due to bandwidth limits, is still waiting to be transmitted to the remote site. The Storwize system also uses shared sequence numbers to aggregate multiple concurrent (and dependent) write I/Os to minimize its Global Mirror processing impact.
Global Mirror link tolerance
Global Mirror uses a special link tolerance parameter, defined at the cluster level, that specifies the duration (with a default of 300 seconds) for which inadequate intercluster link performance with write response times above 5 ms is tolerated. If this tolerated duration of degraded performance, during which the Storwize system must hold off writes to the primary volumes with the effect of synchronous replication-like degraded performance, is exceeded, the system stops the busiest active Global Mirror relationship or consistency group to help protect the application server's write I/O performance, and logs an event with error code 1920. The link tolerance can be disabled by setting its value to 0. However, doing so provides no protection for the application server's write I/O performance in cases where there is congestion on either the replication link or the auxiliary storage system. Although you can use the link tolerance setting to define a period of accepted performance degradation, it is important to size the remote copy replication bandwidth correctly for the peak write I/O throughput and possible resynchronization workload, to help prevent longer production workload performance impacts.
The concept of consistency groups to ensure write-dependent data consistency applies to Global Mirror the same as previously described for Metro Mirror. Consistency groups are required to be configured for PowerHA SystemMirror for i.
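A Global Mirror relationship is created like a Metro Mirror relationship, with the addition of the -global flag; the link tolerance and the tolerated host write delay are cluster-wide settings. The following sketch uses hypothetical names, and the availability of the gmmaxhostdelay parameter depends on the code level:
# Create an asynchronous (Global Mirror) relationship in a consistency group
mkrcrelationship -master IASP1_VOL001 -aux IASP1_VOL001_GM -cluster REMOTE_SVC -consistgrp CG_IASP1 -global
# Review or adjust the link tolerance (seconds, default 300; 0 disables it)
chsystem -gmlinktolerance 300
# Adjust the tolerated host write delay threshold (milliseconds, default 5)
chsystem -gmmaxhostdelay 5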
Global Mirror with Change Volumes
Global Mirror with Change Volumes (GMCV) provides asynchronous replication based on point-in-time copies of data. It allows for effective replication over lower bandwidth networks and reduces any impact on production hosts. PowerHA supports GMCV.
Before the release of GMCV, Global Mirror ensured a consistent copy on the target side, but the recovery point objective (RPO) was not tunable and was usually within seconds. This solution required enough bandwidth to support the peak workload.
GMCV uses FlashCopy internally to ensure the consistent copy, but offers a tunable RPO, called a cycling period. GMCV might be appropriate when bandwidth is an issue, although if bandwidth cannot support the replication, the cycling period might need to be adjusted from seconds up to 24 hours.
Figure 1-6 shows a high-level conceptual view of GMCV. GMCV uses FlashCopy to maintain image consistency and to isolate host volumes from the replication process.
Figure 1-6 Global Mirror with Change Volumes
A FlashCopy mapping, called a change volume, exists on both the source and the target side. When replication begins, all data is sent from the source to the target, and then changes are tracked on the source change volume. At the end of each cycle period, the changes that are accumulated in the source change volume are sent to the target volume, which then stores that set of data as a consistent copy. If it takes longer than the cycling period to send the changes, then the next cycle period starts when the previous one finishes. The cycling period default is 300 seconds (5 minutes), but it can be adjusted by the user up to a maximum of 24 hours.
Change volumes hold point-in-time copies of 256 KB grains. If there is a change to any of the disk blocks in a given grain, that grain is copied to the change volume to preserve its contents. Change volumes are also maintained at the secondary site so that a consistent copy of the volume is always available even when the secondary volume is being updated.
GMCV also sends only one copy of a changed grain, which might be rewritten many times within the given cycle period. If the primary volume fails for any reason, GMCV ensures that the secondary volume holds the same data that the primary did at a given point. That period of data loss might be 5 minutes - 24 hours, but varies according to the design choices that you make.
Primary and change volumes are always in the same I/O group, and the change volumes are always thin-provisioned. Change volumes cannot be mapped to hosts and used for host I/O, and they cannot be used as a source for any other FlashCopy or Global Mirror operation.
The RPO is determined by how long it takes for the cycle to complete. If the cycle completes within the configured cycle time, then the maximum RPO is 2 * cycle time. If all the changes cannot be written within the cycle period, then the maximum RPO is the sum of the previous and current cycle times. The cycle time should be configured so that it matches your RPO expectations (the cycle time should be no more than half of your RPO), but also matches the available bandwidth so that you do not regularly run into a situation where the source change volumes cannot be transferred to the target change volumes within the cycle time.
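As a sketch, and assuming that thin-provisioned change volumes were already created for both the master and the auxiliary volume, an existing Global Mirror relationship can be converted to GMCV from the Storwize CLI roughly as follows (the cycling mode can be changed only while the relationship is stopped; the names are hypothetical):
# Stop the relationship (or its consistency group) before changing the cycling mode
stoprcrelationship IASP1_VOL001_RCREL
# Attach the change volumes; the auxiliary change volume is defined on the secondary system
chrcrelationship -masterchange IASP1_VOL001_CHG IASP1_VOL001_RCREL
chrcrelationship -auxchange IASP1_VOL001_GM_CHG IASP1_VOL001_RCREL
# Enable multi-cycling mode and set the cycle period in seconds; with a 300-second cycle
# that completes in time, the maximum RPO is 2 x 300 = 600 seconds
chrcrelationship -cyclingmode multi IASP1_VOL001_RCREL
chrcrelationship -cycleperiodseconds 300 IASP1_VOL001_RCREL
# Restart the relationship
startrcrelationship IASP1_VOL001_RCREL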
1.4.3 FlashCopy
The Storwize family FlashCopy function provides the capability to perform a point-in-time copy of one or more volumes (VDisks). In contrast to the remote copy functions of Metro Mirror and Global Mirror, which are intended primarily for DR and HA purposes, FlashCopy is typically used for online backup or creating a clone of a system or IASP for development, testing, reporting, or data mining purposes.
FlashCopy is supported only within the same Storwize system (although that Storwize system can consist of internal disks and virtualized external storage systems). Up to 4096 FlashCopy relationships are supported per Storwize system, and up to 256 copies are supported per FlashCopy source volume. With Storwize V6.2 and later, a FlashCopy target volume can also be a non-active remote copy primary volume, which eases restores from a previous FlashCopy in a remote copy environment by using the FlashCopy reverse function.
PowerHA SystemMirror for i supports these Storwize family FlashCopy functions, which are described in more detail below:
FlashCopy no-copy and background copy
Thin-provisioned (space-efficient) FlashCopy targets
Incremental FlashCopy
Reverse FlashCopy
Multi-target FlashCopy (by using separate ASP copy descriptions for each target)
I/O indirection
The FlashCopy indirection layer, which is logically below the Storwize cache, acts as an I/O traffic director for active FlashCopy relationships. To preserve the point-in-time copy nature of a FlashCopy relationship, the host I/O is intercepted and handled according to whether it is directed at the source volume or at the target volume, whether it is a read or a write, and whether the corresponding grain is already copied. Figure 1-7 and Figure 1-8 illustrate the different processing of read and write I/O for active FlashCopy relationships by the indirection layer.
Figure 1-7 Storwize family FlashCopy read processing
Figure 1-8 Storwize family FlashCopy write processing
Although a fixed grain size of 256 KB is used for remote mirror volume relationships, for FlashCopy you can choose either the default grain size of 256 KB or, alternatively, the smaller grain size of 64 KB as the granularity for tracking and managing the out-of-sync data of a FlashCopy relationship. The concept of consistency groups to ensure dependent write data consistency across multiple volume copy relationships also applies to FlashCopy (see “Consistency groups” on page 12).
Background copy
A FlashCopy relationship can either be a no-copy or a background copy relationship. With a background copy relationship, every grain of the source volume is copied to the target volume. By default (that is, if the autodelete option is not specified), the relationship is retained even after all grains are copied. For a no-copy relationship, only grains that are modified on the source after starting the FlashCopy relationship are copied from the source volume to the target volume before the source grain is allowed to be updated (copy-on-write processing), unless the corresponding grain on the target volume was already updated by a host accessing the target volume.
An option for FlashCopy is creating an incremental FlashCopy relationship, which uses background copy to copy all of the data from the source to the target for the first FlashCopy, and then only the changes that occurred since the previous FlashCopy for all subsequent FlashCopy copies that are started for the relationship. When creating a FlashCopy relationship, you can specify a copy rate for the background copy process, which can either be 0 (meaning that a FlashCopy no-copy relationship without a background copy is established) or any value 1 - 100, which converts to the background copy throughput values that are shown in Table 1-2.
Table 1-2 FlashCopy background copy rates
Copy rate value     Data copied per second     256 KB grains per second     64 KB grains per second
1 - 10              128 KBps                   0.5                          2
11 - 20             256 KBps                   1                            4
21 - 30             512 KBps                   2                            8
31 - 40             1 MBps                     4                            16
41 - 50 (default)   2 MBps                     8                            32
51 - 60             4 MBps                     16                           64
61 - 70             8 MBps                     32                           128
71 - 80             16 MBps                    64                           256
81 - 90             32 MBps                    128                          512
91 - 100            64 MBps                    256                          1024
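The copy rate is specified when the FlashCopy mapping is created and can be changed at any time, even while the mapping is active. A brief Storwize CLI sketch with hypothetical volume and mapping names:
# Create a background copy mapping with the default copy rate of 50 (2 MBps per mapping)
mkfcmap -source IASP1_VOL001 -target IASP1_VOL001_FC -name IASP1_FCMAP1 -copyrate 50
# Change the copy rate later, for example to 0 to convert it to a no-copy mapping
chfcmap -copyrate 0 IASP1_FCMAP1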
A FlashCopy relationship is established on a Storwize system in three steps:
1. Creating a FlashCopy relationship
This action triggers the internal creation of a FlashCopy out-of-sync bitmap that is used by the Storwize system for tracking the grains needing to be copied.
2. Preparing a FlashCopy relationship or consistency group
This action achieves consistency for the volumes by destaging the source volume’s modified data to disk, putting it in write-through mode, discarding the target volume’s cache data, and rejecting any I/O to the target volume.
3. Starting a FlashCopy relationship or consistency group
This action briefly pauses the I/O to the source volumes until all reads and writes below the Storwize cache layer complete and starts the actual FlashCopy relationship. The logical dependency between the source and target volume is established in the Storwize indirection layer.
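Expressed in Storwize CLI terms, and by using a consistency group with hypothetical names, the three steps look roughly as follows (PowerHA drives these steps for you when you use its FlashCopy support):
# 1. Create the consistency group and the FlashCopy mapping (no-copy in this example)
mkfcconsistgrp -name FC_CG_IASP1
mkfcmap -source IASP1_VOL001 -target IASP1_VOL001_FC -consistgrp FC_CG_IASP1 -copyrate 0
# 2. Prepare the consistency group (destage source cache, discard target cache)
prestartfcconsistgrp FC_CG_IASP1
# 3. Start the consistency group to establish the point-in-time copy
startfcconsistgrp FC_CG_IASP1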
FlashCopy relationship states
A FlashCopy volume relationship can be in any of the following states:
Idle
Copied
Copying
Stopped
Stopping
Suspended
Preparing
Prepared
The FlashCopy states are described here:
Idle or copied: A mapping between the source and target volume exists, but the source and the target behave as independent volumes.
Copying: The background copy process is copying grains from the source to the target. Both the source and the target are available for read and write I/O, but the target depends on the source for grains that are not copied yet.
Stopped: The FlashCopy relationship was stopped either by the user or by an I/O error. The source volume is still accessible for read and write I/O, but the target volume is taken offline because data integrity is not provided. From the stopped state, the relationship can either be started again, in which case the previous point-in-time image is lost, or deleted if it is not needed anymore.
Stopping: The relationship is in the process of transferring data to a dependent relationship. The source volume remains accessible for I/O, but the target volume remains online only if the background copy process completed; it is put offline if the background copy process did not complete while the relationship was in the copying state. Depending on whether the background copy completed, the relationship moves either to the idle/copied state or to the stopped state.
Suspended: This is the state of a relationship when access to metadata that is used by the copy process is lost. Both the source and the target are taken offline and the background copy is put on hold. When metadata becomes available again, the relationship returns to the copying state or the stopping state.
Preparing: The source volume is placed in write-through mode, and modified data of the source volume is destaged from the SAN Volume Controller/V7000 cache to create a consistent state of the source volume on disk in preparation for starting the relationship. Any read or write data that is associated with the target volume is discarded from the cache.
Prepared: The relationship is ready to be started, with the target volume in an offline state. Write performance for the source volume can be degraded because it is in write-through mode.
Thin-provisioned FlashCopy
In addition to regular, fully provisioned volumes (VDisks), which at their creation have the full physical storage capacity allocated corresponding to their volume capacity, the Storwize family also supports thin-provisioned or space-efficient volumes, which are created with a virtual capacity that is reported to the host and that is higher than the physical capacity that is pre-allocated to the volume. If the thin-provisioned volume is created with the autoexpand option, additional extents, up to the virtual capacity of the volume, are automatically allocated from the storage pool (MDisk group) as needed when the currently allocated physical capacity is exhausted.
 
Note: Using thin-provisioned Storwize volumes for IBM i is supported only for FlashCopy target volumes, not for volumes that are directly assigned to an IBM i host.
Using thin-provisioned volumes also does not make sense for remote copy secondary volumes, which become fully allocated at initial synchronization. Similarly, using thin-provisioned volumes for FlashCopy targets makes sense only for FlashCopy no-copy relationships that are used for a limited duration and have limited changes.
For optimal performance of thin-provisioned FlashCopy, the grain size for the thin-provisioned volume that is used as the FlashCopy target should match the grain size of the FlashCopy relationship. Additionally, for space-efficiency reasons, to help minimize physical storage allocations on the thin-provisioned target, consider also using the small grain size of 64 KB.
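A thin-provisioned FlashCopy target volume with the 64 KB grain size can be created as in the following sketch; the pool, size, and volume names are hypothetical, and the volume size must match the (virtual) size of the source volume:
# Create a thin-provisioned (space-efficient) target volume with 2% real capacity,
# automatic expansion, and a 64 KB grain size
mkvdisk -mdiskgrp POOL0 -iogrp 0 -size 100 -unit gb -name IASP1_VOL001_FC -rsize 2% -autoexpand -grainsize 64
# Use the matching 64 KB grain size for the FlashCopy no-copy mapping
mkfcmap -source IASP1_VOL001 -target IASP1_VOL001_FC -copyrate 0 -grainsize 64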
Multi-target and reverse FlashCopy
As previously mentioned, a single FlashCopy source volume supports up to 256 target volumes. Creating and maintaining multiple targets from a FlashCopy source volume from different times might be useful, for example, to have multiple points to restore from by using the reverse FlashCopy function (Figure 1-9).
Figure 1-9 Storwize family multi-target and reverse FlashCopy
A key advantage of the Storwize reverse FlashCopy function is that it does not destroy the original source and target relationship, so any processes that use the target (such as tape backup jobs) can continue to run uninterrupted. It does not require a possible background copy process to have completed, and regardless of whether the initial FlashCopy relationship is incremental, the reverse FlashCopy function copies data only from the original target to the source for grains that were modified on the source or target.
Consistency groups cannot contain more than one FlashCopy relationship with the same target volume, so they must be reversed by creating a set of reverse FlashCopy relationships and adding them to the new reverse consistency group.
The Storwize family performs the copy-on-write processing for multi-target relationships in a way that data is not copied to all targets, but only once from the source volume to the newest target, so that older targets refer to newer targets first before referring to the source. Newer targets together with the source can be regarded as a composite source for older targets.
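As a sketch of a reverse FlashCopy restore with hypothetical names, a second mapping is defined in the opposite direction and started with the -restore parameter, so that the original forward mapping is preserved:
# Define the reverse mapping from the earlier target back to the source
mkfcmap -source IASP1_VOL001_FC -target IASP1_VOL001 -name IASP1_FCMAP1_REV -copyrate 50
# Start the restore; -restore allows starting a mapping whose target is the
# source of other active mappings
startfcmap -prep -restore IASP1_FCMAP1_REV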