Planning
This chapter describes the steps that are required to plan the installation of an IBM Storwize V7000 in your storage network.
This chapter includes the following topics:
Planning IP connectivity
Planning your storage network
Planning back-end storage connectivity, storage pools and volumes
Planning your host connectivity
Planning for advanced copy services and data migration
Performance considerations
3.1 General planning rules
 
Important: At the time of writing, the statements provided in this book are correct, but they might change. Always verify any statements that are made in this book with the IBM Storwize V7000 supported hardware list, device driver, firmware, and recommended software levels that are available at the following websites:
Support Information for Storwize V7000
IBM System Storage Interoperation Center (SSIC):
To maximize benefit from the Storwize V7000, pre-installation planning must include several important steps. These steps ensure that the Storwize V7000 provides the best possible performance, reliability, and ease of management for your application needs. A correct configuration also helps minimize downtime by reducing the need for later changes to the Storwize V7000 and the storage area network (SAN) environment to accommodate future growth.
This book is not intended to provide in-depth information about the described topics. For an enhanced analysis of advanced topics, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521, at:
3.1.1 Basic planning flow
The general rule of planning is to define your goals, and then plan a solution that can be shown to meet these goals. Always remember to verify that each element of your configuration is supported.
Below is a list of items that you should consider when planning for the Storwize V7000:
Collect and document the number of hosts (application servers) to attach to the Storwize V7000. Identify the traffic profile activity (read or write, sequential, or random), and the performance requirements (bandwidth and input/output (I/O) operations per second (IOPS)) for each host.
Decide if you are going to use Storwize V7000 to virtualize external storage. If you do, then collect and document the following:
 – Information on the existing back-end storage that is present in the environment and is intended to be virtualized by the Storwize V7000.
 – Whether you need to configure image mode volumes. If you want to use image mode volumes then decide if and how you plan to migrate them into managed mode volumes.
 – Information on the planned new back-end storage to be virtualized by the Storwize V7000.
 – The required virtual storage capacity for fully provisioned and space-efficient (SE) volumes.
 – The required storage capacity for local mirror copy (volume mirroring).
 – The required storage capacity for point-in-time copy (IBM FlashCopy).
 – The required storage capacity for remote copy (Metro Mirror and Global Mirror).
 – The required storage capacity for compressed volumes.
 – The required storage capacity for encrypted volumes.
 – Shared storage (volumes presented to more than one host) required in your environment.
Define the amount of internal storage that you plan to provision from internal arrays. Consider the following information:
 – Amount of storage in each drive tier that you plan to deploy.
 – The number of drives that you need to provision the required storage. Take into account RAID levels, the number of drives per array, and hot spares for each tier. Define the number of drives and their form factor.
 
Note: Use DRAID 6 as the default RAID level for your arrays.
 – The number of expansion enclosures you need to hold the drives required to provision the required storage.
Per host:
 – Volume capacity.
 – Logical unit number (LUN) quantity.
 – Volume sizes.
 
Note: When planning the capacities, note explicitly whether the numbers state the net storage capacity (that is, the capacity available to applications running on the hosts) or the gross capacity, which includes the overhead for RAID redundancy, planned hot spare drives, and file system metadata. For file system metadata, include the overhead incurred by all layers of storage virtualization. In particular, if you plan storage for virtual machines whose drives are realized as files on a parallel file system, include the metadata overhead of the storage virtualization technology used by your hypervisor software.
Decide whether you need to plan for more than one site. For a multi-site deployment, review the additional configuration requirements that it imposes.
Define the number of clusters and the number of enclosures (1 - 4) for each cluster. The number of necessary I/O Groups depends on the overall performance requirements and the number of hosts you plan to attach.
Decide if you are going to use N_Port ID Virtualization (NPIV). If you plan to use NPIV, review the additional configuration requirements that it imposes.
Design the SAN according to the requirements for high availability (HA) and best performance. Consider the total number of ports and the bandwidth that is needed at each link, especially Inter-Switch Links (ISLs). Consider ISL trunking for improved performance. Separately collect the requirements for the Fibre Channel and IP-based storage networks.
 
Note: Check and carefully count the required ports. Separately note the ports dedicated for extended links. Especially in an enhanced stretched cluster (ESC) or HyperSwap environment, you might need additional long wave gigabit interface converters (GBICs).
Define a naming convention for the Storwize V7000 clusters, nodes, hosts, and storage objects.
Define the Storwize V7000 service Internet Protocol (IP) addresses and the system’s management IP addresses.
Define subnets for the Storwize V7000 system and for the hosts for Internet Small Computer System Interface (iSCSI) connectivity.
Define the IP addresses for IP replication (if required).
Define back-end storage that will be used by the system.
Define the managed disks (MDisks) in the back-end storage to be used by Storwize V7000.
Define the storage pools, specify MDisks for each pool and document mapping of MDisks to back-end storage. Parameters of the back-end storage determine the characteristics of the volumes in the pool. Make sure that each pool contains MDisks of similar (ideally, identical) performance characteristics.
Plan allocation of hosts and volumes to I/O Groups to optimize the I/O load distribution between the hosts and the Storwize V7000. Allowing a host to access more than one I/O group might better distribute the load between system nodes. However, doing so will reduce the maximum number of hosts attached to the Storwize V7000.
Plan queue depths for the attached hosts. For more information, see this website:
Plan for the physical location of the equipment in the rack.
Verify that your planned environment is a supported configuration.
Verify that your planned environment does not exceed system configuration limits.
Planning activities required for Storwize V7000 deployment are described in the following sections.
3.2 Planning for availability
When planning deployment of IBM Storwize V7000, avoid creating single points of failure. Plan system availability according to the requirements specified for your solution. Consider the following aspects, depending on your availability needs:
Single site or multi-site configuration
Multi-site configurations increase solution resiliency and can be the basis of disaster recovery solutions. Storwize V7000 allows the configuration of multi-site solutions, with sites working in active-active or active-standby mode. Both synchronous and asynchronous data replication are supported, with multiple inter-site link options.
If you require a cross-site configuration, see IBM Storwize V7000, Spectrum Virtualize, HyperSwap, and VMware Implementation, SG24-8317.
Physical separation of system building blocks
Dual rack deployment might increase the availability of your system if your back-end storage, SAN, and LAN infrastructure also do not use a single-rack placement scheme. You can further increase system availability by ensuring that cluster nodes are powered from different power circuits and are located in different fire protection zones.
Quorum availability
The Storwize V7000 uses three drives that are configured as array members or hot spares as quorum disks for the clustered system. If there are no external MDisks to use as quorum disks, the preferred practice is to distribute the quorum disks over SAS channels to increase system resiliency. The current locations of the quorum disks can be displayed by using the lsquorum command and relocated by using the chquorum command (see the sketch at the end of this section).
For clustered Storwize V7000 (two or more I/O Groups) systems that do not virtualize external storage (and therefore cannot place a quorum disk on an external MDisk), generally use an IP quorum device to reduce the risk of an outage in case of a split-brain cluster.
For HyperSwap/Stretched systems, it is recommended to define at least two independent IP quorum devices to prevent placement of the active quorum device in one of the sites that hold the Storwize V7000 HyperSwap/Stretched storage components. Additionally, make sure that there is at least one quorum drive in each site.
The IP quorum device acts only as a tie breaker in case of a split-brain cluster scenario. Quorum disks are still required to store cluster state.
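As a brief, hedged sketch (the drive ID and quorum index values are hypothetical, and the exact chquorum syntax should be verified in the CLI reference for your code level), the quorum assignment can be reviewed and changed as follows:
lsquorum
chquorum -drive 7 2
The first command lists the current quorum disks and shows which one is active; the second assigns drive 7 as quorum index 2.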
Failure domain sizes
Failure of an MDisk takes the whole storage pool that contains this MDisk offline. To reduce the impact of an MDisk failure, consider reducing the number of back-end storage systems per storage pool, increasing the number of storage pools, and reducing their size. Note that this configuration in turn limits the maximum performance of the pool (fewer back-end systems share the load), increases the storage management effort, can lead to less efficient storage capacity consumption, and might be subject to system configuration maximums.
For internal storage, you might be able to define arrays in such a way that they can survive an expansion enclosure failure. However, such an array design might restrict performance, available array sizes, and RAID levels, depending on your hardware configuration.
Consistency
Strive to achieve consistent availability levels of all system building blocks.
3.3 Connectivity planning
IBM Storwize V7000 offers a wide range of connectivity options, both to back-end storage and to hosts. They include Fibre Channel (FC) SAN (8 and 16 Gbps, including direct attachment for some purposes), iSCSI (with 1 Gbps and 10 Gbps ports, depending on hardware configuration), and FCoE connectivity on 10 Gbps ports.
Storwize V7000 supports SAN routing technologies between Storwize V7000 and storage systems, as long as the routing stays entirely within Fibre Channel connectivity and does not use other transport technologies such as IP. However, SAN routing technologies (including FCIP links) are supported for connections between the Storwize V7000 and hosts. The use of long-distance FCIP connections might degrade the storage performance for any servers that are attached through this technology.
Table 3-1 shows the fabric type that can be used for communicating between hosts, nodes, and back-end storage systems. All fabric types can be used at the same time.
Table 3-1 Storwize V7000 communication options
Communication type         Host to Storwize V7000   Storwize V7000 to storage   Storwize V7000 to Storwize V7000
Fibre Channel (FC) SAN     Yes                      Yes                         Yes
iSCSI (1 GbE or 10 GbE)    Yes                      Yes                         No
FCoE (10 GbE)              Yes                      Yes                         Yes
When you plan deployment of Storwize V7000, identify networking technologies that you will use.
3.4 Physical planning
You must consider several key factors when you are planning the physical site of a Storwize V7000 installation. The physical site must have the following characteristics:
Meets power, cooling, and location requirements of the Storwize V7000 nodes.
Has two separate power sources.
Has sufficient rack space for installation of controller and disk expansion enclosures.
Has a rack with a sufficient maximum power rating. Plan your rack placement carefully so that you do not exceed the maximum power rating of the rack. For more information about the power and environmental requirements, see the following website:
Your Storwize V7000 2076-524 and Storwize V7000 2076-624 order includes a printed copy of the IBM Storwize V7000 Gen2 and Gen2+ Quick Installation Guide, which also provides information about environmental and power requirements.
3.4.1 Cabling
Create a cable connection table that follows your environment’s documentation procedure to track all of the following connections that are required for the setup:
Power
Ethernet
SAS
iSCSI or Fibre Channel over Ethernet (FCoE) connections
Switch ports (FC, Ethernet, and FCoE)
Distribute your disk expansion enclosures evenly between control enclosures, nodes within control enclosures, and SAS channels within nodes. Review SAS cabling guidelines defined at this website:
When planning SAN cabling, make sure that your physical topology allows you to observe zoning rules and recommendations.
If the data center provides more than one power source, make sure that you take advantage of it when planning the power cabling for your system.
3.5 Planning IP connectivity
System management is performed through an embedded graphical user interface (GUI) running on the nodes. To access the management GUI, direct a web browser to the system management IP address.
Storwize V7000 Gen2 and Storwize V7000 Gen2+ nodes have a new feature called a Technician port. It is an Ethernet port marked with a T. All initial configuration for each node is performed by using the Technician port. The port runs a Dynamic Host Configuration Protocol (DHCP) service so that any notebook or computer connected to the port is automatically assigned an IP address.
After the cluster configuration has been completed, the Technician port automatically routes the connected user directly to the service GUI.
 
Note: The default IP address for the Technician port is 192.168.0.1. If the Technician port is connected to a switch, it is disabled and an error is logged.
Each Storwize V7000 node requires one Ethernet cable to connect it to an Ethernet switch or hub. The cable must be connected to port 1. A 10/100/1000 megabit (Mb) Ethernet connection is supported on the port. Both Internet Protocol Version 4 (IPv4) and Internet Protocol Version 6 (IPv6) are supported.
 
Note: For increased availability, an optional second Ethernet connection is supported for each Storwize V7000 node.
Ethernet port 1 on every node must be connected to the same set of subnets. The same rule applies to Ethernet port 2 if it is used. However, the subnets available for Ethernet port 1 do not have to be the same as configured for interfaces on Ethernet port 2.
Each Storwize V7000 cluster has a Cluster Management IP address, in addition to a Service IP address for each node in the cluster. See Example 3-1 for details.
Example 3-1 System addressing example
management IP add. 10.11.12.120
node 1 service IP add. 10.11.12.121
node 2 service IP add. 10.11.12.122
Each node in a Storwize V7000 clustered system needs at least one Ethernet connection. Both IPv4 and IPv6 addresses are supported. The Storwize V7000 can operate with either Internet Protocol version, or with both concurrently.
For configuration and management, you must allocate an IP address to the system, which is referred to as the management IP address. For additional fault tolerance, you can also configure a second IP address for the second Ethernet port on the node. The addresses must be fixed addresses. If both IPv4 and IPv6 are operating concurrently, an address is required for each protocol.
 
Note: The management IP address cannot be the same as any of the defined service IPs.
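The following minimal sketch shows how the management IP address from Example 3-1 might be set from the CLI. The gateway and subnet mask values are hypothetical, and the parameter syntax should be verified against the CLI reference for your code level:
chsystemip -clusterip 10.11.12.120 -gw 10.11.12.1 -mask 255.255.255.0 -port 1
This command assigns the cluster management IP address to Ethernet port 1. The per-node service IP addresses are configured separately.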
Support for iSCSI enables one additional IPv4 address, IPv6 address, or both for each Ethernet port on every node. These IP addresses are independent of the system’s management and service IP addresses.
If you configure a management IP address on both Ethernet ports, choose one of the IP addresses to connect to the GUI or CLI. Note that the system cannot automatically fail over the management IP address to a different port. If one management IP address is unavailable, use an IP address on the alternative network. Clients might be able to use the intelligence in domain name servers (DNSs) to provide partial failover. Figure 3-1 shows a simple IP addressing scheme that uses a single subnet for both iSCSI and management.
Figure 3-1 IP addressing scheme: Use of a single subnet
3.5.1 Firewall planning
After you have planned your IP network, identify the list of network flows that are required for the correct functioning of the environment. The list must specify the source IP addresses, destination IP addresses, and required protocols and ports for each flow. Present the list to the firewall administrators and request setup of the appropriate firewall rules. See https://ibm.biz/BdjGkF for the list of mandatory and optional network flows required for the operation of the system.
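As an illustration only (the authoritative list is published at the website above, and the flows that apply depend on the features you use), typical flows to plan for include the following well-known ports:
TCP 22 (SSH) from administrator workstations to the management and service IP addresses (CLI access)
TCP 443 (HTTPS) from administrator workstations to the management and service IP addresses (GUI access)
TCP 3260 (iSCSI) from host initiators to the node Ethernet ports configured for iSCSI
UDP 123 (NTP) from the system to the time server
TCP 25 (SMTP) from the system to the email server used for event notifications and Call Home
UDP 162 (SNMP traps) and UDP 514 (syslog) from the system to the monitoring and logging servers, if used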
3.6 SAN configuration planning
A Storwize V7000 cluster can be configured with a minimum of two (and up to eight) Storwize V7000 nodes. These nodes can use the SAN fabric to communicate with back-end storage subsystems and hosts.
3.6.1 Physical topology
The switch configuration in a Storwize V7000 fabric must comply with the switch manufacturer’s configuration rules, which can impose restrictions on the switch configuration. For example, a switch manufacturer might limit the number of supported switches in a SAN. Operation outside of the switch manufacturer’s rules is not supported.
The hardware compatible with V8.1 supports 8 Gbps and 16 Gbps FC fabrics, depending on the hardware platform and on the switch to which the Storwize V7000 is connected. In an environment with a fabric of multiple-speed switches, the preferred practice is to connect the Storwize V7000 and the back-end storage systems to the switches operating at the highest speed.
You can use the lsfabric command to generate a report that displays the connectivity between nodes and other controllers and hosts. This report is helpful for diagnosing SAN problems.
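For example, running the command with no parameters produces one line per login between a local node port and a remote port (a hedged sketch; the exact output columns vary by code level):
lsfabric
The output includes the local node name and WWPN, the remote WWPN, and the state of each connection, which helps to confirm that the zoning gives every node the expected paths to hosts and back-end controllers.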
A Storwize V7000 control enclosure contains a pair of nodes (an I/O Group). An odd number of nodes in a cluster is a valid configuration only if a node fails or is removed from the configuration. In such a case, the remaining node operates in a degraded mode.
For clustered systems, avoid, if possible, internode communication that routes across ISLs. Connect all nodes to the same Fibre Channel or FCF switches.
Direct connection of the system Fibre Channel ports without using a Fibre Channel switch is supported. Such direct connections between the system nodes might be useful in small configurations where there is no Fibre Channel switch. It can also be used to connect nodes in the same I/O group to provide a dedicated connection for mirroring the fast write cache data.
No more than three ISL hops are permitted among nodes that are in the same system though in different I/O groups. If your configuration requires more than three ISL hops for nodes that are in the same system but in different I/O groups, contact your support center.
Avoid ISL on the path between nodes and back-end storage. If possible, connect all storage systems to the same Fibre Channel or FCF switches as the nodes. One ISL hop between the nodes and the storage systems is permitted. If your configuration requires more than one ISL, contact your support center.
In larger configurations, it is common to have ISLs between host systems and the nodes.
To verify the supported connection speed for FC links to the Storwize V7000, use IBM System Storage Interoperation Center (SSIC) site:
For information about configuration of Enhanced Stretched Cluster or HyperSwap setup, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521.
3.6.2 Zoning
In Storwize V7000 deployments, the SAN fabric must have two distinct zone classes:
Host zones: Allow communication between the Storwize V7000 and hosts.
Storage zones: Allow communication between the Storwize V7000 and back-end storage.
In clustered configurations, a third zone is required to allow communication between the system nodes (intra-cluster traffic).
Figure 3-2 shows the Storwize V7000 zoning classes.
Figure 3-2 Storwize V7000 zoning classes
The subsequent sections contain fundamental rules of Storwize V7000 zoning. However, also review the latest zoning guidelines and requirements at the following site when designing zoning for the planned solution:
 
Note: Configurations that use Metro Mirror, Global Mirror, N_Port ID Virtualization, or long-distance links have extra zoning requirements. Do not rely on the general zoning rules alone if you plan to use any of these features.
The FCoE fabric should employ the same set of zoning rules as the Fibre Channel fabric.
3.6.3 Storwize V7000 cluster system zone
The Storwize V7000 cluster system zone is required only if you deploy a solution with more than one control enclosure. The purpose of the cluster system zone is to enable traffic between all Storwize V7000 nodes within the clustered system. This traffic consists of heartbeats, cache synchronization, and other data that nodes must exchange to maintain a healthy cluster state.
Each Storwize V7000 port must be zoned so that it can be used for internode communications. A system node cannot have more than 16 paths to another node in the same system.
Mixed port speeds are not possible for intracluster communication. All node ports within a clustered system must be running at the same speed.
Storwize V7000 supports the use of mixed fabrics for communication between nodes. The 10 GbE FCoE ports of one Storwize V7000 can be zoned to the FC ports of another node that is part of the same clustered system.
 
Note: You can use more than four fabric ports per node to improve peak load I/O performance. However, if a node receives more than 16 logins from another node, it causes node error 860. To avoid that error, use zoning, port masking, or a combination of the two. For more information, see 3.6.7, “Port designation recommendations” on page 60, 3.6.8, “Port masking” on page 61, and the IBM Storwize V7000 documentation at:
3.6.4 Back-end storage zones
Create one Storwize V7000 storage zone for each back-end storage subsystem that is virtualized by the Storwize V7000.
A storage controller can present LUNs both to the Storwize V7000 (as MDisks) and to other hosts in the SAN. However, in this case, it is better to allocate different ports on the back-end storage for communication with the Storwize V7000 and for host traffic.
All nodes in a system must be able to connect to the same set of storage system ports on each device. A system that contains any two nodes that cannot connect to the same set of storage system ports is considered degraded. In this situation, a system error is logged that requires a repair action. This rule can have important effects on a storage system. For example, an IBM DS4000® series controller can have exclusion rules that determine to which host bus adapter (HBA) worldwide node names (WWNNs) a storage partition can be mapped.
In the example shown in Figure 3-3, all back-end storage ports are connected to all Storwize V7000 ports. Therefore, each port on back-end storage is zoned to all ports of the Storwize V7000 that are connected to the same fabric. For clarity, the figure shows only paths for one back-end storage port per controller: Port P1 in Fabric 0 and port P4 in Fabric 1.
Figure 3-3 Example of back-end storage subsystem zoning
There might be particular zoning rules governing attachment of specific back-end storage systems. Review guidelines at the following website to verify whether you need to consider additional policies when planning zoning for your back-end systems:
3.6.5 Host zones
A host must be zoned to an I/O Group to access volumes presented by that I/O Group.
The preferred zoning policy is to create a separate zone for each host HBA port, and place exactly one port from each node in each I/O group that the host accesses in this zone. For deployments with more than 64 hosts defined in the system, this host zoning scheme is mandatory.
If you plan to use NPIV, review additional host zoning requirements at:
When a dual-core SAN design is used, it is a requirement that no internode communications use the ISL link. When you create host zones in this type of configuration, ensure that each system port in the host zone is attached to the same Fibre Channel switch.
Consider the following rules for zoning hosts with the Storwize V7000:
HBA to Storwize V7000 port zones
Place each host’s HBA in a separate zone with exactly one port from each node in each I/O group that the host accesses.
Zoning a host’s HBA to one port from every node in the cluster is not prohibited, but it reduces the maximum number of hosts that can be attached to the system.
 
Number of paths: For n + 1 redundancy, use the following number of paths:
With two HBA ports, zone HBA ports to Storwize V7000 ports 1:2 for a total of four paths.
With four HBA ports, zone HBA ports to Storwize V7000 ports 1:1 for a total of four paths.
Optional (n+2 redundancy): With four HBA ports, zone HBA ports to Storwize V7000 ports 1:2 for a total of eight paths.
Here, the term HBA port is used to describe the SCSI initiator and Storwize V7000 port to describe the SCSI target.
Maximum host paths per logical unit (LU)
For any volume, the number of paths through the SAN from the Storwize V7000 nodes to a host must not exceed eight. For most configurations, four paths to an I/O Group are sufficient.
 
Important: The maximum number of host paths per LUN must not exceed eight.
Another way to control the number of paths between hosts and the Storwize V7000 is to use a port mask. The port mask is an optional parameter of the mkhost and chhost commands. The port-mask configuration has no effect on iSCSI connections.
For each login between a host Fibre Channel port and node Fibre Channel port, the node examines the port mask for the associated host object. It then determines whether access is allowed (port mask bit for given port is set) or denied (port mask bit is cleared). If access is denied, the node responds to SCSI commands as though the HBA WWPN is unknown. The port mask is 64 bits. Valid mask values range from all 0s (no ports enabled) to all 1s (all ports enabled). For example, a mask of 0011 enables port 1 and port 2. The default value is all 1s.
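The following minimal sketch shows how a host-object port mask might be applied. The host name and mask value are hypothetical, and the exact parameter name of the mkhost and chhost commands should be verified against the CLI reference for your code level:
chhost -mask 0011 ESX_Host01
With this hypothetical mask, the host object ESX_Host01 is allowed to log in only through node ports 1 and 2, and logins through the remaining ports are rejected.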
Balanced host load across HBA ports
If the host has more than one HBA port per fabric, zone each host port with a separate group of Storwize V7000 ports.
Balanced host load across Storwize V7000 ports
To obtain the best overall performance of the subsystem and to prevent overloading, the load of each Storwize V7000 port should be equal. Assuming similar load generated by each host, you can achieve this balance by zoning approximately the same number of host ports to each Storwize V7000 port.
Figure 3-4 shows an example of a balanced zoning configuration that was created by completing the following steps:
1. Divide ports on the I/O Group into two disjoint sets, such that each set contains two ports from each I/O Group node, each connected to a different fabric.
For consistency, use the same port number on each I/O Group node. The example in Figure 3-4 assigns ports 1 and 4 to one port set, and ports 2 and 3 to the second set.
Because the I/O Group nodes have four FC ports each, two port sets are created.
2. Divide the hosts attached to the I/O Group into two groups of equal size.
In general, for I/O Group nodes with more than four ports, divide the hosts into as many groups as you created sets in step 1.
3. Map each host group to exactly one port set.
4. Zone all hosts from each group to the corresponding set of I/O Group node ports.
The host connections in the example in Figure 3-4 on page 58 are defined in the following manner:
 – Hosts in group one are always zoned to ports 1 and 4 on both nodes.
 – Hosts in group two are always zoned to ports 2 and 3 on both nodes of the I/O Group.
 
Tip: Create an alias for each I/O Group port set. This step makes it easier to zone hosts to the correct set of I/O Group ports. Additionally, it makes host group membership visible in the FC switch configuration.
The use of this schema provides four paths to one I/O Group for each host, and helps to maintain an equal distribution of host connections on Storwize V7000 ports.
 
Tip: To maximize performance from the host point of view, distribute volumes mapped to each host between both I/O Group nodes.
Figure 3-4 Overview of four-path host zoning
When possible, use the minimum number of paths that are necessary to achieve a sufficient level of redundancy. For the Storwize V7000 environment, no more than four paths per I/O Group are required to accomplish this layout.
All paths must be managed by the multipath driver on the host side. Make sure that the multipath driver on each server is capable of handling the number of paths required to access all volumes mapped to the host.
For hosts that use four HBAs/ports with eight connections to an I/O Group, use the zoning schema that is shown in Figure 3-5. You can combine this schema with the previous four-path zoning schema.
Figure 3-5 Overview of eight-path host zoning
For more information see Chapter 8, “Hosts” on page 317.
3.6.6 Zoning considerations for Metro Mirror and Global Mirror
The SAN configurations that use intercluster Metro Mirror and Global Mirror relationships require the following additional switch zoning considerations:
Review the latest requirements and recommendations at this website:
If there are two ISLs connecting the sites, then split the ports from each node between the ISLs. That is, exactly one port from each node must be zoned across each ISL.
Local clustered system zoning continues to follow the standard requirement for all ports on all nodes in a clustered system to be zoned to one another.
When designing zoning for a geographically dispersed solution, consider the effect of the cross-site links on the performance of the local system.
 
Important: Be careful when you perform the zoning so that ports dedicated for intra-cluster communication are not used for Host/Storage traffic in the 8-port and 12-port configurations.
The use of mixed port speeds for intercluster communication can lead to port congestion, which can negatively affect the performance and resiliency of the SAN. Therefore, it is not supported.
 
Important: If you zone two Fibre Channel ports on each node in the local system to two Fibre Channel ports on each node in the remote system, you will be able to limit the impact of severe and abrupt overload of the intercluster link on system operations.
If you zone all node ports for intercluster communication and the intercluster link becomes severely and abruptly overloaded, the local FC fabric can become congested so that no FC ports on the local Storwize V7000 nodes can perform local intracluster heartbeat communication. This situation can, in turn, result in the nodes experiencing lease expiry events.
In a lease expiry event, a node restarts to attempt to reestablish communication with the other nodes in the clustered system. If the leases for all nodes expire simultaneously, a loss of host access to volumes can occur during the restart events.
For more information about zoning best practices, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521.
3.6.7 Port designation recommendations
The intracluster communication is used for mirroring write cache and exchanging metadata between nodes, and is critical to the stable operation of the cluster. It is possible to upgrade nodes beyond the standard four-FC-port configuration. Such an upgrade provides an opportunity to dedicate ports to local node traffic, separating them from other cluster traffic on the remaining ports. This configuration provides a level of protection against malfunctioning devices and workload spikes that might otherwise impact the intracluster traffic.
Additionally, there is a benefit in isolating remote replication traffic to dedicated ports, and ensuring that any problems that affect the cluster-to-cluster interconnect do not impact all ports on the local cluster.
IBM suggests the port designations for isolating both port-to-local and port-to-remote canister traffic for Storwize V7000 Gen2 and Gen2+ canisters as shown in Figure 3-6.
Figure 3-6 Port masking configuration on Storwize V7000
 
Note: With 12 or more ports per node, dedicate four ports to node-to-node traffic. Doing so is especially important when high write data rates are expected, because all writes are mirrored between I/O Group nodes over these ports.
The port designation patterns shown in the tables provide the required traffic isolation and simplify migrations to configurations with a greater number of ports. More complicated port mapping configurations that spread the port traffic across the adapters are supported and can be considered. However, these approaches do not appreciably increase the availability of the solution.
Alternative port mappings that spread traffic across HBAs might allow adapters to come back online after a failure. However, they do not prevent a node from going offline temporarily to restart, isolate the failed adapter, and then rejoin the cluster. Also, the mean time between failures (MTBF) of the adapter is not significantly shorter than that of the non-redundant node components. The presented approach takes all of these considerations into account with the view that increased complexity can lead to migration challenges in the future, and a simpler approach is usually better.
3.6.8 Port masking
You can use a port mask to control the node target ports that a host can access. Using local FC port masking, you can set which ports can be used for node-to-node/intracluster communication. Using remote FC port masking, you can set which ports can be used for replication communication. Port masking, combined with zoning, enables you to dedicate ports to a particular type of traffic. Setting up Fibre Channel port masks is particularly useful when you have more than four Fibre Channel ports on any node in the system, because it saves setting up many SAN zones.
There are two Fibre Channel port masks on a system. The local port mask controls connectivity to other nodes in the same system, and the partner port mask controls connectivity to nodes in remote, partnered systems. By default, all ports are enabled for both local and partner connectivity.
The port masks apply to all nodes on a system; a different port mask cannot be set on nodes in the same system. You do not have to have the same port mask on partnered systems.
Mixed traffic of host, back-end, intracluster, and replication I/O can cause congestion and buffer-to-buffer credit exhaustion, which can result in heavy degradation of performance in your storage environment.
Fibre Channel IO ports are logical ports, which can exist on Fibre Channel platform ports or on FCoE platform ports.
The port mask is a 64-bit field that applies to all nodes in the cluster. In the local FC port masking, you can set a port to be dedicated to node-to-node/intracluster traffic by setting a 1 for that port. Remote FC port masking allows you to set which ports can be used for replication traffic by setting a 1 for that port. If a port has a 0 in the specific mask, no traffic of that type is allowed. Therefore, in the local FC port mask, a 0 means that no node-to-node traffic happens on that port, and a 0 in the remote FC port mask means that no replication traffic happens on that port. Therefore, if a port has a 0 in both the local and remote FC port masks, only host and back-end storage traffic is allowed on it.
Setting port mask by using the CLI and GUI
The command to apply a local FC port mask by using the CLI is chsystem -localfcportmask mask. The command to apply a remote FC port mask is chsystem -partnerfcportmask mask.
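For example, the following hedged sketch dedicates ports 3 and 4 to node-to-node traffic and ports 5 and 6 to replication traffic. The mask values are hypothetical, assume an 8-port node with the rightmost bit representing port 1, and assume that shorter mask strings are zero-extended to 64 bits (verify this behavior in the CLI reference for your code level):
chsystem -localfcportmask 00001100
chsystem -partnerfcportmask 00110000
Combine these masks with matching zoning so that the dedicated ports carry only the intended traffic.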
If you are using the GUI, click Settings → Network → Fibre Channel Ports. Then, you can select the use of a port. Setting none means that no node-to-node and no replication traffic is allowed, and only host and storage traffic is allowed. Setting local means that only node-to-node traffic is allowed, and remote means that only replication traffic is allowed. Figure 3-7 shows an example of setting a port mask on port 1 to “Local”.
Figure 3-7 Fibre Channel port mask setting from GUI
3.7 iSCSI configuration planning
Each Storwize V7000 node is equipped with up to three onboard Ethernet network interface cards (NICs), which can operate at a link speed of 10 Mbps, 100 Mbps, or 1000 Mbps. All NICs can be used to carry iSCSI traffic. For optimal performance, use 1 Gbps links between the Storwize V7000 and iSCSI-attached hosts when the Storwize V7000 node’s onboard NICs are used.
An optional 10 Gbps 2-port Ethernet adapter is available. One such adapter can be installed in each node and can be used to add 10 Gb iSCSI/FCoE connectivity to the Storwize V7000.
Figure 3-8 shows an overview of the iSCSI implementation in the Storwize V7000.
Figure 3-8 Storwize V7000 iSCSI overview
Both onboard Ethernet ports of a Storwize V7000 node can be configured for iSCSI. For each instance of an iSCSI target node (that is, each Storwize V7000 node), you can define two IPv4 and two IPv6 addresses or iSCSI network portals:
If the optional 10 Gbps Ethernet feature is installed, you can use its ports for iSCSI traffic.
Generally, enable jumbo frames in your iSCSI storage network.
iSCSI IP addresses can be configured for one or more nodes.
Decide whether you implement authentication for the host to Storwize V7000 iSCSI communication. The Storwize V7000 supports the Challenge Handshake Authentication Protocol (CHAP) authentication methods for iSCSI.
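The iSCSI IP addresses are assigned per node Ethernet port. As a minimal, hedged sketch (the address values are examples only, and the parameter syntax should be verified against the CLI reference for your code level), an iSCSI portal might be configured on Ethernet port 2 of node 1 as follows:
cfgportip -node 1 -ip 10.11.12.131 -mask 255.255.255.0 -gw 10.11.12.1 2
The trailing 2 identifies the Ethernet port, and the address belongs to the subnet or VLAN that is planned for iSCSI host traffic.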
3.7.1 iSCSI protocol
The iSCSI connectivity is a software feature that is provided by the Storwize V7000 code. The iSCSI protocol is a block-level protocol that encapsulates SCSI commands into Transmission Control Protocol/Internet Protocol (TCP/IP) packets. Therefore, iSCSI uses IP network rather than requiring the Fibre Channel infrastructure. The iSCSI standard is defined by Request For Comments (RFC) 3720, which is available at:
An introduction to the workings of iSCSI protocol can be found in iSCSI Implementation and Best Practices on IBM Storwize Storage Systems, SG24-8327.
3.7.2 Topology and IP addressing
See 3.5, “Planning IP connectivity” on page 51 for examples of topology and addressing schemes that can be used for iSCSI connectivity.
If you plan to use the node’s 1 Gbps Ethernet ports for iSCSI host attachment, dedicate Ethernet port 1 to Storwize V7000 management and port 2 to iSCSI use. This way, port 2 can be connected to a separate network segment or virtual local area network (VLAN) dedicated to iSCSI traffic.
 
Note: Ethernet link aggregation (port trunking) or channel bonding for the Storwize V7000 nodes’ Ethernet ports is not supported for the 1 Gbps ports.
3.7.3 General recommendation
This section covers general preferences related to iSCSI.
Planning for host attachments
An iSCSI client, which is known as an iSCSI initiator, sends SCSI commands over an IP network to an iSCSI target. A single iSCSI initiator or iSCSI target is called an iSCSI node.
You can use the following types of iSCSI initiators in host systems:
Software initiator: Available for most operating systems (OS), including AIX, Linux, and Windows.
Hardware initiator: Implemented as a network adapter with an integrated iSCSI processing unit, which is also known as an iSCSI HBA.
Make sure that iSCSI initiators, targets, or both that you plan to use are supported. Use the following sites for reference:
IBM Storwize V7000 V8.1 Support page:
IBM Knowledge Center for IBM Storwize V7000:
IBM System Storage Interoperation Center (SSIC)
iSCSI qualified name
A Storwize V7000 cluster can provide up to eight iSCSI targets, one per node. Each Storwize V7000 node has its own iSCSI qualified name (IQN), which, by default, is in the following form:
iqn.1986-03.com.ibm:2145.<clustername>.<nodename>
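For example, with a hypothetical cluster name itsov7k and node name node1, the resulting IQN would be:
iqn.1986-03.com.ibm:2145.itsov7k.node1
The names shown here are examples only; verify the exact IQN of each node in the node details shown by the GUI or the CLI before configuring host initiators.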
An alias string can also be associated with an iSCSI node. The alias enables an organization to associate a string with the iSCSI name. However, the alias string is not a substitute for the iSCSI name.
 
Note: The cluster name and node name form part of the IQN. Changing any of them might require reconfiguration of all iSCSI nodes that communicate with the Storwize V7000.
3.7.4 iSCSI back-end storage attachment
IBM Spectrum Virtualize V7.7 introduced support for external storage controllers that are attached through iSCSI.
For more information about back-end storage supported for iSCSI connectivity, see these websites:
IBM Support Information for Storwize V7000
IBM System Storage Interoperation Center (SSIC)
3.8 Back-end storage subsystem configuration
Back-end storage subsystem configuration must be planned for all storage controllers that are attached to the Storwize V7000.
For more information about supported storage subsystems, see these websites:
IBM Support Information for Storwize V7000:
IBM System Storage Interoperation Center (SSIC):
Apply the following general guidelines for back-end storage subsystem configuration planning:
In the SAN, storage controllers that are used by the Storwize V7000 clustered system must be connected through SAN switches. Direct connection between the Storwize V7000 and the storage controller is not supported.
Enhanced Stretched Cluster configurations have additional requirements and configuration guidelines. For more information about performance and preferred practices for the Storwize V7000, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521.
 
MDisks within storage pools: Software versions 6.1 and later provide for better load distribution across paths within storage pools.
In previous code levels, the path to MDisk assignment was made in a round-robin fashion across all MDisks that are configured to the clustered system. With that method, no attention is paid to how MDisks within storage pools are distributed across paths. Therefore, it was possible and even likely that certain paths were more heavily loaded than others.
Starting with software version 6.1, the code contains logic that takes into account which MDisks are provided by which back-end storage systems. Therefore, the code more effectively distributes active paths based on the storage controller ports that are available.
The Detect MDisks action (the detectmdisk CLI command) must be run following the creation or modification (addition or removal of MDisks) of storage pools for the paths to be redistributed.
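As a brief, hedged sketch, after presenting new LUNs from the back-end storage you can rescan and review the discovered MDisks as follows (the filter value is illustrative):
detectmdisk
lsmdisk -filtervalue mode=unmanaged
The first command rescans the Fibre Channel network for new MDisks and rebalances MDisk access across the available controller ports; the second lists the MDisks that are not yet assigned to any storage pool.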
If your back-end storage system does not support the Storwize V7000 round-robin algorithm, ensure that the number of MDisks per storage pool is a multiple of the number of storage ports that are available. This approach ensures sufficient bandwidth for the storage controller, and an even balance across storage controller ports.
In general, configure disk subsystems as though Storwize V7000 was not used. However, there might be specific requirements or limitations as to the features usable in the given back-end storage system when it is attached to Storwize V7000. Review the appropriate section of documentation to verify that your back-end storage is supported and to check for any special requirements:
Observe these general rules:
Disk drives:
 – Exercise caution with use of large hard disk drives so that you do not have too few spindles to handle the load.
Array sizes:
 – Storwize V7000 does not queue more than 60 I/O operations per MDisk. Therefore, make sure that the MDisks presented to the Storwize V7000 can handle about this many requests, which corresponds to about 8 HDDs. If your array can handle a higher load, split it into several LUNs of equal size to better match the back-end storage capabilities with the load that the Storwize V7000 can generate.
See IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521, for an in-depth discussion of back-end storage LUN presentation to Storwize V7000.
 – Since version 7.3, the system uses autobalancing to restripe volume extents evenly across all MDisks in the storage pools.
 – The cluster can be connected to a maximum of 1024 WWNNs. The general practice is that:
 • EMC DMX/SYMM, all HDS, and SUN/HP HDS clones use one WWNN per port. Each port appears as a separate controller to the Storwize V7000.
 • IBM, EMC CLARiiON, and HP use one WWNN per subsystem. Each port appears as a part of a subsystem with multiple ports, up to a maximum of 16 ports (WWPNs) per WWNN.
However, if you plan a configuration that might be limited by the WWNN maximum, verify the WWNN versus WWPN policy with the back-end storage vendor.
3.9 Internal storage configuration
For general-purpose storage pools with various I/O applications, use the storage configuration wizard in the GUI. For specific applications with known I/O patterns, use the CLI to create arrays that suit your needs.
Use DRAID 6 as the default RAID level for internal arrays. DRAID 6 outperforms RAID 5. Workloads that used to be placed on RAID 10 arrays are now usually placed on SSD drives.
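As a minimal sketch only (the drive class, drive count, stripe width, and pool name are hypothetical, and the available parameters depend on your code level and hardware), a DRAID 6 array might be created from the CLI as follows:
mkdistributedarray -level raid6 -driveclass 0 -drivecount 24 -stripewidth 12 Pool0
This command creates a distributed RAID 6 array from 24 drives of drive class 0 and adds it to the storage pool Pool0. Verify the supported stripe width and rebuild area values for your drive count in the product documentation.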
For in-depth discussion of internal storage configuration, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521.
3.10 Storage pool configuration
The storage pool is at the center of the many-to-many relationship between the MDisks and the volumes. It acts as a container of physical disk capacity from which chunks of MDisk space, known as extents, are allocated to form volumes presented to hosts.
MDisks in the Storwize V7000 are LUNs that are assigned from the back-end storage subsystems to the Storwize V7000. There are two classes of MDisks: Managed and unmanaged. An unmanaged MDisk is a LUN that is presented to the Storwize V7000 by back-end storage, but is not assigned to any storage pool. A managed MDisk is an MDisk that is assigned to a storage pool. An MDisk can be assigned only to a single storage pool.
A Storwize V7000 clustered system must have exclusive access to every LUN (MDisk) that it uses. Any specific LUN cannot be presented to more than one Storwize V7000 cluster. Also, presenting the same LUN to a Storwize V7000 cluster and a host is not allowed.
One of the basic storage pool parameters is the extent size. All MDisks in the storage pool have the same extent size, and all volumes that are allocated from the storage pool inherit its extent size.
There are two implications of a storage pool extent size:
Maximum volume, MDisk, and managed storage capacity depend on the extent size (see http://www.ibm.com/support/docview.wss?uid=ssg1S1010644). The bigger the extent size defined for the specific pool, the larger the maximum size of this pool, the maximum MDisk size in the pool, and the maximum size of a volume created in the pool.
Volume sizes must be a multiple of the extent size of the pool in which the volume is defined. Therefore, the smaller the extent size, the better the control over volume size.
The Storwize V7000 supports extent sizes from 16 mebibytes (MiB) to 8192 MiB. The extent size is a property of the storage pool and is set when the storage pool is created.
The extent size of a storage pool cannot be changed. If you need to change extent size, the storage pool must be deleted and a new storage pool configured.
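The extent size is specified when the pool is created. As a small, hedged sketch (the pool name is hypothetical), a pool with a 256 MiB extent size can be created from the CLI as follows:
mkmdiskgrp -name Pool0 -ext 256
All volumes that are later created in Pool0 use the 256 MiB extent size.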
Table 3-2 lists all of the available extent sizes in a Storwize V7000 and the maximum managed storage capacity for each extent size.
Table 3-2 Extent size and total storage capacities per system
Extent size (MiB)   Total storage capacity manageable per system
16                  64 tebibytes (TiB)
32                  128 TiB
64                  256 TiB
128                 512 TiB
256                 1 pebibyte (PiB)
512                 2 PiB
1024                4 PiB
2048                8 PiB
4096                16 PiB
8192                32 PiB
When planning storage pool layout, consider the following aspects:
Pool extent size:
 – Generally, use 128 MiB or 256 MiB. The Storage Performance Council (SPC) benchmarks use a 256 MiB extent.
 – Pick the extent size and then use that size for all storage pools.
 – You cannot migrate volumes between storage pools with different extent sizes. However, you can use volume mirroring to create copies between storage pools with different extent sizes.
Storage pool reliability, availability, and serviceability (RAS) considerations:
 – The number and size of storage pools affect system availability. Using a larger number of smaller pools reduces the failure domain if one of the pools goes offline. However, an increased number of storage pools introduces management overhead, impacts storage space use efficiency, and is subject to the configuration maximum limit.
 – An alternative approach is to create a few large storage pools. All MDisks that constitute each of the pools should have the same performance characteristics.
 – The storage pool goes offline if an MDisk is unavailable, even if the MDisk has no data on it. Do not put MDisks into a storage pool until they are needed.
 – Put image mode volumes in a dedicated storage pool or pools.
Storage pool performance considerations:
 – It might make sense to create multiple storage pools if you are attempting to isolate workloads to separate disk drives.
 – Create storage pools out of MDisks with similar performance. This technique is the only way to ensure consistent performance characteristics of volumes created from the pool.
The above rule does not apply when you consciously place MDisks from different storage tiers in the pool with the intent to use Easy Tier to dynamically manage workload placement on drives with appropriate performance characteristics.
3.10.1 The storage pool and Storwize V7000 cache relationship
The Storwize V7000 uses cache partitioning to limit the potential negative effects that a poorly performing storage controller can have on the clustered system. The cache partition allocation size is based on the number of configured storage pools. This design protects against an individual overloaded back-end storage system filling the system write cache and degrading the performance of the other storage pools. For more information, see Chapter 2, “System overview” on page 11.
Table 3-3 shows the limit of the write-cache data that can be used by a single storage pool.
Table 3-3 Limit of the cache data
Number of storage pools   Upper limit
1                         100%
2                         66%
3                         40%
4                         30%
5 or more                 25%
No single partition can occupy more than its upper limit of write cache capacity. When the maximum cache size is allocated to the pool, the Storwize V7000 starts to limit incoming write I/Os for volumes that are created from the storage pool. That is, the host writes are limited to the destage rate, on a one-out-one-in basis.
Only writes that target the affected storage pool are limited. The read I/O requests for the throttled pool continue to be serviced normally. However, because the Storwize V7000 is destaging data at a maximum rate that the back-end storage can sustain, read response times are expected to be affected.
All I/O that is destined for other (non-throttled) storage pools continues as normal.
3.11 Volume configuration
When planning a volume, consider the required performance, availability, and cost of storage backing that volume. Volume characteristics are defined by the storage pool in which it is created.
Every volume is assigned to an I/O Group that defines which pair of Storwize V7000 nodes will service I/O requests to the volume.
 
Important: No fixed relationship exists between I/O Groups and storage pools.
Strive to distribute volumes evenly across available I/O Groups and nodes within the clustered system. Although volume characteristics depend on the storage pool from which it is created, any volume can be assigned to any node.
When you create a volume, it is associated with one node of an I/O Group, the preferred access node. By default, when you create a volume, it is associated with the I/O Group node by using a round-robin algorithm. However, you can manually specify the preferred access node if needed.
No matter how many paths are defined between the host and the volume, all I/O traffic is serviced by only one node (the preferred access node).
If you plan to use volume mirroring, for maximum availability put each copy in a different storage pool backed by different back-end storage subsystems. However, depending on your needs it might be sufficient to use a different set of physical drives, a different storage controller, or a different back-end storage for each volume copy. Strive to place all volume copies in storage pools with similar performance characteristics. Otherwise, the volume performance as perceived by the host might be limited by the performance of the slowest storage pool.
3.11.1 Planning for image mode volumes
Use image mode volumes to present to hosts data that was written to the back-end storage before it was virtualized. An image mode volume directly corresponds to the MDisk from which it is created. Therefore, volume logical block address (LBA) x = MDisk LBA x. The capacity of an image mode volume is equal to the capacity of the MDisk from which it is created.
Image mode volumes are an extremely useful tool for storage migration and for introducing the Storwize V7000 into an existing storage environment.
3.11.2 Planning for thin-provisioned volumes
A thin-provisioned volume has a virtual capacity and a real capacity. Virtual capacity is the volume storage capacity that a host sees as available. Real capacity is the actual storage capacity that is allocated to a volume copy from a storage pool. Real capacity limits the amount of data that can be written to a thin-provisioned volume.
When planning use of thin-provisioned volumes, consider expected usage patterns for the volume. In particular, the actual size of the data and the rate of data change.
Thin-provisioned volumes require more I/Os because of directory accesses. For fully random access, and a workload with 70% reads and 30% writes, a thin-provisioned volume requires approximately one directory I/O for every user I/O. Additionally, thin-provisioned volumes require more processor resources, so the performance per I/O Group can also be reduced.
However, the directory is two-way write-back-cached (as with the Storwize V7000 fastwrite cache), so certain applications perform better.
Additionally, the ability to thin-provision volumes can be a worthwhile tool that allows hosts to see storage space significantly larger than what is actually allocated within the storage pool. Thin provisioning can also simplify storage allocation management. You can define the virtual capacity of a thin-provisioned volume to an application based on future requirements, but allocate real storage based on today’s use.
Two types of thin-provisioned volumes are available:
Autoexpand volumes allocate real capacity from a storage pool on demand, minimizing required user intervention. However, a malfunctioning application can cause a volume to expand until its real capacity is equal to the virtual capacity, which potentially can starve other thin provisioned volumes in the pool.
Non-autoexpand volumes have a fixed amount of assigned real capacity. In this case, the user must monitor the volume and assign more capacity when required. Although this approach prevents starving other thin-provisioned volumes, it introduces a risk of an unplanned outage: a thin-provisioned volume goes offline if the host tries to write more data than fits into the allocated real capacity.
The main risk that is associated with using thin-provisioned volumes is running out of real capacity in the storage volumes, pool, or both, and the resulting unplanned outage. Therefore, strict monitoring of the used capacity on all non-autoexpand volumes and monitoring of the free space in the storage pool is required.
When you configure a thin-provisioned volume, you can define a warning level attribute to generate a warning event when the used real capacity exceeds a specified amount or percentage of the total virtual capacity. You can also use the warning event to trigger other actions, such as taking low-priority applications offline or migrating data into other storage pools.
If a thin-provisioned volume does not have enough real capacity for a write operation, the volume is taken offline and an error is logged (error code 1865, event ID 060001). Access to the thin-provisioned volume is restored by increasing the real capacity of the volume, which might require increasing the size of the storage pool from which it is allocated. Until this time, the data is held in the Storwize V7000 cache. Although in principle this situation is not a data integrity or data loss issue, you must not rely on the Storwize V7000 cache as a backup storage mechanism.
Space is not allocated on a thin-provisioned volume if an incoming host write operation contains all zeros.
 
Important: Set and monitor a warning level on the used capacity so that you have adequate time to respond and provision more physical capacity.
Warnings must not be ignored by an administrator.
Consider using the autoexpand feature of the thin-provisioned volumes to reduce human intervention required to maintain access to thin-provisioned volumes.
When you create a thin-provisioned volume, you can choose the grain size for allocating space in 32 kibibytes (KiB), 64 KiB, 128 KiB, or 256 KiB chunks. The grain size that you select affects the maximum virtual capacity for the thin-provisioned volume. The default grain size is 256 KiB, which is the preferred option. If you select 32 KiB for the grain size, the volume size cannot exceed 260,000 gibibytes (GiB). The grain size cannot be changed after the thin-provisioned volume is created.
Generally, smaller grain sizes save space, but require more metadata access, which can adversely affect performance. If you are not going to use the thin-provisioned volume as a FlashCopy source or target volume, use 256 KiB to maximize performance. If you are going to use the thin-provisioned volume as a FlashCopy source or target volume, specify the same grain size for the volume and for the FlashCopy function. In this situation, the grain size should ideally also match the typical I/O size from the host.
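As an illustration, the following command sketch shows how a thin-provisioned volume with autoexpand, a capacity warning threshold, and an explicit grain size might be created. The pool name, volume name, and sizes are hypothetical; verify the exact mkvdisk parameters against the CLI reference for your software level.
mkvdisk -mdiskgrp Pool0 -iogrp 0 -size 2 -unit tb -rsize 10% -autoexpand -warning 80% -grainsize 256 -name AppVol01
In this sketch, the volume presents 2 TB of virtual capacity, starts with 10% real capacity that expands automatically, raises a warning event at 80% used capacity, and uses a 256 KiB grain size.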
A thin-provisioned volume feature that is called zero detect provides clients with the ability to reclaim unused allocated disk space (zeros) when they are converting a fully allocated volume to a thin-provisioned volume by using volume mirroring.
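The conversion itself is typically performed by adding a thin-provisioned copy to the fully allocated volume and removing the original copy after the two copies are synchronized. The following is a minimal sketch with hypothetical names; copy 0 is assumed to be the original fully allocated copy, and synchronization progress can be checked with lsvdisksyncprogress.
addvdiskcopy -mdiskgrp Pool0 -rsize 2% -autoexpand AppVol01
rmvdiskcopy -copy 0 AppVol01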
3.12 Host attachment planning
The typical FC host attachment to the Storwize V7000 is through a SAN fabric. However, the system also allows direct attachment between its 8 Gb or 16 Gb Fibre Channel ports and host ports. No special configuration is required for host systems that use direct attachment. However, the maximum number of directly attached hosts is severely limited by the number of FC ports on the Storwize V7000 nodes.
The Storwize V7000 imposes no particular limit on the actual distance between the Storwize V7000 nodes and host servers. However, for host attachment, the Storwize V7000 supports up to three ISL hops in the fabric. This capability means that the server and the Storwize V7000 can be separated by up to five FC links, four of which can be up to 10 km (6.2 miles) long if longwave Small Form-factor Pluggable (SFP) transceivers are used.
Figure 3-9 shows an example of a supported configuration with Storwize V7000 nodes using shortwave SFPs.
Figure 3-9 Example of host connectivity
In Figure 3-9, the optical distance between Storwize V7000 Node 1 and Host 2 is slightly over 40 km (24.85 miles).
To avoid latencies that lead to degraded performance, avoid ISL hops whenever possible. In an optimal setup, the servers connect to the same SAN switch as the Storwize V7000 nodes.
 
Note: Before attaching host systems to Storwize V7000, review the Configuration Limits and Restrictions for the IBM System Storage Storwize V7000 at:
3.12.1 Queue depth
Typically, hosts issue new I/O requests to storage systems without waiting for completion of previous ones. The number of outstanding requests is called the queue depth. Sending multiple I/O requests in parallel (asynchronous I/O) provides significant performance benefits compared to sending them one at a time (synchronous I/O). However, if the number of queued requests exceeds the maximum that is supported by the storage controller, you experience performance degradation.
Therefore, for large storage networks, plan to set the correct SCSI command queue depth on your hosts. For this purpose, a large storage network is defined as one that contains at least 1000 volume mappings. For example, a deployment of 50 hosts with 20 volumes mapped to each of them is considered a large storage network. For details of the queue depth calculations, see this website:
3.12.2 Offloaded data transfer
If your Windows hosts are configured to use Microsoft Offloaded Data Transfer (ODX) to offload copy workload to the storage controller, weigh the benefits of this technology against the additional load on the storage controllers. Both the benefits and the impact of enabling ODX are especially prominent in Microsoft Hyper-V environments.
3.13 Host mapping and LUN masking
Host mapping is similar in concept to LUN mapping or masking. LUN mapping is the process of controlling which hosts have access to specific LUs within the disk controllers. LUN mapping is typically done at the storage system level; host mapping is the equivalent process performed at the Storwize V7000 level.
LUN masking is usually implemented in the device driver software on each host. The host has visibility of more LUNs than it is intended to use. The device driver software masks the LUNs that are not to be used by this host. After the masking is complete, only some disks are visible to the operating system. The system can support this type of configuration by mapping all volumes to every host object and by using operating system-specific LUN masking technology. However, the default, and preferred, system behavior is to map only those volumes that the host is required to access.
The act of mapping a volume to a host makes the volume accessible to the WWPNs or iSCSI names such as iSCSI qualified names (IQNs) or extended-unique identifiers (EUIs) that are configured in the host object.
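For example, mapping a single volume to a defined host object can be done with a command similar to the following sketch (the host and volume names are hypothetical):
mkvdiskhostmap -host AppHost01 -scsi 0 AppVol01
Only hosts that have such a mapping defined can access the volume, which corresponds to the default and preferred system behavior described above.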
3.13.1 Planning for large deployments
Each I/O Group can have up to 512 host objects defined. This limit is the same whether hosts are attached by using FC, iSCSI, or a combination of both. To allow more than 512 hosts to access the storage, you must divide them into groups of 512 hosts or less, and map each group to single I/O Group only. This approach allows you to configure up to 2048 host objects on a system with four I/O Groups (eight nodes).
For best performance, split each host group into two sets. For each set, configure the preferred access node for volumes presented to the host set to one of the I/O Group nodes. This approach helps to evenly distribute load between the I/O Group nodes.
Note that a volume can be mapped only to a host that is associated with the I/O Group to which the volume belongs.
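To keep a host associated with a single I/O Group, you can create the host object with the I/O Group restriction already in place. The following is a hedged sketch with a hypothetical host name and WWPN:
mkhost -name AppHost01 -fcwwpn 2100000E1E30ACFC -iogrp 0
Volumes that are mapped to this host must then belong to I/O Group 0.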
3.14 NPIV planning
For more information, see N-Port Virtualization ID (NPIV) Support in Chapter 8, “Hosts” on page 317.
3.15 Advanced Copy Services
The Storwize V7000 offers the following Advanced Copy Services:
FlashCopy
Metro Mirror
Global Mirror
 
Layers: A clustered-system property called layer is used when a copy services partnership exists between an IBM SAN Volume Controller and an IBM Storwize V7000. There are two layers: replication and storage. By default, the IBM Storwize V7000 is configured as the storage layer. This configuration must be changed by using the chsystem CLI command before the Storwize V7000 is used in a copy services partnership with a SAN Volume Controller.
Figure 3-10 shows an example of replication and storage layers.
Figure 3-10 Replication and storage layer
The system’s layer configuration cannot be changed if there are defined host objects with WWPNs, so ensure that the system layer is correctly planned and configured early during the deployment.
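If the layer must be changed, for example before creating a partnership with a SAN Volume Controller, you can do so with the chsystem command, provided that no host objects with WWPNs are defined yet. A minimal sketch follows; the current layer can typically be verified in the lssystem output.
chsystem -layer replication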
3.15.1 FlashCopy guidelines
When planning to use FlashCopy, observe the following guidelines:
Identify each application that must have a FlashCopy function implemented for its volume.
Identify storage pool or pools that will be used by FlashCopy volumes.
Define which volumes need to use FlashCopy.
For each volume define which FlashCopy type best fits your requirements:
 – No copy.
 – Full copy.
 – Thin-Provisioned.
 – Incremental.
Define how many copies you need and the lifetime of each copy.
Estimate the expected data change rate for FlashCopy types other than full copy.
Consider memory allocation for copy services. If you plan to define multiple FlashCopy relationships, you might need to modify the default memory setting. See 11.2.19, “Memory allocation for FlashCopy” on page 502.
Define the grain size that you want to use. When data is copied between volumes, it is copied in units of address space known as grains. The grain size is 64 KB or 256 KB. The FlashCopy bitmap contains one bit for each grain. The bit records whether the associated grain has been split by copying the grain from the source to the target. Larger grain sizes can cause a longer FlashCopy time and a higher space usage in the FlashCopy target volume. The data structure and the source data location can modify those effects.
If the grain is larger than most host writes, this can lead to write amplification on the target system. This increase occurs because for every write I/O to an unsplit grain, the whole grain must be read from the FlashCopy source and copied to the target. Such a situation could result in performance degradation.
If using a thin-provisioned volume in a FlashCopy map, for best performance use the same grain size as the map grain size. Additionally, if using a thin-provisioned volume directly with a host system, use a grain size that more closely matches the host I/O size.
Define which FlashCopy rate best fits your requirement in terms of the storage performance and the amount of time required to complete the FlashCopy. Table 3-4 shows the relationship of the background copy rate value to the number of grain split attempts per second.
For performance-sensitive configurations, test the performance observed for different settings of grain size and FlashCopy rate in your actual environment before committing a solution to production use. See Table 3-4 for some baseline data.
Table 3-4 Grain splits per second

User percentage    Data copied per second    256 KiB grain splits per second    64 KiB grain splits per second
1 - 10             128 KiB                   0.5                                2
11 - 20            256 KiB                   1                                  4
21 - 30            512 KiB                   2                                  8
31 - 40            1 MiB                     4                                  16
41 - 50            2 MiB                     8                                  32
51 - 60            4 MiB                     16                                 64
61 - 70            8 MiB                     32                                 128
71 - 80            16 MiB                    64                                 256
81 - 90            32 MiB                    128                                512
91 - 100           64 MiB                    256                                1024
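The grain size and background copy rate that are discussed above are specified when the FlashCopy mapping is defined. The following sketch uses hypothetical volume and mapping names; a copy rate of 50 corresponds to the 2 MiB per second row in Table 3-4.
mkfcmap -source AppVol01 -target AppVol01_fc -copyrate 50 -grainsize 64 -name fcmap01
If the initial choice proves too aggressive or too slow, the copy rate can be changed later with the chfcmap command.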
3.15.2 Combining FlashCopy and Metro Mirror or Global Mirror
Use of FlashCopy in combination with Metro Mirror or Global Mirror is allowed if the following conditions are fulfilled:
A FlashCopy mapping must be in the idle_copied state when its target volume is the secondary volume of a Metro Mirror or Global Mirror relationship.
A FlashCopy mapping cannot be manipulated to change the contents of the target volume of that mapping when the target volume is the primary volume of a Metro Mirror or Global Mirror relationship that is actively mirroring.
The I/O group for the FlashCopy mappings must be the same as the I/O group for the FlashCopy target volume.
3.15.3 Planning for Metro Mirror and Global Mirror
Metro Mirror is a copy service that provides a continuous, synchronous mirror of one volume to a second volume. The systems can be up to 300 kilometers apart. Because the mirror is updated synchronously, no data is lost if the primary system becomes unavailable. Metro Mirror is typically used for disaster-recovery purposes, where it is important to avoid any data loss.
Global Mirror is a copy service that is similar to Metro Mirror, but copies data asynchronously. You do not have to wait for the write to the secondary system to complete. For long distances, performance is improved compared to Metro Mirror. However, if a failure occurs, you might lose data.
Global Mirror uses one of two methods to replicate data. Multicycling Global Mirror is designed to replicate data while adjusting for bandwidth constraints. It is appropriate for environments where it is acceptable to lose a few minutes of data if a failure occurs. For environments with higher bandwidth, non-cycling Global Mirror can be used so that less than a second of data is lost if a failure occurs. Global Mirror also works well when sites are more than 300 kilometers away.
When Storwize V7000 copy services are used, all components in the SAN must sustain the workload that is generated by application hosts and the data replication workload. Otherwise, the system can automatically stop copy services relationships to protect your application hosts from increased response times.
Starting with software version 7.6, you can use the chsystem command to set the maximum replication delay for the system. This value ensures that a single slow write operation does not affect the entire primary site.
You can configure this delay for all relationships or consistency groups that exist on the system by using the maxreplicationdelay parameter on the chsystem command. This value indicates the amount of time (in seconds) that a host write operation can be outstanding before replication is stopped for a relationship on the system. If the system detects a delay in replication on a particular relationship or consistency group, only that relationship or consistency group is stopped.
In systems with many relationships, a single slow relationship can cause delays for the remaining relationships on the system. This setting isolates the potential relationship with delays so that you can investigate the cause of these issues. When the maximum replication delay is reached, the system generates an error message that identifies the relationship that exceeded the maximum replication delay.
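For example, to stop replication for a relationship whose host writes remain outstanding for more than 30 seconds, a setting similar to the following might be used. The value is illustrative only; a value of 0 typically disables the feature.
chsystem -maxreplicationdelay 30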
To avoid such incidents, consider deploying a SAN performance monitoring tool, such as IBM Tivoli Storage Productivity Center, to continuously monitor the SAN components for error conditions and performance problems. Use of such a tool helps you detect potential issues before they affect your environment.
When planning for use of the data replication services, plan for the following aspects of the solution:
Volumes and consistency groups for copy services
Copy services topology
Choice between Metro Mirror and Global Mirror
Connection type between clusters (FC, FCoE, IP)
Cluster configuration for copy services, including zoning
IBM explicitly tests products for interoperability with the Storwize V7000. For more information about the current list of supported devices, see the IBM System Storage Interoperation Center (SSIC) website:
Volumes and consistency groups
Determine whether volumes can be replicated independently. Some applications use multiple volumes and require the order of writes to these volumes to be preserved at the remote site. Notable examples of such applications are databases.
If an application requires the write order to be preserved for the set of volumes that it uses, create a consistency group for these volumes.
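As an illustration, the following sketch creates a consistency group for a database and defines a new relationship inside it. All object names are hypothetical, and RemoteSys stands for the partner system; adding the -global parameter would create a Global Mirror relationship instead of Metro Mirror.
mkrcconsistgrp -name DBGroup -cluster RemoteSys
mkrcrelationship -master DBVol01 -aux DBVol01_r -cluster RemoteSys -consistgrp DBGroup -name DBRel01
All relationships in the group are then started, stopped, and switched together, which preserves the write order across the whole set of volumes.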
Copy services topology
One or more clusters can participate in a copy services relationship. One typical and simple use case is disaster recovery, where one site is active and another performs only a disaster recovery function. In such a case, the solution topology is simple, with one cluster per site and uniform replication direction for all volumes. However, there are multiple other topologies possible, allowing you to design a solution that optimally fits your set of requirements. For examples of valid relationships between systems, see this website:
Global Mirror versus Metro Mirror
Decide which type of copy service you are going to use. This decision should be driven by your requirements. Metro Mirror allows you to prevent any data loss during a system failure, but it has more stringent requirements, especially regarding intercluster link bandwidth and latency, and remote site storage performance. Additionally, it incurs a performance penalty because writes are not confirmed to the host until data reception is confirmed by the remote site.
Because of finite data transfer speeds, this remote write penalty grows with the distance between the sites. A point-to-point dark fiber-based link typically incurs a round-trip latency of 1 ms per 100 km (62.13 miles). Other technologies provide longer round-trip latencies. Inter-site link latency defines the maximum possible distance for any performance level.
Global Mirror allows you to relax the constraints on these requirements at the cost of asynchronous replication, which allows the remote site to lag behind the local site. The choice of replication type has a major impact on all other aspects of copy services planning.
The use of Global Mirror and Metro Mirror between the same two clustered systems is supported.
If you plan to use copy services to realize some application function (for example, disaster recovery orchestration software), review the requirements of the application you plan to use. Verify that the complete solution is going to fulfill supportability criteria of both IBM and the application vendor.
Intercluster link
The local and remote clusters can be connected by an FC, FCoE, or IP network. The IP network can be used as a carrier for an FCIP solution or as a native data carrier.
Each of the technologies has its own requirements concerning supported distance, link speeds, bandwidth, and vulnerability to frame or packet loss. For the most current information regarding requirements and limitations of each of supported technologies, see this website:
The two major parameters of a link are its bandwidth and latency. Latency might limit maximum bandwidth available over IP links, depending on the details of the technology used.
When planning the Intercluster link, take into account the peak performance that is required. This consideration is especially important for Metro Mirror configurations.
When Metro Mirror or Global Mirror is used, a certain amount of bandwidth is required for the IBM Storwize V7000 intercluster heartbeat traffic. The amount of traffic depends on how many nodes are in each of the two clustered systems.
Table 3-5 shows the amount of heartbeat traffic, in megabits per second, that is generated by various sizes of clustered systems. 
Table 3-5 Intersystem heartbeat traffic in Mbps

Storwize V7000 System 1    Storwize V7000 System 2
                           2 nodes    4 nodes    6 nodes    8 nodes
2 nodes                    5          6          6          6
4 nodes                    6          10         11         12
6 nodes                    6          11         16         17
8 nodes                    6          12         17         21
These numbers estimate the amount of traffic between the two clustered systems when no I/O is taking place to mirrored volumes. Half of the data is sent by each of the systems. The traffic is divided evenly over all available intercluster links. Therefore, if you have two redundant links, half of this traffic is sent over each link.
The bandwidth between sites must be sized to meet the peak workload requirements. You can estimate the peak workload requirement by measuring the maximum write workload averaged over a period of 1 minute or less, and adding the heartbeat bandwidth. Statistics must be gathered over a typical application I/O workload cycle, which might be days, weeks, or months, depending on the environment in which the Storwize V7000 is used.
When planning the inter-site link, consider also the initial sync and any future resync workloads. It might be worthwhile to secure additional link bandwidth for the initial data synchronization.
If the link between the sites is configured with redundancy so that it can tolerate single failures, you must size the link so that the bandwidth and latency requirements are met even during single failure conditions.
When planning the inter-site link, make a careful note whether it is dedicated to the inter-cluster traffic or is going to be used to carry any other data. Sharing link with other traffic (for example, cross-site IP traffic) might reduce the cost of creating the inter-site connection and improve link utilization. However, doing so might affect the links’ ability to provide the required bandwidth for data replication.
Carefully verify that the devices that you plan to use to implement the intercluster link are supported.
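When the partnership between the clusters is defined, the link bandwidth that replication is allowed to use, and the share of it reserved for background copy, are specified as partnership parameters. The following is a hedged sketch for an FC partnership on recent code levels; the exact command name and parameters depend on the software version and on the connection type (FC or IP), and the system name is hypothetical.
mkfcpartnership -linkbandwidthmbits 1024 -backgroundcopyrate 50 RemoteSys
The corresponding command must also be run on the remote system to bring the partnership to the fully configured state.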
Cluster configuration
If you configure replication services, you might decide to dedicate ports to intercluster communication, intracluster traffic, or both. In that case, make sure that your cabling and zoning reflect that decision. Additionally, such dedicated ports are inaccessible for host or back-end storage traffic, so plan your volume mappings and your host and back-end storage connections accordingly.
Global Mirror volumes should have their preferred access nodes evenly distributed between the nodes of the clustered systems. Figure 3-11 shows an example of a correct relationship between volumes in a Metro Mirror or Global Mirror solution.
Figure 3-11 Correct volume relationship
The back-end storage systems at the replication target site must be able to handle the peak application workload to the replicated volumes, plus the client-defined level of background copy, plus any other I/O being performed at the remote site. The performance of applications at the local clustered system can be limited by the performance of the back-end storage controllers at the remote site. This consideration is especially important for Metro Mirror replication.
A complete review must be performed before Serial Advanced Technology Attachment (SATA) drives are used for any Metro Mirror or Global Mirror replica volumes. If a slower disk subsystem is used as target for the remote volume replicas of high-performance primary volumes, the Storwize V7000 cache might not be able to buffer all the writes. The speed of writes to SATA drives at the remote site might limit the I/O rate at the local site.
To ensure that the back-end storage is able to support the data replication workload, you can dedicate back-end storage systems to only Global Mirror volumes. You can also configure the back-end storage to ensure sufficient quality of service (QoS) for the disks that are used by Global Mirror. Alternatively, you can ensure that physical disks are not shared between data replication volumes and other I/O.
3.16 SAN boot support
The IBM Storwize V7000 supports SAN boot or start-up for AIX, Microsoft Windows Server, and other operating systems. Because SAN boot support can change, check the following website regularly:
3.17 Data migration from a non-virtualized storage subsystem
Data migration is an important part of a Storwize V7000 implementation. Therefore, you must prepare a detailed data migration plan. You might need to migrate your data for one of the following reasons:
To redistribute workload within a clustered system across back-end storage subsystems
To move workload onto newly installed storage
To move workload off old or failing storage, ahead of decommissioning it
To move workload to rebalance a changed load pattern
To migrate data from an older disk subsystem to Storwize V7000-managed storage
To migrate data from one disk subsystem to another disk subsystem
Because multiple data migration methods are available, choose the method that best fits your environment, operating system platform, type of data, and the application’s service level agreement (SLA).
Data migration methods can be divided into three classes:
Based on the operating system, for example, using the system’s Logical Volume Manager (LVM)
Based on specialized data migration software
Based on the Storwize V7000 data migration features
With data migration, apply the following guidelines:
Choose which data migration method best fits your operating system platform, type of data, and SLA.
Choose where you want to place your data after migration in terms of the storage tier, pools, and back-end storage.
Check whether enough free space is available in the target storage pool.
To minimize downtime during the migration, plan ahead of time all of the required changes, including zoning, host definition, and volume mappings.
Prepare a detailed operation plan so that you do not overlook anything at data migration time. Especially for a large or critical data migration, have the plan peer reviewed and formally accepted by an appropriate technical design authority within your organization.
Perform and verify a backup before you start any data migration.
You might want to use the Storwize V7000 as a data mover to migrate data from a non-virtualized storage subsystem to another non-virtualized storage subsystem. In this case, you might have to add checks that relate to the specific storage subsystem that you want to migrate.
Be careful when you are using slower disk subsystems for the secondary volumes for high-performance primary volumes because the Storwize V7000 cache might not be able to buffer all the writes. Flushing cache writes to slower back-end storage might impact performance of your hosts.
See 11.5, “Volume mirroring and migration options” for more information about data migration using Storwize V7000.
3.18 Storwize V7000 configuration backup procedure
Save the configuration before and after any change to the clustered system, such as adding nodes and back-end storage. Saving the configuration is a crucial part of Storwize V7000 management, and various methods can be applied to back up your Storwize V7000 configuration. The preferred practice is to implement an automatic configuration backup by using the configuration backup command. Make sure that you save the configuration to storage that is not dependent on the Storwize V7000.
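A minimal sketch of a manual backup follows. It assumes CLI access as superuser and retrieval of the backup files from the configuration node with scp; file names and paths can differ between code levels, so verify them for your environment.
svcconfig backup
scp superuser@<system_ip>:/tmp/svc.config.backup.xml_* /your/backup/location/
Schedule this sequence, for example from an external management server, so that a current copy of the configuration is always available outside the system.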
3.19 Performance considerations
Storage virtualization with the Storwize V7000 improves flexibility and simplifies management of storage infrastructure, and can provide a substantial performance advantage. The Storwize V7000 caching capability and its ability to stripe volumes across multiple disk arrays are the reasons why significant performance improvements are usually observed when the Storwize V7000 is used to virtualize midrange back-end storage subsystems.
 
Tip: Technically, almost all storage controllers provide both striping (in the form of RAID 5, RAID 6, or RAID 10) and a form of caching. The real benefit of Storwize V7000 is the degree to which you can stripe the data across disks in a storage pool, even if they are installed in different back-end storage systems. This technique maximizes the number of active disks available to service I/O requests. The Storwize V7000 provides additional caching, but its impact is secondary for sustained workloads.
To ensure the performance that you want and verify the capacity of your storage infrastructure, undertake a performance and capacity analysis to reveal the business requirements of your storage environment. Use the analysis results and the guidelines in this chapter to design a solution that meets the business requirements of your organization.
When considering performance for a system, always identify the bottleneck and, therefore, the limiting factor of a specific system. This is a multidimensional analysis that needs to be performed for each of your workload patterns. There can be different bottleneck components for different workloads.
When you are designing a storage infrastructure with the Storwize V7000 or implementing a Storwize V7000 in an existing storage infrastructure, you must ensure that the performance and capacity of the SAN, back-end disk subsystems, and Storwize V7000 meets requirements for the set of known or expected workloads.
3.19.1 SAN
The following Storwize V7000 models are supported for V8.1:
Control enclosures:
 – 2076-524
 – 2076-624
Expansion enclosures:
 – 2076-12F
 – 2076-24F
 – 2076-92F
 – 2076-A9F
The control enclosures can connect to 8 Gbps and 16 Gbps SAN switches and, with the optional 10 Gbps expansion card, to 10 Gbps Ethernet switches for iSCSI and FCoE traffic.
3.19.2 Back-end storage subsystems
When connecting a back-end storage subsystem to IBM Storwize V7000, follow these guidelines:
Connect all storage ports to the switch up to a maximum of 16, and zone them to all of the Storwize V7000 ports.
Zone all ports on the disk back-end storage to all ports on the Storwize V7000 nodes in a clustered system.
Ensure that you configure the storage subsystem LUN-masking settings to map all LUNs that are used by the Storwize V7000 to all the Storwize V7000 WWPNs in the clustered system.
The Storwize V7000 is designed to handle many paths to the back-end storage.
In most cases, the Storwize V7000 can improve performance, especially of mid-sized to low-end disk subsystems, older disk subsystems with slow controllers, or uncached disk systems, for the following reasons:
The Storwize V7000 can stripe across disk arrays, and it can stripe across the entire set of configured physical disk resources.
The Storwize V7000 control enclosure 2076-524 has 32 GB of cache and 2076-624 has 32 GB of cache (upgradeable to 64 GB).
The Storwize V7000 can provide automated performance optimization of hot spots by using flash drives and Easy Tier.
The Storwize V7000 large cache and advanced cache management algorithms also allow it to improve the performance of many types of underlying disk technologies. The Storwize V7000 capability to asynchronously manage destaging operations incurred by writes while maintaining full data integrity has the potential to be important in achieving good database performance.
Because hits to the cache can occur both in the upper (Storwize V7000) and the lower (back-end storage disk controller) level of the overall system, the system as a whole can use the larger amount of cache wherever it is located. Therefore, Storwize V7000 cache provides additional performance benefits also for back-end storage systems with extensive cache banks.
Also, regardless of their relative capacities, both levels of cache tend to play an important role in enabling sequentially organized data to flow smoothly through the system.
However, Storwize V7000 cannot increase the throughput potential of the underlying disks in all cases. Performance benefits depend on the underlying storage technology and the workload characteristics, including the degree to which the workload exhibits hotspots or sensitivity to cache size or cache algorithms.
3.19.3 Storwize V7000
The Storwize V7000 clustered system is scalable up to eight nodes. Its performance grows nearly linearly when more nodes are added, until it becomes limited by other components in the storage infrastructure. Although virtualization with the Storwize V7000 provides a great deal of flexibility, it does not abolish the necessity to have a SAN and back-end storage subsystems that can deliver the performance that you want.
Essentially, Storwize V7000 performance improvements are gained by using in parallel as many physical disks as possible, which creates a greater level of concurrent I/O to the back-end storage without overloading a single disk or array.
Assuming that no bottlenecks exist in the SAN or on the disk subsystem, you must follow specific guidelines when you perform the following tasks:
Creating a storage pool
Creating volumes
Connecting to or configuring hosts that use storage presented by a Storwize V7000 clustered system
For more information about performance and preferred practices for the Storwize V7000, see IBM System Storage SAN Volume Controller and Storwize V7000 Best Practices and Performance Guidelines, SG24-7521.
3.19.4 IBM Real-time Compression
IBM Real-time Compression (RtC) technology in storage systems is based on the Random Access Compression Engine (RACE) technology. It is implemented in the IBM Storwize V7000 and the IBM Storwize family, IBM FlashSystem V840 systems, IBM FlashSystem V9000 systems, and IBM XIV (IBM Spectrum Accelerate™). This technology can play a key role in storage capacity savings and investment protection.
Although the technology is easy to implement and manage, it is helpful to understand the basics of internal processes and I/O workflow to ensure a successful implementation of any storage solution.
The following are some general suggestions:
Best results can be achieved if the data compression ratio stays at 25% or above. Volumes can be scanned with the built-in Comprestimator utility to help you decide whether RtC is a good choice for a specific volume (see the example command sequence after this list).
More concurrency within the workload gives a better result than single-threaded sequential I/O streams.
I/O is de-staged to RACE from the upper cache in 64 KiB pieces. The best results are achieved if the host I/O size does not exceed this size.
Volumes that are used for only one purpose usually have the same work patterns. Mixing database, virtualization, and general-purpose data within the same volume might make the workload inconsistent. These workloads might have no stable I/O size and no specific work pattern, and a below-average compression ratio, making these volumes hard to investigate during performance degradation. Real-time Compression development advises against mixing data types within the same volume whenever possible.
It is best not to recompress data that is already compressed. Volumes that store pre-compressed data should remain regular (uncompressed) volumes.
Volumes with encrypted data have a very low compression ratio, and are not good candidates for compression. This observation is true for data encrypted by the host. Real-time Compression can provide satisfactory results for volumes encrypted by Storwize V7000, as compression is performed before encryption.
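A minimal sketch of the built-in Comprestimator workflow that is mentioned in this list follows; the volume name is hypothetical and the exact commands can differ between code levels.
analyzevdisk AppVol01
lsvdiskanalysis AppVol01
The lsvdiskanalysis output reports the estimated compression savings, which you can compare against the suggested 25% threshold before enabling RtC for the volume.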
3.19.5 Performance monitoring
Performance monitoring must be a part of the overall IT environment. For the Storwize V7000 and other IBM storage subsystems, the official IBM tool to collect performance statistics and provide a performance report is IBM Spectrum Control.
For more information about using IBM Spectrum Control to monitor your storage subsystem, see this website:
Also, see IBM Spectrum Family: IBM Spectrum Control Standard Edition, SG24-8321.