Planning back-end storage
This chapter describes the aspects and practices to consider when the system's external back-end storage is planned, configured, and managed.
External storage is acquired by Spectrum Virtualize by virtualizing separate IBM or third-party storage systems, which are attached with FC or iSCSI.
 
Note: IBM SAN Volume Controller that is built on SV1 nodes supports SAS-attached expansions with solid-state drives (SSDs) and hard disk drives (HDDs). SV2 and SA2 nodes do not support internal storage; therefore, internal storage management is not covered in this book.
For information about configuring internal storage that is attached to SV1 nodes, see IBM SAN Volume Controller Best Practices and Performance Guidelines for IBM Spectrum Virtualize V8.4.2, SG24-8509.
This chapter includes the following topics:
3.1, “General considerations for managing external storage”
3.2, “Controller-specific considerations”
3.3, “Quorum disks”
3.1 General considerations for managing external storage
IBM SAN Volume Controller can virtualize external storage that is presented to the system. External back-end storage systems (or controllers in Spectrum Virtualize terminology) provide their logical volumes (LVs), which are detected by IBM SAN Volume Controller as MDisks and can be used in storage pools.
This section covers aspects of planning and managing external storage that is virtualized by IBM SAN Volume Controller.
External back-end storage can be connected to IBM SAN Volume Controller with FC (SCSI) or iSCSI. NVMe-FC back-end attachment is not supported because it provides no performance benefit for IBM SAN Volume Controller. The main advantage of an NVMe solution is a reduction in the CPU cycles that are needed at the host level to handle interrupts from Fibre Channel HBAs.
For external back-end controllers, IBM SAN Volume Controller acts as a host. All Spectrum Virtualize Fibre Channel drivers are implemented from day one as polling drivers, not interrupt-driven drivers. Therefore, almost no latency savings are gained on the IBM SAN Volume Controller side by switching from SCSI to NVMe as a protocol.
3.1.1 Storage controller path selection
When a managed disk (MDisk) logical unit (LU) is accessible through multiple IBM SAN Volume Controller ports, the system ensures that all nodes that access this LU coordinate their activity and access the LU through the same storage system port.
An MDisk path that is presented to the IBM SAN Volume Controller for all system nodes must meet the following criteria:
The system node:
 – Is a member of an IBM SAN Volume Controller cluster
 – Has Fibre Channel or iSCSI connections to the storage system port
 – Has successfully discovered the LU
The port selection process has not caused the system node to exclude access to the MDisk through the storage system port.
When the IBM SAN Volume Controller nodes select a set of ports to access the storage system, the two types of path selection that are described in the next sections are supported to access the MDisks. The type of path selection is determined by the external system type and cannot be changed.
For more information about which algorithm is used for a specific back-end system, see System Storage Interoperation Center (SSIC), as shown in Figure 3-1.
Figure 3-1 SSIC example
Round-robin path algorithm
With the round-robin path algorithm, each MDisk uses one path per target port per IBM SAN Volume Controller node. Therefore, in cases of storage systems that do not feature a preferred controller (such as XIV or DS8000), each MDisk uses all of the available FC ports of that storage controller.
With a round-robin compatible storage controller, there is no need to create as many volumes as there are storage FC ports. Every volume, and therefore MDisk, uses all available IBM SAN Volume Controller ports.
This configuration results in a significant performance increase because the MDisk is no longer bound to one back-end FC port. Instead, it can issue I/Os to many back-end FC ports in parallel. Particularly, the sequential I/O within a single extent can benefit from this feature.
Also, the round-robin path selection improves resilience to specific storage system failures. For example, if one of the back-end storage system FC ports encounters some performance problems, the I/O to MDisks is sent through other ports. Moreover, because I/Os to MDisks are sent through all back-end storage FC ports, the port failure can be detected more quickly.
 
Preferred practice: If your storage system supports the round-robin path algorithm, zone as many FC ports from the back-end storage controller as possible. IBM SAN Volume Controller supports up to 16 FC ports per storage controller. For more information about FC port connections and zoning guidelines, see your storage system documentation.
Example 3-1 shows a storage controller that supports round-robin path selection.
Example 3-1 Round robin enabled storage controller
IBM_2145:SVC-ITSO:superuser>lsmdisk 4
id 4
name mdisk4
...
preferred_WWPN 20010002AA0244DA
active_WWPN many <<< Round Robin Enabled
MDisk group balanced and controller balanced
Although round-robin path selection provides optimized and balanced performance with minimum configuration required, some storage systems still require manual intervention to achieve the same goal.
With storage subsystems that use active-passive balanced path selection, IBM SAN Volume Controller accesses an MDisk LU through one of the ports on the preferred controller. To make the best use of the back-end storage, ensure that the number of LUs that is created is a multiple of the number of connected FC ports, and aggregate all LUs into a single MDisk group.
Example 3-2 shows a storage controller that supports MDisk group balanced path selection.
Example 3-2 MDisk group balanced path selection (no round robin enabled) storage controller
IBM_2145:SVC-ITSO:superuser>lsmdisk 5
id 5
name mdisk5
...
preferred_WWPN
active_WWPN 20110002AC00C202 <<< indicates MDisk group balancing
3.1.2 Guidelines for creating optimal back-end configuration
Most back-end controllers aggregate spinning drives or SSDs into RAID arrays, and then join the arrays into pools. Logical volumes are created on those pools and provided to hosts.
When connected to external back-end storage, IBM SAN Volume Controller acts as a host. It is important to create a back-end controller configuration that provides performance and resiliency because IBM SAN Volume Controller relies on the back-end storage when serving I/O to attached host systems.
If your back-end system includes homogeneous storage, create the required number of RAID arrays (usually RAID 6 or RAID 10 is recommended) with an equal number of drives. The type and geometry of the array depend on the back-end controller vendor’s recommendations. If your back-end controller can spread the load across multiple arrays in a resource pool (for example, by striping), create a single pool and add all arrays to it.
On back-end systems with mixed drives, create a separate resource pool for each drive technology (and keep the drive technology type in mind because you must assign the correct tier for an MDisk when it is used by IBM SAN Volume Controller).
Create a set of fully allocated logical volumes from the back-end system storage pool (or pools). Each volume is detected as an MDisk on IBM SAN Volume Controller. The number of logical volumes to create depends on the type of drives that are used by your back-end controller.
Back-end controller with spinning drives
If your back-end uses spinning drives, the volume number calculation must be based on queue depth. Queue depth is the number of outstanding I/O requests for a device.
For optimal performance, spinning drives need 8 - 10 concurrent I/Os at the device, and this need does not change with drive rotation speed. Therefore, we want to ensure that in a highly loaded system, any IBM SAN Volume Controller MDisk can queue approximately 8 I/Os per back-end system drive.
IBM SAN Volume Controller queue depth per MDisk is approximately 60 (the exact maximum that is seen on a real system can vary, depending on the circumstances; however, for the purposes of this calculation, it does not matter). This queue depth per MDisk leads to the HDD rule of 8. According to this rule, to achieve 8 I/Os per drive with a queue depth of 60 per MDisk from IBM SAN Volume Controller, a back-end array of 60/8 = 7.5, or approximately 8, physical drives is optimal. That is, create one logical volume for every eight drives in an array.
 
Example #1: The back-end controller to be virtualized is an IBM FlashSystem 5035 with 64 NL-SAS 8 TB drives.
The system is homogeneous. According to the recommendations that are presented in the “Array Considerations” section of Implementing the IBM FlashSystem with IBM Spectrum Virtualize Version 8.4.2, SG24-8506, create a single DRAID6 array on the back-end system and add it to a storage pool. By using the HDD rule of 8, we want 64/8 = 8 MDisks; therefore, create 8 volumes from the pool to present to IBM SAN Volume Controller and assign them to the nearline tier.
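The arithmetic behind the HDD rule of 8 can be expressed as a minimal Python sketch. The queue-depth and per-drive I/O values are the approximations given in the text, and the function name is ours for illustration only.
# Minimal sketch of the "HDD rule of 8" volume-count calculation described above.
# Assumes ~60 outstanding I/Os per MDisk on the virtualizer and ~8 concurrent
# I/Os per spinning drive, as stated in the text.
QUEUE_DEPTH_PER_MDISK = 60
IO_PER_SPINNING_DRIVE = 8

def hdd_volume_count(drives_in_array: int) -> int:
    """Return a suggested number of back-end volumes (MDisks) for an HDD array."""
    drives_per_volume = round(QUEUE_DEPTH_PER_MDISK / IO_PER_SPINNING_DRIVE)  # 60/8 = 7.5 -> 8
    return max(1, round(drives_in_array / drives_per_volume))

# Example #1: 64 NL-SAS drives -> 64 / 8 = 8 volumes
print(hdd_volume_count(64))  # 8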
All-flash back-end controllers
For all-flash controllers, the considerations are more about I/O distribution across IBM SAN Volume Controller ports and processing threads than about queue depth per drive. Most all-flash arrays that are put behind the virtualizer have high I/O capabilities, so we want to give IBM SAN Volume Controller the optimal chance to spread the load and evenly use its internal resources. Queue depths are less of a concern because of the lower latency per I/O.
For all-flash back-end arrays, IBM recommends creating 32 logical volumes from the array capacity because this keeps the queue depths high enough and spreads the work across the virtualizer resources. For smaller setups with a low number of SSDs, this number can be reduced to 16 logical volumes (which results in 16 MDisks) or even 8 volumes.
 
 
Example #2: The back-end controllers to be virtualized are an IBM FlashSystem 5035 with 24 Tier1 7.6 TB drives and an IBM FlashSystem 9200. The virtualizer needs a pool with two storage tiers:
On the IBM FlashSystem 5035, create a single DRAID6 array and add it to a storage pool. By using the all-flash rule, we need to create 32 volumes to present as MDisks. However, because it is a small setup, we can reduce the number of volumes to 16.
On the IBM FlashSystem 9200, join all micro-latency modules into a RAID5 array and add it to a storage pool. FlashSystem 9200 is a Tier0 solution; therefore, use the all-flash rule and create 32 volumes to present as MDisks.
On IBM SAN Volume Controller, add the 16 MDisks from the IBM FlashSystem 5035 as Tier1 flash and the 32 MDisks as Tier0 flash to a single multi-tier pool.
Large setup considerations
For controllers, such as IBM DS8000 and XIV, you can use the all-flash rule of 32. However, with installations that involve these kinds of back-end controllers, it might be necessary to consider the maximum queue depth per back-end controller port, which is set to 1000 for most supported high-end storage systems.
With high-end controllers, queue depth per MDisk can be calculated by using the following formula:
Q = ((P x C) / N) / M
Where:
Q    Calculated queue depth for each MDisk
P    Number of back-end controller host ports (unique WWPNs) that are zoned to IBM SAN Volume Controller (minimum is 2 and maximum is 16)
C    Maximum queue depth per WWPN, which is 1000 for controllers, such as XIV or DS8000
N    Number of nodes in the IBM SAN Volume Controller cluster (2, 4, 6, or 8)
M    Number of volumes that are presented by the back-end controller and detected as MDisks
For a result of Q = 60, the number of volumes to create is M = (P x C) / (N x Q), which can be simplified to M = (16 x P) / N.
 
Example #3: A 4-node IBM FlashSystem 9200 cluster is used with 12 host ports on the IBM XIV System.
By using this formula, we must create M = (16 x 12) / 4 = 48 volumes on the IBM XIV to obtain a balanced, high-performing configuration.
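As a hedged illustration, the following Python sketch applies both the full formula and the simplified form; the constants (C = 1000, Q = 60) are the values given above, and the function names are ours.
# Sketch of the high-end controller volume-count formula described above.
# C = 1000 (queue depth per back-end WWPN) and Q = 60 (queue depth per MDisk)
# are the values from the text; other controllers might differ.
def volumes_exact(ports: int, nodes: int, c: int = 1000, q: int = 60) -> float:
    """M = (P x C) / (N x Q)"""
    return (ports * c) / (nodes * q)

def volumes_simplified(ports: int, nodes: int) -> int:
    """Simplified form M = (16 x P) / N used in the text."""
    return (16 * ports) // nodes

# Example #3: 12 back-end host ports, 4 nodes
print(volumes_exact(12, 4))       # 50.0 with the exact formula
print(volumes_simplified(12, 4))  # 48, as calculated in the example above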
 
 
3.1.3 Considerations for compressing and deduplicating back-end
IBM SAN Volume Controller supports over-provisioning on selected back-end controllers. Therefore, if the back-end storage performs data deduplication or data compression on the LUs that are provisioned from it, those LUs can still be used as external MDisks on IBM SAN Volume Controller.
The implementation steps for thin-provisioned MDisks are the same as for fully allocated storage controllers. Extreme caution should be used when planning capacity for such configurations.
The IBM SAN Volume Controller detects:
Whether the MDisk is thin-provisioned.
The total physical capacity of the MDisk.
The used and remaining physical capacity of the MDisk.
Whether unmap commands are supported by the back-end. By sending SCSI unmap commands to thin-provisioned MDisks, the system marks data that is no longer in use. Then, the garbage-collection processes on the back-end can free unused capacity and reallocate it to free space.
The use of a suitable compression or data deduplication ratio is key to achieving a stable environment. If you are not sure about the real compression or data deduplication ratio, contact your IBM technical sales representative for more information.
The nominal capacity from a compression and deduplication-enabled storage system is not fixed and varies based on the nature of the data. Always use a conservative data reduction ratio for the initial configuration.
The use of an incorrect ratio for capacity assignment can cause an out-of-space situation. If the MDisks do not provide enough capacity, IBM SAN Volume Controller disables access to all the volumes in the storage pool.
 
Example: Consider the following example:
Assumption 1: Sizing is performed with an optimistic 5:1 rate.
Assumption 2: Real rate is 3:1:
 – Physical Capacity: 20 TB.
 – Calculated capacity: 20 TB x 5 = 100 TB.
 – The volume that is assigned from the compression- or deduplication-enabled storage subsystem to SAN Volume Controller or Storwize is 100 TB.
 – Real usable capacity: 20 TB x 3 = 60 TB.
If the hosts try to write more than 60 TB data to the storage pool, the storage subsystem cannot provide any more capacity. Also, all volumes that are used as IBM Spectrum Virtualize or Storwize Managed Disks and all related pools go offline.
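The capacity arithmetic in this example can be summarized in a short Python sketch; the 5:1 and 3:1 ratios and the 20 TB physical capacity are the assumptions stated above.
# Sketch of the over-provisioning arithmetic from the example above.
physical_tb = 20
assumed_ratio = 5.0   # optimistic sizing assumption
real_ratio = 3.0      # ratio that the workload really achieves

provisioned_tb = physical_tb * assumed_ratio   # 100 TB volume presented as an MDisk
real_usable_tb = physical_tb * real_ratio      # 60 TB that the back-end can physically absorb

print(f"Provisioned: {provisioned_tb:.0f} TB, really usable: {real_usable_tb:.0f} TB")
print(f"Host writes beyond {real_usable_tb:.0f} TB exhaust the back-end and take the pool offline")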
Thin-provisioned back-end storage must be carefully monitored. Capacity alerts must be set up to be aware of the real remaining physical capacity.
Also, the best practice is to have an emergency plan for an “Out Of Physical Space” situation on the back-end controller so that you know what steps must be taken to recover. The plan must be prepared during the initial implementation phase.
3.2 Controller-specific considerations
This section discusses implementation information that is related to supported back-end systems. For more information about general requirements, see this IBM Documentation web page.
3.2.1 Considerations for DS8000 series
In this section, we discuss considerations for the DS8000 series.
Interaction between DS8000 and IBM SAN Volume Controller
It is important to know the DS8000 drive virtualization process; that is, the process of preparing physical drives for storing data that belongs to a volume that is used by a host (in this case, the IBM SAN Volume Controller).
In this regard, the basis for virtualization begins with the physical drives of DS8000, which are mounted in storage enclosures. Virtualization builds upon the physical drives as a series of the following layers:
Array sites
Arrays
Ranks
Extent pools
Logical volumes
Logical subsystems
Array sites are the building blocks that are used to define arrays, which are data storage systems for block-based, file-based, or object-based storage. Instead of storing data on a server, storage arrays use multiple drives under central management and can store a large amount of data.
In general terms, eight identical drives that have the same capacity, speed, and drive class comprise the array site. When an array is created, the RAID level, array type, and array configuration are defined. RAID 5, RAID 6, and RAID 10 levels are supported.
 
Important: Normally, RAID 6 is highly preferred and is the default when the Data Storage Graphical Interface (DS GUI) is used. With large drives in particular, the RAID rebuild times (after one drive failure) become ever longer. The use of RAID 6 reduces the danger of data loss because of a double drive failure. For more information, see this IBM Documentation web page.
A rank, which is a logical representation for the physical array, is relevant for IBM SAN Volume Controller because of the creation of a fixed block (FB) pool for each array that you want to virtualize. Ranks in DS8000 are defined in a one-to-one relationship to arrays. It is for this reason that a rank is defined as using only one array.
A fixed-block rank features one of the following extent sizes:
1 GiB (large extent)
16 MiB (small extent)
An extent pool (or storage pool) in DS8000 is a logical construct to add the extents from a set of ranks, which forms a domain for extent allocation to a logical volume.
In summary, a logical volume consists of a set of extents from one extent pool or storage pool. DS8900F supports up to 65,280 logical volumes.
A logical volume that is composed of fixed-block extents is called a logical unit number (LUN). A fixed-block LUN consists of one or more 1 GiB extents, or one or more 16 MiB extents, from one FB extent pool. A LUN cannot cross extent pools. However, a LUN can have extents from multiple ranks within the same extent pool.
 
Important: DS8000 Copy Services does not support FB logical volumes larger than 2 TiB. Therefore, you cannot create a LUN that is larger than 2 TiB if you want to use Copy Services for the LUN, unless the LUN is integrated as a Managed Disk (MDisk) in an IBM SAN Volume Controller. In that case, use IBM Spectrum Virtualize Copy Services instead. Based on these considerations, the following maximum LUN sizes are available to create on a DS8900F and present to IBM SAN Volume Controller:
16 TB LUN with large extents (1 GiB)
16 TB LUN with small extents (16 MiB) for DS8880F with release R8.5 or later, and for DS8900F R9.0 or later
Logical subsystems (LSSs) are another logical construct, and are mostly used with fixed-block volumes. A maximum of 255 LSSs can exist on a DS8900F. For more information, see this IBM Documentation web page.
The concepts of virtualization of DS8900F for IBM FlashSystem or IBM SAN Volume Controller are shown in Figure 3-2.
Figure 3-2 DS8900 virtualization concepts focus to IBM SAN Volume Controller
Connectivity considerations
Use at least eight DS8000 ports. With large and workload-intensive configurations, consider using up to 16 ports, which is the maximum that is supported by IBM SAN Volume Controller.
Generally, use ports from different host adapters and, if possible, from different I/O enclosures. This configuration is important because during a DS8000 LIC update, a host adapter port might need to be taken offline. It also allows the IBM SAN Volume Controller I/O to survive a hardware failure on any component on the SAN path.
For more information about SAN preferred practices and connectivity, see Chapter 2, “Storage area network guidelines” on page 21.
Defining storage
To optimize the DS8000 resource usage, use the following guidelines:
Distribute capacity and workload across device adapter pairs.
Balance the ranks and extent pools between the two DS8000 internal servers to support the corresponding workloads on them.
Spread the logical volume workload across the DS8000 internal servers by allocating the volumes equally on rank groups 0 and 1.
Use as many disks as possible. Avoid idle disks, even if all storage capacity is not to be used initially.
Consider the use of multi-rank extent pools.
Stripe your logical volume across several ranks (the default for multi-rank extent pools).
Balancing workload across DS8000 series controllers
When you configure storage on the DS8000 series disk storage subsystem, ensure that ranks on a device adapter (DA) pair are evenly balanced between odd and even extent pools. If you do not ensure that the ranks are balanced, uneven device adapter loading can lead to a considerable performance degradation.
The DS8000 series controllers assign server (controller) affinity to ranks when they are added to an extent pool. Ranks that belong to an even-numbered extent pool have an affinity to server0, and ranks that belong to an odd-numbered extent pool have an affinity to server1.
Figure 3-3 on page 83 shows an example of a configuration that results in a 50% reduction in available bandwidth. Notice how arrays on each of the DA pairs are accessed only by one of the adapters. In this case, all ranks on DA pair 0 are added to even-numbered extent pools, which means that they all have an affinity to server0. Therefore, the adapter in server1 is sitting idle. Because this condition is true for all four DA pairs, only half of the adapters are actively performing work. This condition can also occur on a subset of the configured DA pair.
Figure 3-3 DA pair reduced bandwidth configuration
Example 3-3 shows how this invalid configuration appears in the CLI output of the lsarray and lsrank commands. The arrays that are on the same DA pair contain the same group number (0 or 1), meaning that they have affinity to the same DS8000 series server. Here, server0 is represented by group0, and server1 is represented by group1.
As an example of this situation, consider arrays A0 and A4, which are attached to DA pair 0. In this example, both arrays are added to an even-numbered extent pool (P0 and P4) so that both ranks have affinity to server0 (represented by group0), which leaves the DA in server1 idle.
Example 3-3 Command output for the lsarray and lsrank commands
dscli> lsarray -l
Date/Time: Oct 20, 2016 12:20:23 AM CEST IBM DSCLI Version: 7.8.1.62 DS: IBM.2107-75L2321
Array State Data RAID type arsite Rank DA Pair DDMcap(10^9B) diskclass
===================================================================================
A0 Assign Normal 5 (6+P+S) S1 R0 0 146.0 ENT
A1 Assign Normal 5 (6+P+S) S9 R1 1 146.0 ENT
A2 Assign Normal 5 (6+P+S) S17 R2 2 146.0 ENT
A3 Assign Normal 5 (6+P+S) S25 R3 3 146.0 ENT
A4 Assign Normal 5 (6+P+S) S2 R4 0 146.0 ENT
A5 Assign Normal 5 (6+P+S) S10 R5 1 146.0 ENT
A6 Assign Normal 5 (6+P+S) S18 R6 2 146.0 ENT
A7 Assign Normal 5 (6+P+S) S26 R7 3 146.0 ENT
 
dscli> lsrank -l
Date/Time: Oct 20, 2016 12:22:05 AM CEST IBM DSCLI Version: 7.8.1.62 DS: IBM.2107-75L2321
ID Group State datastate Array RAIDtype extpoolID extpoolnam stgtype exts usedexts
======================================================================================
R0 0 Normal Normal A0 5 P0 extpool0 fb 779 779
R1 1 Normal Normal A1 5 P1 extpool1 fb 779 779
R2 0 Normal Normal A2 5 P2 extpool2 fb 779 779
R3 1 Normal Normal A3 5 P3 extpool3 fb 779 779
R4 0 Normal Normal A4 5 P4 extpool4 fb 779 779
R5 1 Normal Normal A5 5 P5 extpool5 fb 779 779
R6 0 Normal Normal A6 5 P6 extpool6 fb 779 779
R7 1 Normal Normal A7 5 P7 extpool7 fb 779 779
Figure 3-4 shows a configuration that balances the workload across all four DA pairs.
Figure 3-4 DA pair correct configuration
Figure 3-5 shows how a correct configuration appears in the CLI output of the lsarray and lsrank commands. Notice that the output shows that this configuration balances the workload across all four DA pairs with an even balance between odd and even extent pools. The arrays that are on the same DA pair are split between groups 0 and 1.
Figure 3-5 The lsarray and lsrank command output
DS8000 series ranks to extent pools mapping
In the DS8000 architecture, extent pools are used to manage one or more ranks. An extent pool is visible to both processor complexes in the DS8000 storage system, but it is directly managed by only one of them. You must define a minimum of two extent pools with one extent pool that is created for each processor complex to fully use the resources. The following approaches can be used:
One-to-one approach: One rank per extent pool configuration
With the one-to-one approach, DS8000 is formatted in 1:1 assignment between ranks and extent pools. This configuration disables any DS8000 storage-pool striping or auto-rebalancing activity, if they were enabled. You can create one or two volumes in each extent pool exclusively on one rank only and put all of those volumes into one IBM FlashSystem storage pool. IBM FlashSystem stripes across all of these volumes and balances the load across the RAID ranks by that method. Because no more than two volumes per rank are needed with this approach, the rank size determines the volume size.
Often systems are configured with at least two storage pools:
 – One (or two) that contain MDisks of all the 6+P RAID 5 ranks of the DS8000 storage system
 – One (or more) that contain the slightly larger 7+P RAID 5 ranks
This approach maintains equal load balancing across all ranks when the IBM FlashSystem striping occurs because each MDisk in a storage pool is the same size.
The IBM FlashSystem extent size is the stripe size that is used to stripe across all these single-rank MDisks.
This approach delivered good performance and has its justifications. However, it also includes a few minor drawbacks, including the following examples:
 – A natural skew can occur, such as a small file of a few hundred KiB that is heavily accessed.
 – When you have more than two volumes from one rank, but not as many IBM FlashSystem storage pools, the system might start striping across many entities that are effectively in the same rank, depending on the storage pool layout. Such striping should be avoided.
An advantage of this approach is that it delivers more options for fault isolation and control over where a certain volume and extent are located.
Many-to-one approach: Multi-rank extent pool configuration
A more modern approach is to create several DS8000 extent pools; for example, two DS8000 extent pools. Use DS8000 storage pool striping or automated Easy Tier rebalancing to help prevent overloading individual ranks.
Create at least two extent pools for each tier to balance the extent pools by Tier and Controller affinity. Mixing different tiers on the same extent pool is effective only when Easy Tier is activated on the DS8000 pools. However, when virtualized, tier management has more advantages when handled by the IBM FlashSystem. For more information about choosing the level on which to run Easy Tier, see “External controller tiering considerations” on page 173.
You need only one volume size with this multi-rank approach because enough space is available in each large DS8000 extent pool. The maximum number of back-end storage ports to be presented to the IBM FlashSystem is 16. Each port represents a path to the IBM FlashSystem. Therefore, when sizing the number of LUNs and MDisks to be presented to the IBM FlashSystem, the suggestion is to present at least 2 - 4 volumes per path. To use the maximum of 16 paths, create 32, 48, or 64 DS8000 volumes. IBM FlashSystem maintains a good queue depth for this configuration.
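As a quick illustration of this guideline, the following sketch lists the resulting DS8000 volume counts; the path count of 16 and the 2 - 4 volumes per path are taken from the text.
# Sketch: DS8000 volume counts for 2 - 4 volumes per back-end path (16 paths maximum).
paths = 16
for volumes_per_path in (2, 3, 4):
    print(f"{volumes_per_path} volumes per path -> {paths * volumes_per_path} DS8000 volumes")
# Prints 32, 48, and 64 volumes, as noted above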
To maintain the highest flexibility and for easier management, large DS8000 extent pools are beneficial. However, if the DS8000 installation is dedicated to shared-nothing environments, such as Oracle ASM, IBM DB2® warehouses, or General Parallel File System (GPFS), use the single-rank extent pools.
LUN masking
For a storage controller, all IBM SAN Volume Controller nodes must detect the same set of LUs from all target ports that are logged in. If target ports are visible to nodes or canisters that do not have the same set of LUs assigned, IBM SAN Volume Controller treats this situation as an error condition and generates error code 1625.
You must validate the LUN masking from the storage controller and then, confirm the correct path count from within the IBM SAN Volume Controller.
The DS8000 series controllers perform LUN masking that is based on the volume group. Example 3-4 shows the output of the showvolgrp command for volume group (V0), which contains 16 LUNs that are being presented to a two-node IBM SAN Volume Controller cluster.
Example 3-4 Output of the showvolgrp command
dscli> showvolgrp V0
Date/Time: Oct 20, 2016 10:33:23 AM BRT IBM DSCLI Version: 7.8.1.62 DS: IBM.2107-75FPX81
Name ITSO_SVC
ID V0
Type SCSI Mask
Vols 1001 1002 1003 1004 1005 1006 1007 1008 1101 1102 1103 1104 1105 1106 1107 1108
Example 3-5 shows output for the lshostconnect command from the DS8000 series. In this example, you can see that four ports of the two-node cluster are assigned to the same volume group (V0) and therefore, are assigned the same 16 LUNs.
Example 3-5 Output for the lshostconnect command
dscli> lshostconnect -volgrp v0
Date/Time: Oct 22, 2016 10:45:23 AM BRT IBM DSCLI Version: 7.8.1.62 DS: IBM.2107-75FPX81
Name ID WWPN HostType Profile portgrp volgrpID ESSIOport
=============================================================================================
ITSO_SVC_N1C1P4 0001 500507680C145232 SVC San Volume Controller 1 V0 all
ITSO_SVC_N1C2P3 0002 500507680C235232 SVC San Volume Controller 1 V0 all
ITSO_SVC_N2C1P4 0003 500507680C145231 SVC San Volume Controller 1 V0 all
ITSO_SVC_N2C2P3 0004 500507680C235231 SVC San Volume Controller 1 V0 all
From Example 3-5, you can see that only the IBM SAN Volume Controller WWPNs are assigned to V0.
 
Attention: Data corruption can occur if the same LUN is assigned to IBM SAN Volume Controller nodes and other devices, such as hosts attached to DS8000.
Next, you see how the IBM SAN Volume Controller detects these LUNs if the zoning is properly configured. The Managed Disk Link Count (mdisk_link_count) represents the total number of MDisks that are presented to the IBM SAN Volume Controller cluster by that specific controller.
Example 3-6 shows the general details of the output storage controller by using the system CLI.
Example 3-6 Output of the lscontroller command
IBM_2145:SVC-ITSO:superuser>svcinfo lscontroller DS8K75FPX81
id 1
controller_name DS8K75FPX81
WWNN 5005076305FFC74C
mdisk_link_count 16
max_mdisk_link_count 16
degraded no
vendor_id IBM
product_id_low 2107900
...
WWPN 500507630500C74C
path_count 16
max_path_count 16
WWPN 500507630508C74C
path_count 16
max_path_count 16
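In Example 3-6, each back-end WWPN reports a path_count that equals mdisk_link_count. If you save the lscontroller output to a file, a small hypothetical script such as the following (the file name and the parsing approach are our assumptions, not an IBM-provided tool) can flag WWPNs that deviate from that pattern.
# Hypothetical sketch: flag back-end WWPNs whose path_count differs from
# mdisk_link_count in saved "lscontroller <id>" output.
import re

def check_controller_paths(output: str) -> list:
    """Return warnings for WWPNs whose path_count deviates from mdisk_link_count."""
    link_count = int(re.search(r"^mdisk_link_count (\d+)", output, re.M).group(1))
    warnings = []
    for wwpn, paths in re.findall(r"^WWPN (\S+)\npath_count (\d+)", output, re.M):
        if int(paths) != link_count:
            warnings.append(f"WWPN {wwpn}: path_count {paths} != mdisk_link_count {link_count}")
    return warnings

with open("lscontroller_output.txt") as f:  # assumed file that contains saved CLI output
    for warning in check_controller_paths(f.read()):
        print(warning)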
IBM SAN Volume Controller MDisks and storage pool considerations
The recommended practice is to create a single IBM SAN Volume Controller storage pool per DS8900F system. This configuration simplifies management and increases overall performance.
An example of the preferred configuration is shown in Figure 3-6. Four DS8900F storage pools or extent pools (even and odd) are joined into one IBM SAN Volume Controller storage pool.
Figure 3-6 Four DS8900F extent pools as one IBM SAN Volume Controller storage pool
To determine how many logical volumes must be created to present to IBM SAN Volume Controller as MDisks, see 3.1.2, “Guidelines for creating optimal back-end configuration” on page 76.
3.2.2 IBM XIV Storage System considerations
XIV Gen3 volumes can be provisioned to IBM SAN Volume Controller by way of iSCSI and FC. However, it is preferred to implement FC attachment for performance and stability considerations, unless a dedicated IP infrastructure for storage is available.
Host options and settings for XIV systems
You must use specific settings to identify IBM SAN Volume Controller systems as hosts to XIV systems. An XIV node within an XIV system is a single WWPN. An XIV node is considered to be a single SCSI target. Each host object that is created within the XIV System must be associated with the same LUN map.
From an IBM SAN Volume Controller perspective, an XIV type 281x controller can consist of more than one WWPN. However, all are placed under one worldwide node number (WWNN) that identifies the entire XIV system.
Creating a host object for IBM SAN Volume Controller for an IBM XIV
A single host object with all WWPNs of the IBM SAN Volume Controller nodes can be created when implementing IBM XIV. This technique makes the host configuration easier. However, the ideal host definition is to consider each IBM SAN Volume Controller node as a host object, and create a cluster object to include all nodes or canisters.
When implemented in this manner, statistical metrics are more effective because performance can be collected and analyzed at the IBM SAN Volume Controller node level.
For more information about creating a host on XIV, see IBM XIV Gen3 with IBM System Storage SAN Volume Controller and Storwize V7000, REDP-5063.
Volume considerations
As modular storage, XIV can be configured with 6 - 15 modules. Each module that is added to the configuration increases the XIV capacity, CPU, memory, and connectivity. The XIV system currently supports the following configurations:
28 - 81 TB when 1 TB drives are used
55 - 161 TB when 2 TB disks are used
84 - 243 TB when 3 TB disks are used
112 - 325 TB when 4 TB disks are used
169 - 489 TB when 6 TB disks are used
Figure 3-7 shows how XIV configuration varies according to the number of modules that are present on the system.
Figure 3-7 XIV rack configuration: 281x-214
Although XIV has its own queue depth characteristics for direct host attachment, the best practices that are described in 3.1.2, “Guidelines for creating optimal back-end configuration” on page 76 are preferred when you virtualize XIV with IBM Spectrum Virtualize.
Table 3-1 shows the suggested volume sizes and quantities for IBM SAN Volume Controller on the XIV systems with different drive capacities.
Table 3-1 XIV minimum volume size and quantity recommendations
Modules  XIV host ports  Volume size (GB)                            Volume quantity  Volumes per XIV host port
                         1 TB   2 TB   3 TB   4 TB   6 TB drives
6        4               1600   3201   4852   6401   9791            17               4.3
9        8               1600   3201   4852   6401   9791            27               3.4
10       8               1600   3201   4852   6401   9791            31               3.9
11       10              1600   3201   4852   6401   9791            34               3.4
12       10              1600   3201   4852   6401   9791            39               3.9
13       12              1600   3201   4852   6401   9791            41               3.4
14       12              1600   3201   4852   6401   9791            46               3.8
Other considerations
This section highlights the following restrictions for the use of the XIV system as back-end storage for the IBM SAN Volume Controller:
Volume mapping
When mapping a volume, you must use the same LUN ID for all IBM SAN Volume Controller nodes. Therefore, map the volumes to the cluster, not to individual nodes.
XIV Storage pools
When creating an XIV storage pool, define the Snapshot Size as zero (0). Snapshot space does not need to be reserved because it is not recommended to use XIV snapshots on LUNs that are mapped as MDisks. The snapshot functions are used at the IBM SAN Volume Controller level.
Because all LUNs on a single XIV system share performance and capacity characteristics, use a single IBM SAN Volume Controller storage pool for a single XIV system.
Thin provisioning
XIV thin provisioning pools are not supported by IBM SAN Volume Controller. Instead, you must use a regular pool.
Copy functions for XIV models
You cannot use advanced copy functions for XIV models, such as taking a snapshot and remote mirroring, with disks that are managed by the IBM SAN Volume Controller.
For more information about configuration of XIV behind IBM SAN Volume Controller, see IBM XIV Gen3 with IBM System Storage SAN Volume Controller and Storwize V7000, REDP-5063.
3.2.3 IBM FlashSystem A9000/A9000R considerations
IBM FlashSystem A9000 and IBM FlashSystem A9000R use industry-leading data reduction technology that combines inline, real-time pattern matching and removal, data deduplication, and compression. Compression also uses hardware cards inside each grid controller.
Compression can easily provide a 2:1 data reduction saving rate on its own, which effectively doubles the system storage capacity. Combined with pattern removal and data deduplication services, IBM FlashSystem A9000/A9000R can yield an effective data capacity of five times the original usable physical capacity.
Deduplication can be implemented on the IBM SAN Volume Controller by attaching an IBM FlashSystem A9000/A9000R as external storage instead of the use of IBM Spectrum Virtualize Data Reduction Pool (DRP)-level deduplication.
Next, we describe several considerations when you are attaching an IBM FlashSystem A9000/A9000R system as a back-end controller.
Volume considerations
IBM FlashSystem A9000/A9000R designates resources to data reduction. Because it is always on, it is advised that data reduction be done in the IBM FlashSystem A9000/A9000R only and not in the Spectrum Virtualize cluster. Otherwise, needless extra latency occurs as IBM FlashSystem A9000/A9000R tries to reduce the data.
Estimated data reduction is important because that helps determine volume size. Always attempt to use a conservative data reduction ratio when attaching A9000/A9000R because the storage pool goes offline if the back-end storage runs out of capacity.
To determine the controller volume size:
Calculate the effective capacity by reducing the measured data reduction ratio (for example, if the data reduction estimation tool provides a ratio of 4:1, use 3.5:1 for calculations) and multiplying it by the physical capacity.
Determine the number of connected FC ports by using Table 3-2 and Table 3-3.
The volume size is equal to the effective capacity divided by twice the number of ports (effective capacity / (paths x 2)).
The remaining usable capacity can be added to the storage pool after the system reaches a stable data reduction ratio.
Table 3-2 Host connections for A9000
Number of controllers  Total FC ports available  Ports connected to SAN Volume Controller  Connected ports
3                      12                        6                                         All controllers, ports 1 and 3
Table 3-3 Host connections for A9000R
Grid element  Number of controllers  Total FC ports available  Ports connected to SAN Volume Controller  Connected ports
2             4                      16                        8                                         All controllers, ports 1 and 3
3             6                      24                        12                                        All controllers, ports 1 and 3
4             8                      32                        8                                         Controllers 1 - 4, port 1; controllers 5 - 8, port 3
5             10                     40                        10                                        Controllers 1 - 5, port 1; controllers 6 - 10, port 3
6             12                     48                        12                                        Controllers 1 - 6, port 1; controllers 7 - 12, port 3
It is important not to run out of hard capacity on the back-end storage because the storage pool can go offline. Close monitoring of the FlashSystem A9000/A9000R is important. If you start to run out of space, you can use the migration functions of Spectrum Virtualize to move data to another storage system.
 
Examples: Consider the following examples:
FlashSystem A9000 with 57 TB of usable capacity, or 300 TB of effective capacity, at the standard 5.26:1 data efficiency ratio.
By running the data reduction tool on a good representative sample of the volumes that we are virtualizing, we know that we have a data reduction ratio of 4.2:1 (for extra safety, we use 4:1 for further calculations). Therefore, 4 x 57 results in 228 TB. Divide this by 12 (six paths x 2), and 19 TB are available per volume.
A five grid element FlashSystem A9000R that uses 29 TB Flash enclosures has a total usable capacity of 145 TB.
We use 10 paths and have not run any of the estimation tools on the data. However, we know that the host is not compressing the data. We assume a compression ratio of 2:1, 2 x 145 gives 290, and divided by 20 gives 14.5 TB per volume.
In this case, if we see that we are getting a much better data reduction ratio than we planned for, we can always create more volumes and make them available to Spectrum Virtualize.
The biggest concern with the number of volumes is ensuring adequate queue depth is available. Because the maximum volume size on the FlashSystem A9000/A9000R is 1 PB and we are ensuring two volumes per path, we can create a few larger volumes and still have good queue depth and not have numerous volumes to manage.
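The volume-size arithmetic from these examples can be condensed into a short sketch; the data reduction ratios, path counts, and the two-volumes-per-path rule are the assumptions stated above.
# Sketch of the A9000/A9000R volume-size calculation from the examples above.
def a9000_volume_size_tb(physical_tb: float, reduction_ratio: float, paths: int) -> float:
    """Effective capacity divided by twice the number of paths (two volumes per path)."""
    effective_tb = physical_tb * reduction_ratio
    return effective_tb / (paths * 2)

print(a9000_volume_size_tb(57, 4.0, 6))    # 19.0 TB per volume (first example)
print(a9000_volume_size_tb(145, 2.0, 10))  # 14.5 TB per volume (second example)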
Other considerations
Spectrum Virtualize can detect that the IBM FlashSystem A9000 controller uses deduplication technology. It also shows the Deduplication attribute of the managed disk as Active.
Deduplication status is important because it allows IBM Spectrum Virtualize to enforce the following restrictions:
Storage pools with deduplicated MDisks should contain only MDisks from the same IBM FlashSystem A9000 or IBM FlashSystem A9000R storage controller.
Deduplicated MDisks cannot be mixed in an Easy Tier enabled storage pool.
3.2.4 FlashSystem 5000, 5100, 5200, 7200, 9100, and 9200 considerations
The recommendations that are listed in this section apply to a solution in which an IBM FlashSystem family or IBM Storwize family system is virtualized by an IBM SAN Volume Controller system.
Connectivity considerations
It is expected that NPIV is enabled on both systems: the one that is virtualizing storage, and the one that works as a back-end. Zone the “host” or “virtual” WWPNs of the back-end system to the physical WWPNs of the front-end (virtualizing) system.
For more information about SAN and zoning preferred practices, see Chapter 2, “Storage area network guidelines” on page 21.
System layers
Spectrum Virtualize systems feature the concept of system layers. Two layers exist: storage and replication. Systems that are configured in the storage layer can work as back-end storage. Systems that are configured in the replication layer can virtualize another IBM FlashSystem cluster and use it as a back-end controller.
Systems that are configured in the same layer can be replication partners; systems in different layers cannot.
IBM SAN Volume Controller is configured in the replication layer, and this setting cannot be changed.
IBM FlashSystem family systems are configured in the storage layer by default. The system layer on IBM FlashSystem can be switched, if needed.
Automatic configuration
IBM FlashSystem family systems that are running code version 8.3.x or later can be automatically configured for optimal performance as back-end storage behind IBM SAN Volume Controller.
The automatic configuration wizard must be used on a system that has no volumes, pools, or host objects configured. The wizard configures the internal storage devices, creates volumes, and maps them to the host object that represents the IBM SAN Volume Controller.
Array and disk pool considerations
The back-end IBM FlashSystem family system can have a hybrid configuration that contains FlashCore Modules, SSDs, and spinning drives.
Internal storage that is attached to the back-end system must be joined into RAID arrays. You might need one or more DRAID6 arrays, depending on the number and the type of available drives. For more information about RAID recommendations, see the “Array considerations” section in Implementing the IBM FlashSystem with IBM Spectrum Virtualize Version 8.4.2, SG24-8506.
Consider creating a separate disk pool for each type (tier) of storage and use the Easy Tier function on a front-end system. Front-end FlashSystem family systems cannot monitor Easy Tier activity of the back-end storage.
If Easy Tier is enabled on front-end and back-end systems, they independently rebalance the hot areas according to their own heat map. This process causes a rebalance over a rebalance. Such a situation can eliminate the performance benefits of extent reallocation. For this reason, Easy Tier must be enabled on only one level (preferably the front-end). For more information about recommendations for Easy Tier with external storage, see Chapter 4, “Planning storage pools” on page 99.
For most use cases, standard pools are preferred to data reduction pools (DRPs) on the back-end storage. If data reduction is planned, the front end performs it. Data reduction on both levels is not recommended because it adds processing overhead and does not result in capacity savings.
If Easy Tier is disabled on the back-end as advised, the back-end IBM FlashSystem pool extent size is not a performance concern.
SCSI Unmap considerations
A virtualized IBM FlashSystem treats the IBM SAN Volume Controller system as a host. By default, host SCSI Unmap support is enabled on IBM FlashSystem 9100 and IBM FlashSystem 9200, and disabled on other platforms.
Consider enabling host Unmap support to achieve better capacity management if the system that you are going to virtualize contains FCMs or is flash-only (no spinning drives).
Consider leaving host Unmap support disabled to protect the virtualized system from being overloaded if you are going to virtualize a hybrid system whose storage uses spinning disks.
To switch host Unmap support on or off, use the chssystem CLI command. For more information, see this IBM Documentation web page.
Volume considerations
Volumes in IBM FlashSystem can be created as striped or sequential. The general rule is to create striped volumes. Volumes on the back-end system must be fully allocated.
To determine the number of volumes to create on the back-end IBM FlashSystem to provide to IBM SAN Volume Controller as MDisks, see the general rules that are described in 3.1.2, “Guidelines for creating optimal back-end configuration” on page 76.
When virtualizing a back-end with spinning drives, perform the queue depth calculations. For all-flash solutions, create 32 volumes from the available pool capacity, which can be reduced to 16 or even 8 for small arrays (for example, if you have 16 or fewer flash drives in a back-end pool). For FCM arrays, the number of volumes also is governed by load distribution. A total of 32 volumes out of a pool with an FCM array is recommended.
When choosing the volume size, consider which system (front end or back end) performs compression. If data is compressed and deduplicated on the IBM SAN Volume Controller, FCMs cannot compress it further, which results in a 1:1 compression ratio. Therefore, the back-end volume size must be calculated from the pool physical capacity divided by the number of volumes (16 or more).
 
Example: FlashSystem 9200 with 24 x 19.2 TB modules.
This configuration provides a raw disk capacity of 460 TB. With a 10+P+Q DRAID6 and one distributed spare, the physical array capacity is 365 TB, or 332 TiB.
Because it is not recommended to provision more than 85% of physical flash, we have 282 TiB. Because we do not expect any compression on the FCMs (the back-end is getting data that is already compressed by upper levels), we provision storage to the upper level assuming 1:1 compression, which means that we create 32 volumes of 282 TiB / 32 = 8.8 TiB each.
If the IBM SAN Volume Controller is not compressing data, space savings is achieved with FCM hardware compression. Use compression-estimation tools to determine the expected compression ratio and use a smaller ratio for further calculations (for example, if you expect 4.5:1 compression, use 4.3:1). Determine the volume size by using the calculated effective pool capacity.
 
Example: IBM FlashSystem 7200 with 12 x 9.6 TB modules.
This configuration provides a raw disk capacity of 115 TB. With a 9+P+Q DRAID6 and one distributed spare, the physical capacity is 85 TB, or 78 TiB.
Because it is not recommended to provision more than 85% of physical flash, we have 66 TiB. Compresstimator shows that we can achieve a 3.2:1 compression ratio; decreasing it and assuming 3:1, we have 66 TiB x 3 = 198 TiB of effective capacity.
Create 16 volumes of 198 TiB / 16 = 12.4 TiB each. If the compression ratio is higher than expected, we can create more volumes and provision them to the front end.
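The following sketch reproduces the volume-size arithmetic of the two examples above; the 85% provisioning limit, the compression assumptions, and the volume counts are taken from the text.
# Sketch of the FCM back-end volume sizing from the examples above.
def backend_volume_size_tib(physical_tib: float, compression_ratio: float,
                            volume_count: int, provision_limit: float = 0.85) -> float:
    """Size of each fully allocated back-end volume that is presented as an MDisk."""
    usable_tib = physical_tib * provision_limit        # provision no more than 85% of the flash
    effective_tib = usable_tib * compression_ratio     # 1:1 if the front end already compresses
    return round(effective_tib / volume_count, 1)

# FlashSystem 9200 example: the front end compresses, so assume 1:1 on the FCMs
print(backend_volume_size_tib(332, 1.0, 32))   # ~8.8 TiB per volume
# FlashSystem 7200 example: FCM compression assumed at 3:1
print(backend_volume_size_tib(78, 3.0, 16))    # ~12.4 TiB per volume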
3.2.5 IBM FlashSystem 900 considerations
The main advantage of integrating FlashSystem 900 with IBM Spectrum Virtualize is to combine the extreme performance of IBM FlashSystem 900 with the Spectrum Virtualize enterprise-class solution, such as tiering, Volume Mirroring, deduplication, and copy services.
When you configure the IBM FlashSystem 900 as a backend for Spectrum Virtualize family systems, you must remember the considerations that are described next.
Defining storage
IBM FlashSystem 900 supports up to 12 IBM MicroLatency® modules. IBM MicroLatency modules are installed in the IBM FlashSystem 900 based on the following configuration guidelines:
A minimum of four MicroLatency modules must be installed in the system. RAID 5 is the only supported configuration of the IBM FlashSystem 900.
The system supports configurations of 4, 6, 8, 10, and 12 MicroLatency modules in RAID 5.
All MicroLatency modules that are installed in the enclosure must be identical in capacity and type.
For optimal airflow and cooling, if fewer than 12 MicroLatency modules are installed in the enclosure, populate the module bays beginning in the center of the slots and adding on either side until all 12 slots are populated.
The array configuration is performed during system setup. The system automatically creates MDisk/arrays and defines the RAID settings based on the number of flash modules in the system. The default supported RAID level is RAID 5.
Volume considerations
To fully use all Spectrum Virtualize system resources, create 32 volumes (or 16 volumes if FlashSystem 900 is not fully populated). This way, all CPU cores, nodes, and FC ports of the virtualizer are fully used.
However, one important factor must be considered when volumes are created from a pure FlashSystem 900 MDisk storage pool. FlashSystem 900 can process I/Os much faster than traditional storage. In fact, it is sometimes even faster than cache operations because with cache, all I/Os to the volume must be mirrored to another node in the I/O group.
This operation can take as much as 1 millisecond, while I/Os that are issued directly (that is, without cache) to the FlashSystem 900 can take 100 - 200 microseconds. Therefore, it might be recommended to disable the Spectrum Virtualize cache to optimize for maximum IOPS in some rare use cases.
You must keep the cache enabled in the following situations:
If volumes from FlashSystem 900 pool are:
 – Compressed
 – In a Metro/Global Mirror relationship
 – In a FlashCopy relationship (source or target)
If the pool that has MDisks from FlashSystem 900 also contains MDisks from other back-end controllers.
 
For more information, see Implementing IBM FlashSystem 900, SG24-8271.
3.2.6 Path considerations for third-party storage with EMC VMAX and Hitachi Data Systems
Although many third-party storage options are available and supported, this section highlights the multipathing considerations for EMC VMAX and Hitachi Data Systems (HDS).
When presented to the IBM SAN Volume Controller, most storage controllers are recognized as a single WWNN per controller. However, for some EMC VMAX and HDS storage controller types, the system recognizes each port as a different WWNN. For this reason, each storage port, when zoned to an IBM SAN Volume Controller, appears as a different external storage controller.
IBM Spectrum Virtualize supports a maximum of 16 WWNNs per storage system. Therefore, it is preferred to connect up to 16 storage ports to IBM SAN Volume Controller.
For more information about determining the number of logical volumes or LUNs to be configured on third-party storage, see 3.1.2, “Guidelines for creating optimal back-end configuration” on page 76.
3.3 Quorum disks
 
Note: This section does not cover IP-attached quorum. For information about these quorums, see Chapter 7, “Meeting business continuity requirements” on page 343.
A system uses a quorum disk for the following purposes:
To break a tie when a SAN fault occurs and only half of the nodes that were members of the system are present.
To hold a copy of important system configuration data.
After internal drives are prepared to be added to an array, or external MDisks become managed, a small portion of their capacity is reserved for quorum data. Its size is less than 0.5 GiB for a drive and not less than one pool extent for an MDisk.
Three devices from all available internal drives and managed MDisks are selected for the quorum disk role. They store system metadata, which is used for cluster recovery after a disaster. Although only three devices are designated as quorums, capacity for quorum data is reserved on each candidate device because the designation might change (for example, if a quorum disk fails).
Only one of those disks is selected as the active quorum disk (it is used as a tie-breaker). If, as a result of a failure, the cluster is split in half and both parts lose sight of each other (for example, the inter-site link fails in a HyperSwap cluster with two I/O groups), they appeal to the tie-breaker, the active quorum device. The half of the cluster that can reach and reserve the quorum disk after the split occurs locks the disk and continues to operate. The other half stops its operation. This design prevents both sides from becoming inconsistent with each other.
The storage device must match the following criteria to be considered a quorum candidate:
The internal drive or module must be a member of an array or in the “Candidate” state; drives in the “Unused” state cannot be quorums. The MDisk must be in the “Managed” state; “Unmanaged” or “Image” MDisks cannot be quorums.
External MDisks cannot be provisioned over iSCSI, only FC.
An MDisk must be presented by a disk subsystem, LUNs from which are supported to be quorum disks.
The system uses the following rules when selecting quorum devices:
Fully connected candidates are preferred over partially connected candidates.
In a multiple control enclosure environment, MDisks are preferred over drives.
In a single control enclosure environment, drives are preferred over MDisks.
If only one control enclosure and no external storage exist in the cluster, drives are considered first.
Drives from a different control enclosure are preferred over a second drive from the same enclosure.
If the IBM SAN Volume Controller cluster contains more than one I/O group, at least one candidate from each group is selected.
To become the active quorum device (tie-break device), a device must be visible to all nodes in the cluster.
In practice, these rules mean that in a standard topology cluster, when you attach at least one back-end storage controller that supports quorum and import MDisks from it as managed MDisks, quorum disks, including the active quorum disk, are assigned automatically. If all your MDisks are image-mode or unmanaged, your cluster operates without a quorum device, unless you deployed an IP-based quorum.
For more information about quorum device recommendations in a stretched cluster environment, see Chapter 7, “Meeting business continuity requirements” on page 343.
To list IBM SAN Volume Controller quorum devices, run the lsquorum command. To move quorum assignment, run the chquorum command.
 