Maintaining storage infrastructure
As an IT environment grows and is renewed, so must the storage infrastructure. Among the many benefits that the IBM SAN Volume Controller family software (IBM Spectrum Virtualize) provides is that it greatly simplifies the storage management tasks that system administrators must perform.
This chapter highlights guidance for the maintenance activities of storage administration by using the IBM SAN Volume Controller family software that is installed on the product. This guidance can help you to maintain your storage infrastructure with the levels of availability, reliability, and resiliency demanded by today’s applications, and to keep up with storage growth needs.
This chapter concentrates on the most important topics to consider in IBM SAN Volume Controller administration so that you can use it as a checklist. It also provides best practice tips and guidance. To simplify the SAN storage administration tasks that you perform often, such as adding users, allocating and removing storage, or adding and removing hosts from the SAN, create standard, step-by-step procedures for them.
The discussion in this chapter focuses on the IBM SAN Volume Controller SV1 for the sake of simplicity by using figures and command outputs from this model. The recommendations and practices that are discussed in this chapter are applicable to the following IBM SAN Volume Controller models:
DH8
SV2
SA2
 
Note: The practices that are described here are effective in many installations of different models of the IBM SAN Volume Controller family. These installations were performed in various business sectors for various international organizations. They all had one common need: to manage their storage environment easily, effectively, and reliably.
This chapter includes the following topics:
 
10.1 User interfaces
The IBM SAN Volume Controller family provides several user interfaces that you can use to maintain your system. The interfaces provide different sets of facilities to help resolve situations that you might encounter. The interfaces for servicing your system connect through the 1 Gbps Ethernet ports that are accessible from port 1 of each node.
Consider the following points:
Use the management graphical user interface (GUI) to monitor and maintain the configuration of storage that is associated with your clustered systems.
Use the service assistant tool GUI to complete service procedures.
Use the command-line interface (CLI) to manage your system.
The best practice recommendation is to use the interface that is most suitable to the task you are attempting to complete. For example, a manual software update is best performed by using the service assistant GUI or the CLI. Running fix procedures to resolve problems or configuring expansion enclosures can only be performed by using the management GUI. Creating many volumes with customized names is best performed by way of the CLI by using a script. To ensure efficient storage administration, become familiar with all available user interfaces.
10.1.1 Management GUI
The management GUI is the primary tool that is used to service your system. Regularly monitor the status of the system by using the management GUI. If you suspect a problem, use the management GUI first to diagnose and resolve the problem. Use the views that are available in the management GUI to verify the status of the system, hardware devices, physical storage, and available volumes.
To access the management GUI, start a supported web browser and point it to https://SVC_ip_address, where SVC_ip_address is the management IP address that was set when the clustered system was created.
For more information about the task menus and functions of the Management GUI, see Chapter 4 of Implementing the IBM SAN Volume Controller with IBM Spectrum Virtualize Version 8.4.2, SG24-8507.
10.1.2 Service Assistant Tool GUI
The service assistant interface is a browser-based GUI that can be used to service individual nodes.
 
Important: If used incorrectly, the service actions that are available through the service assistant can cause loss of access to data, or even data loss.
You connect to the service assistant on one node through the service IP address. If a working communications path exists between the nodes, you can view status information and perform service tasks on the other node by making the other node the current node. You do not have to reconnect to the other node. On the system, you can also access the service assistant interface by using the technician port.
The service assistant provides facilities only to help you service nodes. Always service the expansion enclosures by using the management GUI.
You can also complete the following actions by using the service assistant:
Collect logs to create and download a package of files to send to support personnel.
View detailed status and error summaries.
Remove the data for the system from a node.
Recover a system if it fails.
Install a code package from the support site or rescue the code from another node.
Update code on nodes manually.
Configure a control enclosure chassis after replacement.
Change the service IP address that is assigned to Ethernet port 1 for the current node.
Install a temporary SSH key if a key is not installed and CLI access is required.
Restart the services used by the system.
To access the Service Assistant Tool GUI, start a supported web browser and point it to https://SVC_ip_address/service, where SVC_ip_address is the service IP address for the node or the management IP address for the system on which you want to work.
10.1.3 Command-line interface
The system CLI is intended for use by advanced users who are confident using a CLI. Up to 32 simultaneous interactive SSH sessions to the management IP address are supported.
Nearly all of the functions that are offered by the CLI are also available through the management GUI. However, the CLI does not provide the fix procedures that are available in the management GUI. Conversely, you can use the CLI when you require a configuration setting that is unavailable in the management GUI.
Entering help in a CLI displays a list of all available commands. You can access other UNIX commands in the restricted shell, such as grep and more, which are useful in formatting the output of the CLI commands. Reverse-i-search (Ctrl+R) is also available. Table 10-1 lists the available commands.
Table 10-1 UNIX commands available in the CLI
UNIX command    Description
grep            Filters output by keywords
more            Moves through output one page at a time
sed             Filters output
sort            Sorts output
cut             Removes individual columns from output
head            Displays only the first lines of output
less            Moves through output one page at a time
tail            Displays only the last lines of output
uniq            Hides duplicate lines in the output
tr              Translates characters
wc              Counts lines, words, and characters in the output
history         Displays the command history
scp             Copies files by using the secure copy protocol
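For example, the following commands show how these filters can be combined with system commands to narrow the output. This is a minimal sketch; the object names and filter choices are illustrative:
# count the configured volumes (suppress the header line first)
lsvdisk -nohdr | wc -l
# list only the volumes that report an offline status
lsvdisk -delim : | grep -i offline
# page through the event log one screen at a time
lseventlog | more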
For more information about command references and syntax, see this IBM Documentation web page.
Service command-line interface
You also can run service CLI commands on a specific node. Log in to the service IP address of the node that requires servicing.
For more information about the use of the service command-line, see this IBM Documentation web page.
USB command interface
When a USB flash drive is inserted into one of the USB ports on a node, the software searches for a control file on the USB flash drive and runs the command that is specified in the file. The use of the USB flash drive is required in the following situations:
When you cannot connect to a node by using the service assistant and you want to see the status of the node.
When you do not know, or cannot use, the service IP address for the node and must set the address.
When you have forgotten the superuser password and must reset the password.
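For example, a control file named satask.txt that contains the following single line sets the service IP address of the node in which the USB flash drive is inserted. This is a hedged sketch; the addresses are illustrative:
satask chserviceip -serviceip 9.10.11.20 -gw 9.10.11.1 -mask 255.255.255.0
The node writes the result of the command back to the USB flash drive so that you can review it on your workstation.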
For more information about the use of the USB port, see this IBM Documentation web page.
Technician port
The technician port is an Ethernet port on the back panel of the IBM SAN Volume Controller system. You can use this port to perform most of the system configuration operations, including the following tasks:
Define a management IP address
Initialize a new system
Service the system
For more information about the use of the technician port, see this IBM Documentation web page.
10.2 Users and groups
Almost all organizations use IT security policies that enforce the use of password-protected user IDs when their IT assets and tools are used. However, some storage administrators still use generic shared IDs (such as superuser, admin, or root) in their management consoles to perform their tasks. They might even use a factory-set default password. Their justification might be a lack of time, forgetfulness, or the fact that their SAN equipment does not support the organization’s authentication tool.
SAN storage equipment management consoles often do not provide direct access to stored data, but a shared storage controller can be easily shut down (accidentally or deliberately) and any number of critical applications along with it.
Moreover, having individual user IDs set for your storage administrators allows changes to be better audited if logs must be analyzed.
IBM SAN Volume Controller supports the following authentication methods:
Local authentication by using a password
Local authentication by using SSH keys
Remote authentication by using Lightweight Directory Access Protocol (LDAP); that is, Microsoft Active Directory or IBM Security Directory Server
Local authentication is suitable for small, single-enclosure environments, whereas larger environments with multiple clusters and enclosures benefit from the ease of maintenance that single sign-on (SSO) through remote authentication, such as LDAP, provides.
By default, the following user groups are defined:
Monitor
Users with this role can view objects but cannot manage the system or its resources. Support personnel can be assigned this role to monitor the system and to determine the cause of problems. This role is assigned to the IBM Storage Insights user.
For more information about IBM Storage Insights, see Chapter 9, “Implementing a storage monitoring system” on page 373.
Copy Operator
Users with this role have monitor role privileges and can create, change, and manage all Copy Services functions.
Service
These users can set the time and date on the system, delete dump files, add and delete nodes, apply service, and shut down the system. They also can perform the same tasks as users in the monitor role.
Administrator
Users with this role can access all functions on the system, except those that deal with managing users, user groups, and authentication.
Security Administrator
Users with this role can access all functions on the system, including managing users, user groups, and user authentication.
Restricted Administrator
Users with this role can complete some tasks, but are restricted from deleting specific objects. Support personnel can be assigned this role to solve problems.
3-Site Administrator
Users with this role can configure, manage, and monitor 3-site replication configurations by using specific command operations that are available only on the 3-Site Orchestrator.
vStorage Application Programming Interface (API) for Storage Awareness (VASA) Provider
Users with this role can manage virtual volumes (vVols) that are used by VMware vSphere and managed by using Spectrum Control software.
FlashCopy Administrator
These users use the FlashCopy commands to work with FlashCopy system methods and functions. For more information, see this IBM Documentation web page.
In addition to standard groups, you can configure ownership groups to manage access to resources on the system. An ownership group defines a subset of users and objects within the system. You can create ownership groups to further restrict access to specific resources that are defined in the ownership group.
Users within an ownership group can view or change only resources within the ownership group in which they belong. For example, you can create an ownership group for database administrators to provide monitor-role access to a single pool that is used by their databases. Their views and privileges in the management GUI are automatically restricted, as shown in Figure 10-1.
Figure 10-1 IBM SAN Volume Controller Dashboard for Hardware, Logical and Connectivity view
Regardless of the authentication method you choose, complete the following tasks:
Create individual user IDs for your Storage Administration staff. Choose user IDs that easily identify the user and meet your organization’s security standards.
Include each individual user ID into the UserGroup with only enough privileges to perform the required tasks. For example, your first level support staff likely require only Monitor group access to perform their daily tasks whereas second level support might require Restricted Administrator access. Consider the use of Ownership groups to further restrict privileges.
If required, create generic user IDs for your batch tasks, such as Copy Services or Monitoring. Include them in a Copy Operator or Monitor UserGroup. Never use generic user IDs with the SecurityAdmin privilege in batch tasks.
Create unique SSH public and private keys for each administrator requiring local access.
Store your superuser password in a safe location in accordance with your organization’s security guidelines and use it only in emergencies.
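As a minimal sketch of the first two guidelines (the group name, user names, password, and key file path are illustrative assumptions), individual IDs can be created in a suitably restricted group from the CLI:
# create a user group that carries only the Monitor role
mkusergrp -name FirstLevelOps -role Monitor
# create an individual, password-authenticated user in that group
mkuser -name jsmith -usergrp FirstLevelOps -password Xmpl3Passw0rd
# alternatively, register an SSH public key for local key-based access
mkuser -name rjones -usergrp FirstLevelOps -keyfile /tmp/rjones.pub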
10.3 Volumes
A volume is a logical disk that is presented to a host by an I/O group (pair of nodes). Within that group, a preferred node serves I/O requests to the volume.
When you allocate and deallocate volumes to hosts, consider the following guidelines:
Before you allocate new volumes to a server with redundant disk paths, verify that these paths are working well, and that the multipath software is free of errors. Fix any disk path errors that you find in your server before you proceed.
When you plan for future growth of space efficient volumes (VDisks), determine whether your server’s operating system supports the specific volume to be extended online. For example, AIX V6.1 TL2 and lower do not support online expansion of rootvg LUNs. Test the procedure in a non-production server first.
Always cross-check the host LUN ID information with the vdisk_UID of the IBM SAN Volume Controller. Do not assume that the operating system recognizes, creates, and numbers the disk devices in the same sequence or with the same numbers as you created them in the IBM SAN Volume Controller.
Ensure that you delete any volume or LUN definition in the server before you unmap it in the IBM SAN Volume Controller. For example, in AIX, remove the HDisk from the volume group (reducevg) and delete the associated HDisk device (rmdev).
Consider enabling volume protection by running chsystem -vdiskprotectionenabled yes -vdiskprotectiontime <value_in_minutes>, as shown in the sketch after this list. Volume protection ensures that certain CLI actions (most of those that explicitly or implicitly remove host-volume mappings or delete volumes) are policed to prevent the removal of mappings to, or the deletion of, volumes that are considered active; that is, volumes to which the system detected I/O activity from any host within the specified period (15 - 1440 minutes).
 
Note: Volume protection cannot be overridden by using the -force flag in the affected CLI commands. Volume protection must be disabled to continue an activity that is currently blocked.
Ensure that you specifically remove a volume from any volume-to-host mappings and any copy services relationship to which it belongs before you delete it.
 
Attention: You must avoid the use of the -force parameter in rmvdisk.
If you issue the svctask rmvdisk command for a volume that still has pending mappings, the IBM SAN Volume Controller prompts you to confirm the action. This prompt is a hint that something might be incorrect.
When you are deallocating volumes, plan for an interval between unmapping them from their hosts (rmvdiskhostmap) and deleting them (rmvdisk). The IBM internal Storage Technical Quality Review Process (STQRP) asks for a minimum 48-hour period, which gives you at least a one business day interval so that you can quickly back out if you later realize that you still need some data on that volume.
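The following minimal sketch combines the volume protection and deallocation guidelines. The volume and host names and the protection window are illustrative assumptions:
# police mappings and deletions of volumes that received host I/O in the last 60 minutes
chsystem -vdiskprotectionenabled yes -vdiskprotectiontime 60
# deallocation step 1: remove the host mapping, then wait the agreed interval (for example, 48 hours)
rmvdiskhostmap -host AIXPROD01 DB2_DATA_01
# deallocation step 2: after the waiting period, delete the volume (avoid -force)
rmvdisk DB2_DATA_01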
For more information about volumes, see Chapter 5, “Volumes types” on page 185.
10.4 Hosts
A host is a computer that is connected to the SAN switch through Fibre Channel (FC), iSCSI, or another supported protocol.
When you add and remove hosts in the IBM SAN Volume Controller, consider the following guidelines:
Before you map new servers to the IBM SAN Volume Controller, verify that they are all error free. Fix any errors that you find in your server and IBM SAN Volume Controller before you proceed. In the IBM SAN Volume Controller, pay special attention to anything that is inactive in the lsfabric command.
Plan for an interval between updating the zoning in each of your redundant SAN fabrics, such as at least 30 minutes. This interval allows for failover to occur and stabilize, and for you to be notified if unexpected errors occur.
After you perform the SAN zoning from one server’s host bus adapter (HBA) to the IBM SAN Volume Controller, list its WWPN by using the lshbaportcandidate command. Use the lsfabric command to verify that it was detected by the IBM SAN Volume Controller nodes and ports that you expected.
When you create the host definition in the IBM SAN Volume Controller (mkhost), try to avoid the -force parameter, as shown in the sketch after this list. If you do not see the host’s WWPNs, it might be necessary to scan the fabric from the host; for example, by using the cfgmgr command in AIX.
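The following minimal sketch shows a host being defined and a volume being mapped after its WWPNs are confirmed. The host name, WWPNs, volume name, and SCSI ID are illustrative assumptions:
# confirm that the new HBA WWPNs are visible to the system
lsfcportcandidate
# create the host definition without -force, restricted to I/O group 0
mkhost -name NYBIXTDB03 -fcwwpn 10000090FA021A14:10000090FA021A15 -iogrp 0
# verify that the host ports log in as active, then map a volume
lshost NYBIXTDB03
mkvdiskhostmap -host NYBIXTDB03 -scsi 0 DB2_DATA_01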
For more information about hosts, see Chapter 8, “Configuring host systems” on page 353.
10.5 Software updates
Because the IBM SAN Volume Controller might be at the core of your disk and SAN storage environment, its update requires planning, preparation, and verification. However, with the appropriate precautions, an update can be conducted easily and transparently to your servers and applications. This section highlights applicable guidelines for the IBM SAN Volume Controller update.
Most of the following sections explain how to prepare for the software update. These sections also present version-independent guidelines on how to update the IBM SAN Volume Controller family systems and flash drives.
Before you update the system, ensure that the following requirements are met:
The latest update test utility was downloaded from IBM Fix Central to your management workstation. For more information, see this IBM Fix Central web page.
The latest system update package was downloaded from IBM Fix Central to your management workstation.
All nodes are online.
All errors in the system event log are addressed and marked as fixed.
No volumes, MDisks, or storage systems exist with Degraded or Offline status.
The service assistant IP is configured on every node in the system.
The system superuser password is known.
The current system configuration is backed up and saved (preferably off-site). Use the steps that are described in Example 10-9 on page 489.
Physical access is available to the hardware.
Although the following actions are not required, they are suggestions to reduce unnecessary load on the system during the update:
Stop all Metro Mirror, Global Mirror, or HyperSwap operations.
Avoid running any FlashCopy operations.
Avoid migrating or formatting volumes.
Stop collecting IBM Spectrum Control performance data for the system.
Stop any automated jobs that access the system.
Ensure that no other processes are running on the system.
If you want to update without host I/O, shut down all hosts.
 
Note: For customers who purchased the IBM SAN Volume Controller with a 3 year-warranty (2147 Models SV1, SV2, and SA2), Enterprise Class Support (ECS) is included, which entitles the customer to two code upgrades per year that are performed by IBM (total of six across the 3-year warranty). These upgrades are done by the IBM dedicated Remote Code Load (RCL) team or, where remote support is not allowed or enabled, by an on-site SSR.
For more information about ECS, see this IBM Documentation web page.
10.5.1 Determining the target software level
The first step is to determine your current and target IBM SAN Volume Controller software level.
By using an IBM SAN Volume Controller as the example, log in to the web-based GUI and find the current version. From the drop-down menu on the right side of the top menu, click the question mark symbol (?) and select About IBM SAN Volume Controller to display the current version, or select Settings → System → Update System to display the current and target levels.
Figure 10-2 shows the Update System output panel that displays the code levels. In this example, the current software level is 8.4.0.0.
Figure 10-2 Update System output panel
Alternatively, if you use the CLI, run the svcinfo lssystem command. Example 10-1 shows the output of the lssystem CLI command and where the code level output can be found.
Example 10-1 lssystem command
IBM_2145:IBM Redbook SVC:superuser>lssystem|grep code
code_level 8.4.2.0 (build 152.19.2009101641000)
IBM SAN Volume Controller software levels are specified by four digits in the V.R.M.F format.
In our example (V.R.M.F = 8.4.2.0):
 – V: Major version number
 – R: Release level
 – M: Modification level
 – F: Fix level
Use the latest IBM SAN Volume Controller release, unless you have a specific reason not to update, such as the following examples:
The specific version of an application or other component of your SAN Storage environment has a known problem or limitation.
The latest IBM SAN Volume Controller software release is not yet cross-certified as compatible with another key component of your SAN storage environment.
Your organization has mitigating internal policies, such as the use of the “latest release minus 1” or requiring “seasoning” in the field before implementation in a production environment.
For more information, see this IBM Support web page.
10.5.2 Obtaining software packages
To obtain a new release of software for a system update, see this IBM Fix Central web page. Complete the following steps:
1. From the Product selector list, enter IBM SAN Volume Controller (or whatever model is suitable in your environment).
2. From the Installed Version list, select the current software version level that was determined as described in 10.5.1, “Determining the target software level” on page 442.
3. Select Continue.
4. In the Product Software section, select the three items that are shown in Figure 10-3.
Figure 10-3 Fix Central software packages
5. Select Continue.
6. Select your preferred download options and then, click Continue.
7. Enter your machine type and serial number.
8. Select Continue.
9. Read the terms and conditions and then, select I Agree.
10. Select Download Now and save the three files onto your management computer.
10.5.3 Hardware considerations
Before you start the update process, always check whether your IBM SAN Volume Controller hardware and target code level are compatible.
If part or all of your current hardware is not supported at the target code level that you want to update to, replace the unsupported hardware with newer models before you update to the target code level.
Conversely, if you plan to add or replace hardware with new models to an existing cluster, you might need to update your IBM SAN Volume Controller code first.
10.5.4 Update sequence
Check the compatibility of your target IBM SAN Volume Controller code level with all components of your SAN storage environment (SAN switches, storage controllers, server HBAs, and so on) and its attached servers (operating systems and, where applicable, applications).
Applications often certify only the operating system that they run under and leave to the operating system provider the task of certifying its compatibility with attached components (such as SAN storage). However, various applications might use special hardware features or raw devices and certify the attached SAN storage. If you have this situation, consult the compatibility matrix for your application to certify that your IBM SAN Volume Controller target code level is compatible.
The IBM SAN Volume Controller Supported Hardware List provides complete information for using your IBM SAN Volume Controller SAN storage environment components with the current and target code level. For more information about the supported hardware, device drivers, firmware, and recommended software levels for different products and code levels, see this IBM Support web page.
By cross-checking that the IBM SAN Volume Controller version is compatible with the versions of your SAN environment components, you can determine which one to update first. By checking each component’s update path, you can determine whether that component requires a multi-step update.
If you are not making major version or multi-step updates in any components, the following update order is recommended to avoid potential problems:
1. SAN switches or directors
2. Storage controllers
3. Server HBA microcode and multipath software
4. IBM SAN Volume Controller
5. IBM SAN Volume Controller internal drives
6. IBM SAN Volume Controller SAS attached SSD drives
 
Attention: Do not update two components of your IBM SAN Volume Controller SAN storage environment simultaneously, such as an IBM SAN Volume Controller model SV2 and one storage controller. This caution is true even if you intend to perform this update with your system offline. An update of this type can lead to unpredictable results, and an unexpected problem is much more difficult to debug.
10.5.5 SAN fabrics preparation
If you use symmetrical, redundant, independent SAN fabrics, preparing these fabrics for an IBM SAN Volume Controller update can be safer than preparing hosts or storage controllers. This statement is true if you follow the guideline of a minimum 30-minute interval between the modifications that you perform in one fabric and those that you perform in the next.
Even if an unexpected error brings down your entire SAN fabric, the IBM SAN Volume Controller environment continues working through the other fabric and your applications remain unaffected.
Because you are updating your IBM SAN Volume Controller, also update your SAN switches code to the latest supported level. Start with your principal core switch or director, continue by updating the other core switches, and update the edge switches last. Update one entire fabric (all switches) before you move to the next one so that any problem you might encounter affects only the first fabric. Begin your other fabric update only after you verify that the first fabric update has no problems.
If you are not running symmetrical, redundant, independent SAN fabrics, fix this problem as a high priority because it represents a single point of failure.
10.5.6 Storage controllers preparation
As critical as with the attached hosts, the attached storage controllers must correctly handle the failover of MDisk paths. Therefore, they must be running supported microcode versions and their own SAN paths to IBM SAN Volume Controller must be free of errors.
10.5.7 Hosts preparation
If suitable precautions are taken, the IBM SAN Volume Controller update is not apparent to the attached servers and their applications. The automated update procedure updates one IBM SAN Volume Controller node at a time, while the other node in the I/O group covers for its designated volumes.
However, to ensure that this process works, the failover capability of your multipath software must be working correctly. This dependency can be mitigated by enabling NPIV if your current code level supports this function.
For more information about NPIV, see Chapter 8, “Configuring host systems” on page 353.
Before you start IBM SAN Volume Controller update preparation, check the following items for every server that is attached to IBM SAN Volume Controller that you update:
Operating system type, version, and maintenance or fix level
Make, model, and microcode version of the HBAs
Multipath software type, version, and error log
For more information about troubleshooting, see this IBM Documentation web page.
Fix every problem or “suspect” that you find with the disk path failover capability. Because a typical IBM SAN Volume Controller environment can have hundreds of servers attached to it, a spreadsheet might help you with the Attached Hosts Preparation tracking process. If you have some host virtualization, such as VMware ESX, AIX LPARs, IBM VIOS, or Solaris containers in your environment, verify the redundancy and failover capability in these virtualization layers.
10.5.8 Copy services considerations
When you update an IBM SAN Volume Controller family product that participates in an inter-cluster Copy Services relationship, do not update both clusters in the relationship simultaneously. This situation is not verified or monitored by the automatic update process and might lead to a loss of synchronization and unavailability.
You must successfully finish the update in one cluster before you start the next one. Try to update the next cluster as soon as possible to the same code level as the first one. Avoid running them with different code levels for extended periods.
10.5.9 Running the Upgrade Test Utility
The latest IBM SAN Volume Controller Upgrade Test Utility must be installed and run before you update the IBM SAN Volume Controller software. For more information about the Upgrade Test Utility, see this IBM Support web page.
This tool verifies the health of your IBM SAN Volume Controller storage array for the update process. It also checks for unfixed errors, degraded MDisks, inactive fabric connections, configuration conflicts, hardware compatibility, drive firmware, and many other issues that might otherwise require cross-checking a series of command outputs.
 
Note: The Upgrade Test Utility does not log in to storage controllers or SAN switches. Instead, it reports the status of the connections of the IBM SAN Volume Controller to these devices. It is the users’ responsibility to check these components for internal errors.
You can use the management GUI or the CLI to install and run the Upgrade Test Utility.
Using the management GUI
To test the software on the system, complete the following steps:
1. In the management GUI, select Settings → System → Update System.
2. Click Test Only.
3. Select the test utility that you downloaded from the IBM Fix Central support site.
4. Upload the Test utility file and enter the code level to which you are planning to update. Figure 10-4 shows the IBM SAN Volume Controller management GUI window that is used to install and run the Upgrade Test Utility.
Figure 10-4 IBM SAN Volume Controller Upgrade Test Utility using the GUI
5. Click Test.
The test utility verifies that the system is ready to be updated. After the Update Test Utility completes, the results are shown. The results indicate that no warnings or problems were found, or direct you to more information about any known issues that were discovered on the system.
Figure 10-5 shows a successful completion of the update test utility.
Figure 10-5 IBM SAN Volume Controller Upgrade Test Utility completion panel
6. Click Download Results to save the results to a file.
7. Click Close.
Using the command-line
To test the software on the system, complete the following steps:
1. Using OpenSSH scp or PuTTY pscp, copy the software update file and the Software Update Test Utility package to the /home/admin/upgrade directory by using the management IP address of the IBM SAN Volume Controller. Some documentation and online help might refer to the /home/admin/update directory, which points to the same location on the system.
An example for the IBM SAN Volume Controller is shown in Example 10-2.
Example 10-2 Copying the upgrade test utility to IBM SAN Volume Controller
C:>pscp -v -P 22 IBM2145_INSTALL_upgradetest_33.1 superuser@9.10.11.12:/home/admin/upgrade
Looking up host "9.10.11.12" for SSH connection
Connecting to 9.10.11.12 port 22
We claim version: SSH-2.0-PuTTY_Release_0.74
Remote version: SSH-2.0-OpenSSH_8.0
Using SSH protocol version 2
No GSSAPI security context available
Doing ECDH key exchange with curve Curve25519 and hash SHA-256 (unaccelerated)
Server also has ssh-rsa host key, but we don't know it
Host key fingerprint is:
ecdsa-sha2-nistp521 521 d3:00:2b:a0:24:cd:8c:df:3d:d5:d5:07:e5:e5:47:b9
Initialised AES-256 SDCTR (AES-NI accelerated) outbound encryption
Initialised HMAC-SHA-256 (unaccelerated) outbound MAC algorithm
Initialised AES-256 SDCTR (AES-NI accelerated) inbound encryption
Initialised HMAC-SHA-256 (unaccelerated) inbound MAC algorithm
Using username "superuser".
Attempting keyboard-interactive authentication
Keyboard-interactive authentication prompts from server:
| Password:
End of keyboard-interactive prompts from server
Access granted
Opening main session channel
Opened main channel
Primary command failed; attempting fallback
Started a shell/command
Using SCP1
Connected to 9.10.11.12
Sending file IBM2145_INSTALL_upgradetest_33.1, size=335904
Sink: C0644 335904 IBM2145_INSTALL_upgradetest_33.1
IBM2145_INSTALL_upgradete | 328 kB | 328.0 kB/s | ETA: 00:00:00 | 100%
Session sent command exit status 0
Main session channel closed
All channels closed
C:>
2. Ensure that the update file was successfully copied as indicated by the exit status 0 return code. You also can run the lsdumps -prefix /home/admin/upgrade command.
Example 10-3 shows how to install and run Upgrade Test Utility in the CLI. In this case, the Upgrade Test Utility found no errors and completed successfully.
 
Example 10-3 Upgrade test using the CLI
IBM_2145:IBM Redbook SVC:superuser>svctask applysoftware -file IBM2145_INSTALL_upgradetest_33.1
 
CMMVC9001I The package installed successfully.
 
IBM_2145:IBM Redbook SVC:superuser>svcupgradetest -v 8.4.2.0
 
svcupgradetest version 33.1
 
Please wait, the test may take several minutes to complete.
 
Results of running svcupgradetest:
==================================
 
The tool has found 0 errors and 0 warnings.
The tool has not found any problems with the cluster.
 
Note: The return code for the applysoftware command always is 1, whether the installation succeeded or failed. However, the message that is returned when the command completes reports the correct installation result.
Review the output to check whether any problems were found by the utility. The output from the command shows that no problems were found, or directs you to more information about any known issues that were discovered on the system.
10.5.10 Updating software
The SAN Volume Controller software is updated by using one of the following methods:
During a standard update procedure in the management GUI, the system updates each of the nodes systematically. This method is recommended for updating the software that is on nodes.
The command-line interface gives you more control over the automatic upgrade process. You can resolve multipathing issues when nodes go offline for updates. You also can override the default 30-minute mid-point delay, pause an update, and resume a stalled update.
To provide even more flexibility in the update process, you also can manually update each node individually by using the Service Assistant Tool GUI.
When upgrading the software manually, you remove a node from the system, update the software on the node, and return the node to the system. You repeat this process for the remaining nodes until the last node is removed from the system. Then, the remaining nodes switch to running the new software.
When the last node is returned to the system, it updates and runs the new level of software. This action cannot be performed on an active node.
To update software manually, the nodes must be candidate nodes (a candidate node is a node that is not in use by the system and cannot process I/O) or in a service state. During this procedure, every node must be updated to the same software level and the node becomes unavailable during the update.
Whichever method (automatic or manual, GUI or CLI) that you choose to perform the update, ensure that you adhere to the following guidelines for your IBM SAN Volume Controller software update:
Schedule the IBM SAN Volume Controller software update for a low I/O activity time. The update process puts one node at a time offline. It also disables the write cache in the I/O group that node belongs to until both nodes are updated. Therefore, with lower I/O, you are less likely to notice performance degradation during the update.
Never power off, restart, or reset an IBM SAN Volume Controller node during software update unless you are instructed to do so by IBM Support. Typically, if the update process encounters a problem and fails, it backs out. The update process can take one hour per node with another optional, 30-minute mid-point delay.
If you are planning for a major IBM SAN Volume Controller version update, update your current version to its latest fix level before you run the major update.
Check whether you are running a web browser type and version that is supported by the IBM SAN Volume Controller target software level on every computer that you intend to use to manage your IBM SAN Volume Controller.
This section describes the required steps to update the software.
Using the management GUI
To update the software on the system automatically, complete the following steps:
1. In the management GUI, select Settings → System → Update System.
2. Click Test & Update.
3. Select the test utility and the software package that you downloaded from the IBM Fix Central support site. The test utility verifies (again) that the system is ready to be updated.
4. Click Next. Select Automatic update.
5. Select whether you want to create intermittent pauses in the update to verify the process. Select one of the following options.
 – Fully automatic update without pauses (recommended)
 – Pausing the update after half of the nodes are updated
 – Pausing the update before each node updates
6. Click Finish. As the nodes on the system are updated, the management GUI displays the progress for each node.
7. Monitor the update information in the management GUI to determine when the process is complete.
Using the command-line
To update the software on the system automatically, complete the following steps (you must run the latest version of the test utility to verify that no issues exist with the current system; see Example 10-3 on page 449):
1. Copy the software package to the IBM SAN Volume Controller by using the same method as described in Example 10-2 on page 448.
Before you begin the update, consider the following points:
 – The installation process fails under the following conditions:
 • If the software that is installed on the remote system is not compatible with the new software or if an inter-system communication error does not allow the system to check that the code is compatible.
 • If any node in the system has a hardware type that is not supported by the new software.
 • If the system determines that one or more volumes in the system would be taken offline by restarting the nodes as part of the update process. More information about which volumes are affected is available by running the lsdependentvdisks command. If you are prepared to lose access to data during the update, you can use the force flag to override this restriction.
 – The update is distributed to all the nodes in the system by using internal connections between the nodes.
 – Nodes are updated individually.
 – Nodes run the new software concurrently with normal system activity.
 – While the node is updated, it does not participate in I/O activity in the I/O group. As a result, all I/O activity for the volumes in the I/O group is directed to the other node in the I/O group by the host multipathing software.
 – A 30-minute delay exists between node updates. The delay allows time for the host multipathing software to rediscover paths to the nodes that are updated. Access is not lost when another node in the I/O group is updated.
 – The update is not committed until all nodes in the system are successfully updated to the new software level. If all nodes are successfully restarted with the new software level, the new level is committed. When the new level is committed, the system vital product data (VPD) is updated to reflect the new software level.
 – Wait until all member nodes are updated and the update is committed before you start the new functions of the updated software.
 – Because the update process takes time, the installation command completes when the software level is verified by the system. To determine when the update is completed, you must display the software level in the system VPD or look for the software update complete event in the error/event log. If any node fails to restart with the new software level or fails at any other time during the process, the software level is backed off.
 – During an update, the version number of each node is updated when the software is installed and the node is restarted. The system software version number is updated when the new software level is committed.
 – When the update starts, an entry is made in the error or event log and another entry is made when the update completes or fails.
2. Run the applysoftware -file <software_update_file> CLI command to start the update process.
Where <software_update_file> is the file name of the software update file. If the system identifies any volumes that go offline as a result of restarting the nodes as part of the system update, the software update does not start. An optional force parameter can be used to indicate that the update continues regardless of the problem identified. If you use the force parameter, you are prompted to confirm that you want to continue.
3. Issue the lsupdate CLI command to check the status of the update process.
This command displays a message that indicates that the process was successful when the update is complete.
4. To verify that the update successfully completed, run the lsnodecanistervpd command for each node in the system. The code_level field displays the new code level for each node.
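Putting these steps together, the CLI sequence resembles the following minimal sketch. The package file name is an illustrative assumption:
# start the automated update with the uploaded package
applysoftware -file IBM2145_INSTALL_8.4.2.0
# monitor progress; the status indicates when the update is complete
lsupdate
# after the update is committed, confirm the new cluster code level
lssystem | grep code_level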
10.6 Drive firmware updates
Updating drive firmware is a concurrent process that can be performed online while the drive is in use, for any type of SSD drive in any SAS-attached expansion enclosure.
When used on an array member drive, the update checks for volumes that are dependent on the drive and refuses to run if any such dependencies are found. Drive-dependent volumes often are caused by non-redundant or degraded RAID arrays.
Where possible, you should restore redundancy to the system by replacing any failed drives before upgrading drive firmware. When this restoration is not possible, you can add redundancy to the volume by adding a second copy in another pool or use the -force parameter to bypass the dependent volume check. Use the -force parameter only if you are willing to accept the risk of data loss on dependent volumes (if the drive fails during the firmware update).
Note: Because of some system constraints, it is not possible to produce a single NVMe firmware package that works on all NVMe drives on all Spectrum Virtualize code levels. Therefore, you find three different NVMe firmware files available for download, depending on the size of the drives you installed.
Using the management GUI
To update the drive firmware automatically, complete the following steps:
1. Select Pools → Internal Storage → Actions → Upgrade All.
2. As shown in Figure 10-6, in the Upgrade Package window, browse to the drive firmware package that you downloaded as described in 10.5.2, “Obtaining software packages” on page 443.
Figure 10-6 Drive firmware upgrade
3. Click Upgrade. Each drive upgrade takes approximately 6 minutes.
You can also update individual drives by right-clicking a single drive and selecting Upgrade.
4. To monitor the progress of the upgrade, select Monitoring → Background Tasks.
Using the command-line
To update the drive firmware by using the CLI, complete the following steps:
1. Copy the drive firmware package to the IBM SAN Volume Controller by using the same method as described in Example 10-2 on page 448.
2. Issue the following CLI command to start the update process for all drives:
applydrivesoftware -file <software_update_file> -type firmware -all
Where <software_update_file> is the file name of the drive firmware package. The use of the -all option updates firmware on all eligible drives, including quorum drives, which carries a slight risk. To avoid this risk, use the -drive option and make sure that the quorum is moved by running the lsquorum and chquorum commands between applydrivesoftware invocations (see the sketch after Example 10-4).
 
Note: The maximum number of drive IDs that can be specified on a command line using the -drive option is 128. If you have more than 128 drives, use the -all option or run multiple invocations of applydrivesoftware to complete the update.
3. Issue the following CLI command to check the status of the update process:
lsdriveupgradeprogress
This command displays success when the update is complete.
4. To verify that the update successfully completed, run the lsdrive command for each drive in the system. The firmware_level field displays the new code level for each drive.
Example 10-4 shows how to list the firmware level for four specific drives.
Example 10-4 List firmware level for drives 0,1, 2 and 3
IBM_2145:IBM Redbook SVC:superuser>for i in 0 1 2 3; do echo "Drive $i = `lsdrive $i|grep firmware`"; done
Drive 0 = firmware_level 1_2_11
Drive 1 = firmware_level 1_2_11
Drive 2 = firmware_level 1_2_11
Drive 3 = firmware_level 1_2_11
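If you choose the -drive option instead of -all, the following minimal sketch moves a quorum assignment off the drives before they are updated. The drive IDs, quorum index, and file name are illustrative assumptions:
# list the current quorum assignments
lsquorum
# move quorum index 1 to drive 7 so that drives 0 - 3 can be updated safely
chquorum -drive 7 1
# update only the listed drives, then monitor the progress
applydrivesoftware -file IBM2145_DRIVE_20220101 -type firmware -drive 0:1:2:3
lsdriveupgradeprogress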
For more information, see this IBM Documentation web page.
10.7 Remote Code Load
Remote Code Load (RCL) is a service offering that is provided by IBM that allows code updates to be performed by remote support engineers, instead of an on-site IBM Support Services Representative (SSR).
IBM Assist On-site (AOS), the remote support center, or Secure Remote Access (SRA), including Call Home enablement, is required to enable RCL. With Assist On-site, a live remote-assistance tool, a member of the IBM support team can view your desktop and share control of your mouse and keyboard to get you on your way to a solution. In addition to RCL, the tool can also speed up problem determination, data collection, and ultimately, the solution of your problem.
For more information about configuring support assistance, see this IBM Documentation web page.
Before the Assist On-site application is used, test your connectivity to the Assist On-site network by downloading the IBM connectivity testing tool (see Assist On-site connectivity test).
To request the RCL for your system, go to IBM Remote Code Load web page and select your product type. For more information, see this IBM Support web page.
Complete the following steps:
1. At the IBM Remote Code Load web page, click Product type → Book Now - San Volume Controller - 2145/2147 Remote Code Load.
Figure 10-7 shows the RCL Schedule Service page.
Figure 10-7 FlashSystem RCL Schedule Service page
2. Click Schedule Service to start scheduling the service.
3. Next is the Product type selection for RCL. Go to the SAN Volume Controller - 2147 option and click Select (see Figure 10-8).
Figure 10-8 IBM SAN Volume Controller - RCL Product type page
4. In the RCL time frame option, select the date (see Figure 10-9) and time frame (see Figure 10-10).
Figure 10-9 Time frame selection page
Figure 10-10 RCL Time selection page
5. Enter your booking details. Figure 10-11 shows the RCL booking information form.
Figure 10-11 RCL booking contact information page
10.8 SAN modification
When you administer shared storage environments, human error can occur when a failure is fixed, or a change is made that affects one or more servers or applications. That error can then affect other servers or applications because precautions were not taken.
Human error can include some of the following examples:
Disrupting or disabling the working disk paths of a server while trying to fix failed ones.
Disrupting a neighbor SAN switch port while inserting or removing an FC cable or SFP.
Disabling or removing the working part in a redundant set instead of the failed one.
Making modifications that affect both parts of a redundant set without an interval that allows for automatic failover during unexpected problems.
Adhere to the following guidelines to perform these actions with assurance:
Uniquely and correctly identify the components of your SAN.
Use the proper failover commands to disable only the failed parts.
Understand which modifications are necessarily disruptive, and which can be performed online with little or no performance degradation.
10.8.1 Cross-referencing WWPN
With the WWPN of an HBA, you can uniquely identify one server in the SAN. If a server’s name is changed at the operating system level and not at the IBM SAN Volume Controller host definitions, it continues to access its previously mapped volumes exactly because the WWPN of the HBA did not change.
Alternatively, if the HBA of a server is removed and installed in a second server and the first server’s SAN zones and IBM SAN Volume Controller host definitions are not updated, the second server can access volumes that it likely should not access.
Complete the following steps to cross-reference HBA WWPNs:
1. In your server, verify the WWPNs of the HBAs that are used for disk access. Typically, you can complete this task by using the SAN disk multipath software of your server.
If you are using server virtualization, verify the WWPNs in the server that is attached to the SAN, such as AIX VIO or VMware ESX. Cross-reference with the output of the IBM SAN Volume Controller lshost <hostname> command, as shown in Example 10-5.
Example 10-5 Output of the lshost <hostname> command
IBM_2145:IBM Redbook SVC:superuser>svcinfo lshost Server127
id 0
name Server127
port_count 2
type generic
mask 1111111111111111111111111111111111111111111111111111111111111111
iogrp_count 4
status active
site_id
site_name
host_cluster_id
host_cluster_name
protocol scsi
WWPN 10000090FA021A13
node_logged_in_count 1
state active
WWPN 10000090FA021A12
node_logged_in_count 1
state active
2. If necessary, cross-reference the information with your SAN switches, as shown in Example 10-6. On Brocade switches, use the nodefind <WWPN> command.
Example 10-6 Cross-referencing information with SAN switches
blg32sw1_B64:admin> nodefind 10:00:00:90:FA:02:1A:13
Local:
Type Pid COS PortName NodeName SCR
N 401000; 2,3;10:00:00:90:FA:02:1A:13;20:00:00:90:FA:02:1A:13; 3
Fabric Port Name: 20:10:00:05:1e:04:16:a9
Permanent Port Name: 10:00:00:90:FA:02:1A:13
Device type: Physical Unknown(initiator/target)
Port Index: 16
Share Area: No
Device Shared in Other AD: No
Redirect: No
Partial: No
Aliases: nybixtdb02_fcs0
b32sw1_B64:admin>
For storage allocation requests that are submitted by the server support team or application support team to the storage administration team, always include the server’s HBA WWPNs to which the new LUNs or volumes are supposed to be mapped. For example, a server might use separate HBAs for disk and tape access or distribute its mapped LUNs across different HBAs for performance. You cannot assume that any new volume is supposed to be mapped to every WWPN that server logged in the SAN.
If your organization uses a change management tracking tool, perform all your SAN storage allocations under approved change requests with the servers’ WWPNs listed in the Description and Implementation sections.
10.8.2 Cross-referencing LUN ID
Always cross-reference the IBM SAN Volume Controller vdisk_UID with the server LUN ID before you perform any modifications that involve IBM SAN Volume Controller volumes.
If your organization uses a change management tracking tool, include the vdisk_UID and LUN ID information in every change request that performs SAN storage allocation or reclaim.
 
Note: Because a host can have many volumes with the same scsi_id, always cross-reference the IBM SAN Volume Controller volume UID with the host volume UID and record the scsi_id and LUN ID of that volume.
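The following minimal sketch shows this cross-check from the system side; compare the vdisk_UID values with the unique IDs that the host multipath software reports. The host and volume names are illustrative assumptions:
# list the SCSI IDs and vdisk UIDs of all volumes that are mapped to the host
lshostvdiskmap -delim : NYBIXTDB02
# show the UID of a single volume for comparison with the host-side device UID
lsvdisk DB2_DATA_01 | grep vdisk_UID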
10.9 Server HBA replacement
Replacing a failed HBA in a server is a fairly trivial and safe operation if it is performed correctly. However, more precautions are required if your server has multiple, redundant HBAs on different SAN fabrics and the server hardware permits you to “hot” replace it (with the server still running).
Complete the following steps to replace a failed HBA and retain the working HBA:
1. In your server, by using the multipath software, identify the failed HBA and record its WWPNs. For more information, see 10.8.1, “Cross-referencing WWPN” on page 457. Then, place this HBA and its associated paths offline, gracefully if possible. This approach is important so that the multipath software stops trying to recover it. Your server might even show degraded performance while you perform this task.
Consider the following points:
 – Some HBAs have an external label that shows the WWPNs. If you have this type of label, record the WWPNs before you install the new HBA in the server.
 – If your server does not support HBA hot-swap, power off your system, replace the HBA, connect the used FC cable into the new HBA, and power on the system.
 – If your server supports hot-swap, follow the suitable procedures to perform a “hot” replace of the HBA. Do not disable or disrupt the working HBA in the process.
2. Verify that the new HBA successfully logged in to the SAN switch. If it logged in successfully, you can see its WWPNs logged in to the SAN switch port. Otherwise, fix this issue before you continue to the next step.
Cross-check the WWPNs that you see in the SAN switch with the ones that you noted in step 1, and ensure that you did not record the incorrect WWPN.
3. In your SAN zoning configuration tool, replace the old HBA WWPNs for the new ones in every alias and zone to which they belong. Do not modify the other SAN fabric (the one with the working HBA) while you perform this task.
Only one alias should use each WWPN, and zones must reference this alias.
If you use SAN port zoning (although you should not) and you did not move the new HBA FC cable to another SAN switch port, you do not need to reconfigure zoning.
4. Verify that the new HBA’s WWPNs appear in the IBM SAN Volume Controller by running the lsfcportcandidate command.
If the WWPNs of the new HBA do not appear, troubleshoot your SAN connections and zoning.
5. Add the WWPNs of this new HBA in the IBM SAN Volume Controller host definition by running the addhostport command. It is important that you do not remove the old one yet. Run the lshost <servername> command. Then, verify that the working HBA shows as active; the failed HBA should show as inactive or offline.
6. Use the multipath software to recognize the new HBA and its associated SAN disk paths. Verify that all SAN LUNs include redundant disk paths through the working HBA and the new HBA.
7. Return to the IBM SAN Volume Controller and verify again (by using the lshost <servername> command) that the WWPNs of the working HBA and the new HBA are active. If they are, remove the old HBA WWPNs from the host definition by using the rmhostport command.
8. Do not remove any HBA WWPNs from the host definition until you ensure that you have at least two active ones that are working correctly.
By following these steps, you avoid removing your only working HBA by mistake.
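The following minimal sketch shows steps 4, 5, and 7 from the CLI. The host name and WWPNs are illustrative assumptions:
# step 4: confirm that the new HBA WWPN is visible to the system
lsfcportcandidate
# step 5: add the new WWPN to the host definition while keeping the old one
addhostport -fcwwpn 10000090FA021B99 NYBIXTDB02
lshost NYBIXTDB02
# step 7: only after all paths are redundant again, remove the failed HBA WWPN
rmhostport -fcwwpn 10000090FA021A13 NYBIXTDB02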
10.10 Hardware upgrades
The IBM SAN Volume Controller scalability features allow significant flexibility in its configuration. The IBM SAN Volume Controller family features the following types of enclosures:
Control Enclosures
These enclosures manage your storage systems, communicate with the host, and manage interfaces. Each “control enclosure” contains two nodes, which form an I/O group.
Expansion Enclosures
These enclosures increase the available capacity of an IBM SAN Volume Controller cluster. They communicate with the control enclosure through a dual pair of 12 Gbps serial-attached SCSI (SAS) connections. These expansion enclosures can house many flash (solid-state drive [SSD]) SAS type drives, but are not available for all IBM SAN Volume Controller models.
Each SAN Volume Controller node is an individual server in a SAN Volume Controller clustered system. A basic configuration of an IBM SAN Volume Controller storage platform consists of two IBM SAN Volume Controller nodes, known as an I/O group. The nodes are always installed in pairs and all I/O operations that are managed by the nodes in an I/O group are cached on both nodes. For a balanced increase of performance and scale, up to four I/O groups can be clustered into a single storage system.
Similarly, to increase capacity, up to two chains (depending on IBM SAN Volume Controller model) of expansion enclosures can be added per node. Consequently, several scenarios are possible for growth. These processes are described next.
10.10.1 Adding nodes
You can add nodes to replace the existing nodes of your SAN Volume Controller cluster with newer ones, and the replacement procedure can be performed non-disruptively. The new node can assume the WWNN of the node you are replacing, which requires no changes in host configuration, SAN zoning, or multipath software. For more information about this procedure, see this IBM Documentation web page.
Alternatively, you can add nodes to expand your system. If your IBM SAN Volume Controller cluster is below the maximum I/O groups limit for your specific product and you intend to upgrade it, you can install another I/O group.
You might also have a cluster of IBM Storwize V7000 nodes to which you want to add IBM SAN Volume Controller nodes because the IBM SAN Volume Controller nodes are more powerful than your existing nodes. In this case, your cluster has different node models in different I/O groups.
To install these nodes, first determine whether you need to upgrade your IBM SAN Volume Controller code level (or the Storwize V7000 code level if, for example, you are merging a Storwize V7000 Gen2 cluster with an IBM SAN Volume Controller).
 
Note: If two I/O groups are in a system, you must set up a quorum disk or application outside of the system. If the two I/O groups lose communication with each other, the quorum disk prevents both I/O groups from going offline.
For more information about adding a node to an IBM SAN Volume Controller cluster, see Chapter 3 of Implementing the IBM SAN Volume Controller with IBM Spectrum Virtualize Version 8.4.2, SG24-8507.
Note: Use a consistent method (only the management GUI or only the CLI) when you add, remove, and re-add nodes. If a node is added by using the CLI and later re-added by using the GUI, it might get a different node name than it originally had.
After you install the newer nodes, you might need to redistribute your servers across the I/O groups. Consider the following points:
Moving a server’s volumes to a different I/O group can be done online because of a feature called Non-Disruptive Volume Movement (NDVM). Although this process can be done without stopping the host, careful planning and preparation is advised. For more information about supported operating systems, see this IBM Support web page.
 
Note: You cannot move a volume that is in any type of Remote Copy relationship.
If each of your servers is zoned to only one I/O group, modify your SAN zoning configuration as you move its volumes to another I/O group. As best you can, balance the distribution of your servers across I/O groups according to I/O workload.
Use the -iogrp parameter in the mkhost command to define which I/O groups of the IBM SAN Volume Controller the new servers will use. Otherwise, the IBM SAN Volume Controller by default maps the host to all I/O groups, even I/O groups that contain no nodes, and regardless of your zoning configuration. Example 10-7 shows this scenario and how to resolve it by using the rmhostiogrp and addhostiogrp commands; a sample mkhost command follows the example.
Example 10-7 Mapping the host to I/O groups
IBM_2145:IBM Redbook SVC:superuser>lshost NYBIXTDB02
id 0
name NYBIXTDB02
port_count 2
type generic
mask 1111
iogrp_count 4
WWPN 10000000C9648274
node_logged_in_count 2
state active
WWPN 10000000C96470CE
node_logged_in_count 2
state active
IBM_2145:IBM Redbook SVC:superuser>lsiogrp
id name node_count vdisk_count host_count
0 io_grp0 2 32 1
1 io_grp1 0 0 1
2 io_grp2 0 0 1
3 io_grp3 0 0 1
4 recovery_io_grp 0 0 0
IBM_2145:IBM Redbook SVC:superuser>lshostiogrp NYBIXTDB02
id name
0 io_grp0
1 io_grp1
2 io_grp2
3 io_grp3
IBM_2145:IBM Redbook SVC:superuser>rmhostiogrp -iogrp 1:2:3 NYBIXTDB02
IBM_2145:IBM Redbook SVC:superuser>lshostiogrp NYBIXTDB02
id name
0 io_grp0
IBM_2145:IBM Redbook SVC:superuser>lsiogrp
id name node_count vdisk_count host_count
0 io_grp0 2 32 1
1 io_grp1 0 0 0
2 io_grp2 0 0 0
3 io_grp3 0 0 0
4 recovery_io_grp 0 0 0
IBM_2145:IBM Redbook SVC:superuser>addhostiogrp -iogrp 3 NYBIXTDB02
IBM_2145:IBM Redbook SVC:superuser>lshostiogrp NYBIXTDB02
id name
0 io_grp0
3 io_grp3
IBM_2145:IBM Redbook SVC:superuser>lsiogrp
id name node_count vdisk_count host_count
0 io_grp0 2 32 1
1 io_grp1 0 0 0
2 io_grp2 0 0 0
3 io_grp3 0 0 1
4 recovery_io_grp 0 0 0
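To avoid this situation when you define a new host, specify the I/O groups at creation time. The following sketch is illustrative only; the host name, WWPNs, and I/O group are hypothetical:
mkhost -name NYBIXTDB03 -hbawwpn 10000000C9AABB01:10000000C9AABB02 -iogrp io_grp0   # create the host in io_grp0 only
lshostiogrp NYBIXTDB03                                                              # confirm that only io_grp0 is mapped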
 
If possible, avoid setting a server to use volumes from different I/O groups that have different node types for extended periods of time. Otherwise, as this server’s storage capacity grows, you might experience a performance difference between volumes from different I/O groups. This mismatch makes it difficult to identify and resolve potential performance problems.
Adding hot-spare nodes
To reduce the risk of a loss of redundancy or degraded system performance, hot-spare nodes can be added to the system. A hot-spare node has active system ports, but no host I/O ports, and is not part of any I/O group. If any node fails or is upgraded, this spare node automatically joins the system and assumes the place of the failed node, restoring redundancy.
The hot-spare node uses the same N_Port ID Virtualization (NPIV) worldwide port names (WWPNs) for its Fibre Channel ports as the failed node, so host operations are not disrupted. After the failed node returns to the system, the hot-spare node returns to the Spare state, which indicates it can be automatically swapped for other failed nodes on the system.
The following restrictions apply to the use of hot-spare nodes on the system:
Hot-spare nodes can be used with Fibre Channel-attached external storage only
Hot-spare nodes cannot be used:
 – In systems that use RDMA-capable Ethernet ports for node-to-node communications
 – On enclosure-based systems
 – With SAS-attached storage
 – With iSCSI-attached storage
 – With storage that is directly attached to the system
A maximum of four hot-spare nodes can be added to the system.
Using the management GUI
If your nodes are configured on your systems and you want to add hot-spare nodes, you must connect the extra nodes to the system. After hot-spare nodes are configured correctly on the system, you can add the spare node to the system configuration by completing the following steps:
1. In the management GUI, select Monitoring → System Hardware.
2. On the System Hardware - Overview window, click Add Nodes.
3. On the Add Node page, select the hot-spare node to add to the system.
If your system uses stretched or HyperSwap system topology, hot-spare nodes must be designated per site.
Using the command-line interface
To add a spare node to the system, run the following command:
addnode -panelname <panel_name> -spare
Where <panel_name> is the name of the node that is displayed in the service assistant or in the output of the lsnodecandidate command.
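For example, the following sketch adds a candidate node as a hot-spare; the panel name shown is hypothetical:
lsnodecandidate                       # list candidate nodes and note the panel name of the spare
addnode -panelname KD8P1BP -spare     # add the candidate node as a hot-spare node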
For more information, see IBM Spectrum Virtualize Hot-Spare Node and NPIV Target Ports, REDP-5477.
10.10.2 Upgrading nodes in an existing cluster
If you want to upgrade the nodes of your IBM SAN Volume Controller, you can increase the cache memory size or add adapter cards in each node. This process can be done one node at a time so that it is nondisruptive to the system’s operations.
For more information, see this IBM Documentation web page.
When evaluating cache memory upgrades, consider the following points:
As your working set and total capacity increase, consider increasing your cache memory size. The working set is the data that is accessed most frequently, excluding snapshots and backups. More total capacity implies more or larger workloads, and therefore a larger working set.
If you are consolidating from multiple controllers, consider at least matching the amount of cache memory across those controllers.
When the IBM SAN Volume Controller virtualizes external controllers, its large cache can accelerate older controllers that have smaller caches.
If you use a Data Reduction Pool (DRP), maximize the cache size and consider adding SCM drives with Easy Tier for the best performance.
If you are making heavy use of copy services, consider increasing the cache beyond just your working set requirements.
A truly random working set might not benefit greatly from the cache.
 
Important: Do not power on a node that is:
Shown as offline in the management GUI if you powered off the node to add memory to increase total memory. Before you increase memory, you must remove a node from the system so that it is not showing in the management GUI or in the output from the lsnode command.
Still in the system and showing as offline with more memory than the node had when it powered off. Such a node can cause an immediate outage or an outage when you update the system software.
When evaluating adapter card upgrades, consider the following points:
A single 32 Gb Fibre Channel port can deliver over 3 GBps (allowing for overheads).
A 32 Gb FC card in each node with eight ports can deliver more than 24 GBps.
An FCM NVMe device can perform at over 1 GBps.
A single 32 Gb Fibre Channel port can deliver 80,000 - 125,000 IOPS with a 4 k block size.
A 32 Gb FC card in each node with eight ports can deliver up to 1,000,000 IOPS.
A FlashSystem 9200 can deliver 1,200,000 4 k read miss IOPS and up to 4,500,000 4 k read hit IOPS.
If you have more than 12 NVMe devices, consider the use of two Fibre Channel cards per node. By using a third Fibre Channel card, up to 45 GBps can be achieved.
If you want to achieve more than 800,000 IOPS, use at least two Fibre Channel cards per node.
If the SAN Volume Controller is performing Remote Copy or clustering, consider the use of separate ports to ensure no conflict exists with host traffic.
iSER by way of 25 GbE ports has similar capabilities to 16 Gb FC ports, but with fewer ports available overall. If you plan to use 10 Gb iSCSI, ensure that it can service your expected workloads.
Real-time performance statistics are available in the management GUI from the Monitoring → Performance menu, as shown in Figure 10-12.
Figure 10-12 IBM SAN Volume Controller performance statistics (IOPS)
Memory options for an IBM SAN Volume Controller SV2 and SA2
The standard memory per node is 128 GB of base memory, with the option, by adding 32 GB memory modules, to support up to 768 GB of memory on both the SV2 and SA2 models.
If you are adding memory to a node, you must remove that node from the system configuration before you start the following procedure. To do so, you can use the management GUI or the CLI.
To use the management GUI, select Monitoring → System Hardware. On the System Hardware - Overview page, select the directional arrow next to the node that you are removing to open the node details page. Select Node Actions → Remove.
To use the CLI, enter the following command, where object_id | object_name identifies the node that receives the added memory:
rmnodecanister object_id | object_name
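For example, the following sketch (the node name is hypothetical) removes a node and confirms that it is no longer in the system before you power it off and add memory:
lsnode                    # note the ID or name of the node that receives the added memory
rmnodecanister node2      # remove that node from the system configuration
lsnode                    # confirm that the node no longer appears before you power it off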
A CPU processor has six memory channels, which are labeled A - F. Each memory channel has two DIMM slots, numbered 0 - 1. For example, DIMM slots A0 and A1 are in memory channel A.
On the system board, the DIMM slots are labeled according to their memory channel and slot. They are associated with the CPU nearest to their DIMM slots. You can install three distinct memory configurations in those 24 DIMM slots in each node.
The available memory configuration for each node is listed in Table 10-2. Each column shows the valid configuration for each total enclosure memory size. DIMM slots are listed in the same order that they appear in the node.
To ensure proper cooling and a steady flow of air from the fan modules in each node, blank DIMMs must be inserted in any slot that does not contain a memory module.
Table 10-2 Available memory configuration for one node
DIMM slot    128 GB of total node memory    384 GB of total node memory    768 GB of total node memory
F0 (CPU0)    Blank                          32 GB                          32 GB
F1 (CPU0)    Blank                          Blank                          32 GB
E0 (CPU0)    Blank                          32 GB                          32 GB
E1 (CPU0)    Blank                          Blank                          32 GB
D0 (CPU0)    32 GB                          32 GB                          32 GB
D1 (CPU0)    Blank                          Blank                          32 GB
A1 (CPU0)    Blank                          Blank                          32 GB
A0 (CPU0)    32 GB                          32 GB                          32 GB
B1 (CPU0)    Blank                          Blank                          32 GB
B0 (CPU0)    Blank                          32 GB                          32 GB
C1 (CPU0)    Blank                          Blank                          32 GB
C0 (CPU0)    Blank                          32 GB                          32 GB
C0 (CPU1)    Blank                          32 GB                          32 GB
C1 (CPU1)    Blank                          Blank                          32 GB
B0 (CPU1)    Blank                          32 GB                          32 GB
B1 (CPU1)    Blank                          Blank                          32 GB
A0 (CPU1)    32 GB                          32 GB                          32 GB
A1 (CPU1)    Blank                          Blank                          32 GB
D1 (CPU1)    Blank                          Blank                          32 GB
D0 (CPU1)    32 GB                          32 GB                          32 GB
E1 (CPU1)    Blank                          Blank                          32 GB
E0 (CPU1)    Blank                          32 GB                          32 GB
F1 (CPU1)    Blank                          Blank                          32 GB
F0 (CPU1)    Blank                          32 GB                          32 GB
Memory options for an IBM SAN Volume Controller SV1
The standard memory per node is 64 GB base memory with an option, by adding 32 GB memory modules, to support up to 256 GB of memory.
If you are adding memory to a node, you must remove that node from the system configuration before you start the following procedure. To do so, you can use the management GUI or the CLI.
To use the management GUI, select Monitoring → System Hardware. On the System Hardware - Overview page, select the directional arrow next to the node that you are removing to open the node details page. Select Node Actions → Remove.
To use the CLI, enter the following command, where object_id | object_name identifies the node that receives the added memory:
rmnodecanister object_id | object_name
Memory options for an IBM SAN Volume Controller DH8
The standard memory per node is 32 GB of base memory, with the option, by adding a second CPU with 32 GB of memory, to support up to 64 GB of memory.
If you are adding memory to a node, you must remove that node from the system configuration before you start the following procedure. To do so, you can use the management GUI or the CLI.
To use the management GUI, select Monitoring → System Hardware. On the System Hardware - Overview page, select the directional arrow next to the node that you are removing to open the node details page. Select Node Actions → Remove.
To use the CLI, enter the following command, where object_id | object_name identifies the node that receives the added memory:
rmnodecanister object_id | object_name
Adapter card options for an IBM SAN Volume Controller SV2 and SA2
Four 10 Gb Ethernet ports for iSCSI connectivity are standard, but models SV2 and SA2 support the following optional host adapters for extra connectivity (feature codes in brackets):
4-port 16 Gbps Fibre Channel over NVMe adapter (AH14)
4-port 32 Gbps Fibre Channel over NVMe adapter (AH1D)
2-port 25 Gbps iSCSI/RoCE adapter (AH16)
2-port 25 Gbps iSCSI/iWARP adapter (AH17)
No more than three I/O adapter card features (AH14, AH16, AH17, and AH1D) can be used in a node. For more information about which specific slot supports which adapter type, see the following resources:
Unlike in previous models, a Compression Accelerator is integrated directly with the processors for DRP compression workloads.
Adapter card options for an IBM SAN Volume Controller SV1
Three 10 Gb Ethernet ports for iSCSI connectivity are standard, but model SV1 supports the following optional host adapters for additional connectivity (feature codes in brackets):
4-port 10 Gbps iSCSI/FC over Ethernet adapter (AH12)
4-port 16 Gbps Fibre Channel over NVMe adapter (AH14)
2-port 25 Gbps iSCSI/RoCE adapter (AH16)
2-port 25 Gbps iSCSI/iWARP adapter (AH17)
No more than four I/O adapter card features (AH12, AH14, AH16, or AH17) can be used in a node. It also can provide up to 16 16-Gb FC ports, up to four 10-Gb Ethernet (iSCSI/FCoE) ports, or up to eight 25 Gb Ethernet (iSCSI) ports.
 
Note: FCoE is no longer supported in Spectrum Virtualize 8.4.
For more information about which specific slot supports which adapter type, see the following resources:
The optional compression co-processor adapter increases the speed of I/O transfers to and from compressed volumes by using IBM Real-time Compression. You can optionally install one or two compression accelerator adapters in a SAN Volume Controller 2145-SV1 node. Each compression accelerator increases the speed of I/O transfers between nodes and compressed volumes. You must install at least one compression accelerator if you configured compressed volumes.
Adapter card options for an IBM SAN Volume Controller DH8
Three 1 Gb Ethernet ports for iSCSI connectivity are standard, but model DH8 supports the following optional host adapters for more connectivity (feature codes in brackets):
4-port 8 Gbps Fibre Channel adapter (AH10)
2-port 16 Gbps Fibre Channel adapter (AH11)
4-port 16 Gbps Fibre Channel adapter (AH14)
2-port 10 Gbps iSCSI/FCoE adapter (AH12)
The maximum numbers and combinations of new adapters depends on the number of CPUs in the node and numbers and types of existing adapters. For more information about which specific slot supports which adapter type, see the following resources:
The optional compression co-processor adapter increases the speed of I/O transfers to and from compressed volumes by using IBM Real-time Compression. You can optionally install one or two compression accelerator adapters in a SAN Volume Controller 2145-DH8 node. Each compression accelerator increases the speed of I/O transfers between nodes and compressed volumes. You must install at least one compression accelerator if you configured compressed volumes.
10.10.3 Moving to a new IBM SAN Volume Controller cluster
You might have a highly populated, intensively used IBM SAN Volume Controller cluster that you want to upgrade. You might also want to use the opportunity to refresh your IBM SAN Volume Controller and SAN storage environment.
Complete the following steps to replace your cluster with a newer, more powerful cluster:
1. Install your new IBM SAN Volume Controller cluster.
2. Create a replica of your data in your new cluster.
3. Migrate your servers to the new IBM SAN Volume Controller cluster when convenient.
If your servers can tolerate a brief, scheduled outage to switch from one IBM SAN Volume Controller cluster to another, you can use the IBM SAN Volume Controller Remote Copy services (Metro Mirror or Global Mirror) to create your data replicas, by completing the following steps:
1. Select a host that you want to move to the new IBM SAN Volume Controller cluster and find all the old volumes you must move.
2. Zone your host to the new IBM SAN Volume Controller cluster.
3. Create Remote Copy relationships from the old volumes in the old IBM SAN Volume Controller cluster to new volumes in the new IBM SAN Volume Controller cluster (a command sketch follows these steps).
4. Map the new volumes from the new IBM SAN Volume Controller cluster to the host.
5. Discover new volumes on the host.
6. Stop all I/O from the host to the old volumes from the old IBM SAN Volume Controller cluster.
7. Disconnect and remove the old volumes on the host from the old IBM SAN Volume Controller cluster.
8. Unmap the old volumes from the old IBM SAN Volume Controller cluster to the host.
9. Ensure that the Remote Copy relationships between old and new volumes in the old and new IBM SAN Volume Controller cluster are synced.
10. Stop and remove Remote Copy relationships between old and new volumes so that the target volumes in the new IBM SAN Volume Controller cluster receive read/write access.
11. Import data from the new volumes and start your applications on the host.
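The following sketch shows the Remote Copy commands for one volume. The volume, relationship, and cluster names are hypothetical, and the partnership between the two clusters must already exist. Add the -global parameter to mkrcrelationship if you use Global Mirror:
mkrcrelationship -master OLDVOL01 -aux NEWVOL01 -cluster NEW_SVC_CLUSTER -name MIG_REL01   # create the relationship on the old cluster
startrcrelationship MIG_REL01               # start the background copy
lsrcrelationship MIG_REL01                  # wait for the state to reach consistent_synchronized
stoprcrelationship -access MIG_REL01        # at cutover, give the new volume read/write access
rmrcrelationship MIG_REL01                  # remove the relationship after the host is moved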
If you must migrate a server online instead, you must use host-based mirroring by completing the following steps:
1. Select a host that you want to move to the new IBM SAN Volume Controller cluster and find all the old volumes that you must move.
2. Zone your host to the new IBM SAN Volume Controller cluster.
3. Create volumes in the new IBM SAN Volume Controller cluster of the same size as the old volumes in the old IBM SAN Volume Controller cluster.
4. Map the new volumes from the new IBM SAN Volume Controller cluster to the host.
5. Discover new volumes on the host.
6. For each old volume, use host-based mirroring (such as AIX mirrorvg) to move your data to the corresponding new volume (see the sketch after this list).
7. For each old volume, after the mirroring is complete, remove the old volume from the mirroring group.
8. Disconnect and remove the old volumes on the host from the old IBM SAN Volume Controller cluster.
9. Unmap the old volumes from the old IBM SAN Volume Controller cluster to the host.
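As an illustration only, the following AIX sketch mirrors a volume group onto a new volume and then removes the old one. The volume group and hdisk names are hypothetical, and other operating systems use their own volume manager commands:
extendvg datavg hdisk4       # add the new volume to the volume group
mirrorvg datavg hdisk4       # mirror the logical volumes onto the new volume and wait for synchronization
unmirrorvg datavg hdisk2     # remove the mirror copies from the old volume
reducevg datavg hdisk2       # remove the old volume from the volume group
rmdev -dl hdisk2             # delete the old disk device before unmapping the old volume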
This approach uses the server’s computing resources (CPU, memory, and I/O) to replicate the data. It can be done online if it is properly planned. Before you begin, ensure that the server has enough spare resources.
The biggest benefit to the use of either approach is that they easily accommodate (if necessary) the replacement of your SAN switches or your back-end storage controllers. You can upgrade the capacity of your back-end storage controllers or replace them entirely, as you can replace your SAN switches with bigger or faster ones. However, you do need to have spare resources, such as floor space, power, cables, and storage capacity, available during the migration.
10.10.4 Splitting an IBM SAN Volume Controller cluster
Splitting an IBM SAN Volume Controller cluster might become a necessity if you have one or more of the following requirements:
To grow the environment beyond the maximum number of I/O groups that a clustered system can support.
To grow the environment beyond the maximum number of attachable subsystem storage controllers.
To grow the environment beyond any other maximum system limit.
To achieve new levels of data redundancy and availability.
By splitting the clustered system, you no longer have one IBM SAN Volume Controller that handles all I/O operations, hosts, and subsystem storage attachments. The goal is to create a second IBM SAN Volume Controller cluster so that you can equally distribute the workload over the two systems.
After safely removing enclosures from the existing cluster and creating a second IBM SAN Volume Controller cluster, choose from the following approaches to balance the two systems:
Attach new storage subsystems and hosts to the new system and start adding only new workload on the new system.
Migrate the workload onto the new system by using the approach that is described in 10.10.3, “Moving to a new IBM SAN Volume Controller cluster” on page 468.
10.10.5 Adding expansion enclosures
As time passes and your environment grows, you must add more storage to your system. Depending on the IBM SAN Volume Controller family product and the code level that you installed, you can add different numbers of expansion enclosures to your system. Before you add an enclosure to a system, check that the licensed functions of the system support the extra enclosure.
Because all IBM SAN Volume Controller models were designed to make managing and maintaining them as simple as possible, adding an expansion enclosure is an easy task.
IBM SAN Volume Controller SV2 and SA2
IBM SAN Volume Controller SV2 and SA2 models do not support any type of SAS expansion enclosures; all storage is external back-end storage. The IBM SAN Volume Controller can virtualize external storage that is presented to it from IBM and third-party storage systems. External back-end storage systems provide their logical volumes (LUs), which are detected by the IBM SAN Volume Controller as MDisks and can be used in a storage pool.
IBM SAN Volume Controller SV1
The IBM SAN Volume Controller model SV1 can also support expansion enclosures with the following models:
The IBM 2145 SVC LFF Expansion Enclosure Model 12F
This model holds up to 12 3.5-inch SAS drives in a 2U, 19-inch rack mount enclosure.
The IBM 2145 SVC SFF Expansion Enclosure Model 24F
This model holds up to 24 2.5-inch SAS internal flash (solid state) drives in a 2U, 19-inch rack mount enclosure.
The IBM 2145 SVC HD LFF Expansion Enclosure Model 92F
This model holds up to 92 3.5-inch SAS internal flash (solid state) or HDD capacity drives in a 5U, 19-inch rack mount enclosure.
If Enterprise Class Support and three-year warranty is purchased, the model number changes from 2145 to 2147.
Each IBM SAN Volume Controller SV1 supports two chains of SAS expansion enclosures per node. Overall, the system supports up to four I/O groups (eight nodes) with a total of 80 expansion enclosures per system.
The best practice recommendation is to balance equally the expansion enclosures between chains. Therefore, if you have two more expansion enclosures, one should be installed on the first SAS chain and one on the second SAS chain. Also, when you add a single expansion enclosure to a system, it is preferable to add the enclosure directly below the nodes.
When you add a second expansion enclosure, it is preferable to add the enclosure directly above the nodes. As more expansion enclosures are added, alternate adding them above and below.
To limit contention for bandwidth on a chain of SAS enclosures, no more than 10 expansion enclosures can be chained to the SAS port of a node. On each SAS chain, the systems can support up to a SAS chain weight of 10 where:
Each 2145-92F expansion enclosure adds a value of 2.5 to the SAS chain weight.
Each 2145-12F or 2145-24F expansion enclosure adds a value of 1 to the SAS chain weight.
For example, each of the following expansion enclosure configurations has a total SAS weight of 10:
Four 2145-92F enclosures per SAS chain
Two 2145-92F enclosures and five 2145-12F enclosures per SAS chain
Figure 10-13 shows the cabling for adding four 2145-24F expansion enclosures (two at the top and two at the bottom of the figure).
For more information, see this IBM Documentation web page.
Figure 10-13 Cabling for adding four expansion enclosures in two SAS chains
IBM SAN Volume Controller DH8
IBM SAN Volume Controller Model DH8 can also support expansion enclosures with the following models:
The IBM 2145 SVC LFF Expansion Enclosure Model 12F
This model holds up to 12 3.5-inch SAS drives in a 2U, 19-inch rack mount enclosure.
The IBM 2145 SVC SFF Expansion Enclosure Model 24F
This model holds up to 24 2.5-inch SAS internal flash (solid state) drives in a 2U, 19-inch rack mount enclosure.
The IBM 2145 SVC HD LFF Expansion Enclosure Model 92F
This model holds up to 92 3.5-inch SAS internal flash (solid state) or HDD capacity drives in a 5U, 19-inch rack mount enclosure.
If Enterprise Class Support and three-year warranty is purchased, the model number changes from 2145 to 2147.
Similar to the SV1 model, each model DH8 supports two chains of SAS expansion enclosures per node. Overall, the system supports up to four I/O groups (eight nodes) with a total of 80 expansion enclosures per system.
10.10.6 Removing expansion enclosures
As storage environments change and grow, it is sometimes necessary to move expansion enclosures between nodes. Removing an expansion enclosure is a straightforward task.
To remove an expansion enclosure from a node, complete the following steps:
 
Note: If the expansion enclosure that you want to move is not at the end of an SAS chain, you might need a longer pair of SAS cables to complete the procedure. In that case, ensure that you have two SAS cables of suitable length before you start this procedure.
1. Delete any volumes that are no longer needed and that depend on the enclosure that you plan to remove.
2. Delete any remaining arrays that are formed from drives in the expansion enclosure. Any data in those arrays is automatically migrated to other managed disks in the pool if there is enough capacity.
3. Wait for data migration to complete.
4. Mark all the drives (including any configured as spare or candidate drives) in the enclosures to be removed as unused (see the command sketch after these steps).
5. Unmanage and remove the expansion enclosure by using the management GUI. Select Monitoring → System Hardware. On the System Hardware - Overview page, select the arrow next to the enclosure that you are removing to open the Enclosure Details page. Select Enclosure Actions → Remove.
 
Important: Do not proceed until the enclosure removal process completes successfully.
6. On the I/O group that contains the expansion enclosure that you want to remove, enter the following command to put the I/O group into maintenance mode:
chiogrp -maintenance yes <iogroup_name_or_id>
7. If the expansion enclosure that you want to move is at the end of a SAS chain, complete the following steps to remove the enclosure from the SAS chain:
a. Disconnect the SAS cable from port 1 of node 1 and node 2. The enclosure is now disconnected from the system.
b. Disconnect the other ends of the SAS cables from the previous enclosure in the SAS chain. The previous enclosure is now the end of the SAS chain. Proceed to step 9.
8. If the expansion enclosure is not at the end of a SAS chain, complete the following steps to remove the enclosure from the SAS chain.
a. Disconnect the SAS cable from port 2 of node 1 of the expansion enclosure that you want to move.
b. Disconnect the other end of the same SAS cable from port 1 of node 1 of the next expansion enclosure in the SAS chain.
c. Disconnect the SAS cable from port 1 of node 1 of the expansion enclosure that you want to move.
d. Reroute the cable that was disconnected in the previous step and connect it to port 1 of node 1 of the next expansion enclosure in the SAS chain.
 
Important: Do not continue until you complete this cable connection step.
e. Disconnect the SAS cable from port 2 of node 2 of the expansion enclosure that you want to move.
f. Disconnect the other end of the same SAS cable from port 1 of node 2 of the next expansion enclosure in the SAS chain.
g. Disconnect the SAS cable from port 1 of node 2 of the expansion enclosure that you want to move.
h. Reroute the cable that was disconnected in the previous step and connect it to port 1 of node 2 of the next expansion enclosure in the SAS chain.
9. Take the I/O group out of maintenance mode by entering the following command:
chiogrp -maintenance no iogroup_name_or_id
10. Check the event log for any errors and fix those errors as needed.
11. Disconnect the power from the expansion enclosure that you want to remove.
12. Remove the expansion enclosure from the rack along with its two power cables and two SAS cables.
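The following sketch shows CLI checks that support this procedure. The enclosure ID, drive ID, and I/O group name are hypothetical, and the exact parameters can vary by code level, so check the command reference for your release:
lsdependentvdisks -enclosure 2      # list volumes that still depend on the enclosure to be removed
chdrive -use unused 12              # mark each drive in the enclosure (including spares and candidates) as unused
chiogrp -maintenance yes io_grp0    # place the I/O group in maintenance mode before recabling
chiogrp -maintenance no io_grp0     # take the I/O group out of maintenance mode after recabling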
 
Note: The IBM SAN Volume Controller products provide methods to securely erase data from a drive when an enclosure is decommissioned or before a drive is removed from the system during a repair activity.
For more information about the CLI commands that are used to run this secure erase function, see this IBM Documentation web page.
10.11 I/O throttling
I/O throttling is a mechanism with which you can limit the volume of I/O processed by the storage controller at various levels to achieve quality of service (QoS). If a throttle is defined, the system processes the I/O or delays the processing of the I/O to free resources for more critical I/O. Throttling is a way to achieve a better distribution of storage controller resources.
IBM SAN Volume Controller V8.3 and later code allows you to set throttles at the volume, host, host cluster, and storage pool levels, and to set offload throttling, by using the GUI. This section describes I/O throttling and shows how to configure the feature in your system.
10.11.1 I/O throttling overview
I/O throttling features the following characteristics:
Both IOPS and bandwidth throttle limits can be set.
It is an upper bound QoS mechanism.
No minimum performance is guaranteed.
Volumes, hosts, host clusters, and managed disk groups can be throttled.
Queuing occurs at microsecond granularity.
Internal I/O operations (such as FlashCopy and cluster traffic) are not throttled.
Reduces I/O bursts and smooths the I/O flow with variable delay in throttled I/Os.
Throttle limit is a per-node value.
10.11.2 I/O throttling on front-end I/O control
You can use throttling for a better front-end I/O control at the volume, host, host cluster, and offload levels:
In a multi-tenant environment, hosts can have their own defined limits.
You can use this to allow restricted I/Os from a data mining server and a higher limit for an application server.
An aggressive host consuming bandwidth of the controller can be limited by a throttle.
For example, a video streaming application can have a limit set to avoid consuming too much of the bandwidth.
Restrict a group of hosts by their throttles.
For example, Department A gets more bandwidth than Department B.
Each volume can have a throttle defined.
For example, a volume that is used for backups can be configured to use less bandwidth than a volume used for a production database.
When you perform migrations in a production environment, consider the use of host-level or volume-level throttles.
Offloaded I/Os.
Offload commands, such as UNMAP and XCOPY, free hosts and speed the copy process by offloading the operations of certain types of hosts to a storage system. These commands are used by hosts to format new file systems or copy volumes without the host needing to read and then write data.
Throttles can be used to delay processing for offloads to free bandwidth for other more critical operations, which can improve performance but limits the rate at which host features, such as VMware VMotion, can copy data.
10.11.3 I/O throttling on back-end I/O control
You also can use throttling to control the back-end I/O by throttling the storage pools, which can be useful in the following scenarios:
You can define a throttle for each storage pool.
Storage pool throttles control back-end I/Os from the IBM SAN Volume Controller.
They help avoid overwhelming any external back-end storage.
If a VVOL is created in a child pool, the child pool (mdiskgrp) throttle can control the I/Os coming from that VVOL.
Only parent pool throttles are supported for back-end control because only parent pools contain MDisks from internal or external back-end storage. For volumes in child pools, the throttle of the parent pool is applied.
If more than one throttle applies to an I/O operation, the lowest (most stringent) throttle is used. For example, if a throttle of 100 MBps is defined on a pool and a throttle of 200 MBps is defined on a volume of that pool, the I/O operations are limited to 100 MBps.
10.11.4 Overall benefits of using I/O throttling
The overall benefit of the use of I/O throttling is a better distribution of all system resources:
It avoids overwhelming the controller objects.
It avoids starving external entities, such as hosts, of their share of controller resources.
It provides a scheme of distribution of controller resources that, in turn, results in better use of external resources, such as host capacities.
With no throttling enabled, we have a scenario in which Host 1 dominates the bandwidth. After enabling the throttle, we see a much better distribution of the bandwidth among the hosts, as shown in Figure 10-14.
Figure 10-14 Distribution of controller resources before and after I/O throttling
10.11.5 I/O throttling considerations
When you are planning to use I/O throttling, consider the following points:
A throttle cannot be defined for a host if that host is part of a host cluster that has its own host cluster throttle.
If the host cluster does not have a throttle defined, its member hosts can have their individual host throttles defined.
If a volume has multiple copies, throttling is applied to the storage pool that serves the primary copy. Throttling is not applied to the pool that serves the secondary copy for mirrored volumes and stretched cluster implementations.
A host cannot be added to a host cluster if both have their individual throttles defined. If only one of the host/host cluster throttles is present, the command succeeds.
A seeding host that is used for creating a host cluster cannot have a host throttle defined for it.
 
Note: Throttling is applicable only at the I/Os that an IBM SAN Volume Controller receives from hosts and host clusters. The I/Os generated internally, such as mirrored volume I/Os, cannot be throttled.
10.11.6 Configuring I/O throttling using the CLI
To create a throttle by using the CLI, you use the mkthrottle command (see Example 10-8). The bandwidth limit is the maximum amount of bandwidth the system can process before the system delays I/O processing. Similarly, the iops_limit is the maximum amount of IOPS the system can process before the system delays I/O processing.
Example 10-8 Creating a throttle using the mkthrottle command in the CLI
Syntax:
 
mkthrottle -type [offload | vdisk | host | hostcluster | mdiskgrp]
[-bandwidth bandwidth_limit_in_mb]
[-iops iops_limit]
[-name throttle_name]
[-vdisk vdisk_id_or_name]
[-host host_id or name]
[-hostcluster hostcluster_id or name]
[-mdiskgrp mdiskgrp_id or name]
 
Usage examples:
IBM_2145:IBM Redbook SVC:superuser>mkthrottle -type host -bandwidth 100 -host ITSO_HOST3
IBM_2145:IBM Redbook SVC:superuser>mkthrottle -type hostcluster -iops 30000 -hostcluster ITSO_HOSTCLUSTER1
IBM_2145:IBM Redbook SVC:superuser>mkthrottle -type mdiskgrp -iops 40000 -mdiskgrp 0
IBM_2145:IBM Redbook SVC:superuser>mkthrottle -type offload -bandwidth 50
IBM_2145:IBM Redbook SVC:superuser>mkthrottle -type vdisk -bandwidth 25 -vdisk volume1
 
IBM_2145:IBM Redbook SVC:superuser>lsthrottle
throttle_id throttle_name object_id object_name       throttle_type IOPs_limit bandwidth_limit_MB
0           throttle0     2         ITSO_HOST3        host                     100
1           throttle1     0         ITSO_HOSTCLUSTER1 hostcluster   30000
2           throttle2     0         Pool0             mdiskgrp      40000
3           throttle3                                 offload                  50
4           throttle4     10        volume1           vdisk                    25
 
Note: You can change a throttle parameter by using the chthrottle command.
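For example, the following sketch adjusts and removes the throttles that were created in Example 10-8; the throttle names come from that example output:
chthrottle -bandwidth 200 throttle0     # raise the bandwidth limit of the host throttle to 200 MBps
chthrottle -iops 50000 throttle2        # change the IOPS limit of the storage pool throttle
rmthrottle throttle3                    # remove the offload throttle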
10.11.7 Configuring I/O throttling using the GUI
In this section, we describe how to configure the throttle by using the management GUI.
Creating a volume throttle
To create a volume throttle, go to Volumes → Volumes. Then, select the wanted volume, right-click it, and choose Edit Throttle, as shown in Figure 10-15. The bandwidth can be set from 1 MBps - 256 TBps, and the IOPS can be set from 1 - 33,554,432.
Figure 10-15 Creating a volume throttle in the GUI
If a throttle exists, the dialog box that is shown in Figure 10-15 also shows a Remove button that is used to delete the throttle.
Creating a host throttle
To create a host throttle, go to Hosts → Hosts and select the wanted host. Then, right-click it and choose Edit Throttle, as shown in Figure 10-16.
Figure 10-16 Creating a host throttle in the GUI
Creating a host cluster throttle
To create a host cluster throttle, go to Hosts → Host Clusters and select the wanted host cluster. Then, right-click it and choose Edit Throttle, as shown in Figure 10-17.
Figure 10-17 Creating a host cluster throttle in the GUI
Creating a storage pool throttle
To create a storage pool throttle, go to Pools → Pools and select the wanted storage pool. Then, right-click it and choose Edit Throttle, as shown in Figure 10-18.
Figure 10-18 Creating a storage pool throttle in the GUI
Creating an offload throttle
To create an offload throttle, go to Monitoring → System Hardware → System Actions. Then, select Edit System Offload Throttle, as shown in Figure 10-19.
Figure 10-19 Creating system offload throttle in the GUI
10.12 Automation
Automation is a priority for maintaining today’s busy storage environments. Automation software allows the creation of repeatable sets of instructions and processes that reduce the need for human interaction with computer systems. Red Hat Ansible and other third-party automation tools are increasingly used across enterprise IT environments, so it is no surprise that their use in storage environments is becoming more popular.
10.12.1 Red Hat Ansible
IBM SAN Volume Controller family includes integration with Red Hat Ansible Automation Platform, allowing IT to create an Ansible playbook that automates repetitive tasks across an organization in a consistent way, which helps improve outcomes and reduce errors.
Ansible is an agentless automation management tool that uses the SSH protocol. Currently, Ansible can be run from any machine with Python 2 (version 2.7) or Python 3 (version 3.5 or higher) installed, including Red Hat, Debian, CentOS, macOS, and any of the BSDs. Windows is not supported as the Ansible control node.
IBM is a Red Hat certified support module vendor. The IBM Spectrum Virtualize Ansible Collection provides simple management through modules for the following tasks:
Collect facts: Collect basic information including hosts, host groups, snapshots, consistency groups, and volumes
Manage:
 – Hosts: Create, delete, or modify hosts
 – Volumes: Create, delete, or extend the capacity of volumes
 – MDisk: Create or delete a managed disk
 – Pool: Create or delete a pool (managed disk group)
 – Volume map: Create or delete a volume map
 – Consistency group snapshot: Create or delete consistency group snapshots
 – Snapshot: Create or delete snapshots
 – Volume clones: Create or delete volume clones
This collection provides a series of Ansible modules and plug-ins for interacting with the IBM Spectrum Virtualize family storage systems. The modules in the IBM Spectrum Virtualize Ansible collection use the Representational State Transfer (REST) application programming interface (API) to connect to the IBM Spectrum Virtualize storage system. These storage systems include the IBM SAN Volume Controller; the IBM FlashSystem family, including FlashSystem 5010, 5030, 5100, 7200, 9100, 9200, and 9200R; and IBM Spectrum Virtualize for Public Cloud.
For more information, see Automate and Orchestrate® Your IBM FlashSystem Hybrid Cloud with Red Hat Ansible, REDP-5598.
For IBM Spectrum Virtualize modules, Ansible version 2.9 or higher is required. For more information about IBM Spectrum Virtualize modules, see Ansible Collections for IBM Spectrum Virtualize.
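As a starting point, the collection can be installed from Ansible Galaxy on the control node. The collection name shown here reflects the name that was published at the time of writing; verify it against the link above:
ansible-galaxy collection install ibm.spectrum_virtualize    # install the IBM Spectrum Virtualize collection on the control node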
10.12.2 RESTful API
The Spectrum Virtualize REST model API consists of command targets that are used to retrieve system information and to create, modify, and delete system resources. These command targets allow command parameters to pass through unedited to the Spectrum Virtualize command-line interface, which handles parsing parameter specifications for validity and error reporting. Hypertext Transfer Protocol Secure (HTTPS) is used to communicate with the RESTful API server.
To interact with the storage system by using the RESTful API, use the curl utility (see https://curl.se) to make an HTTPS command request with a valid configuration node URL destination. Open TCP port 7443 and include the keyword rest, followed by the Spectrum Virtualize target command that you want to run.
Each curl command takes the following form:
curl -k -X POST -H <header_1> -H <header_2> ... -d <JSON input> https://SVC_ip_address:7443/rest/target
Where the following definitions apply:
POST is the only HTTPS method that the Spectrum Virtualize RESTful API supports.
Headers <header_1> and <header_2> are individually-specified HTTP headers (for example, Content-Type and X-AuthUsername).
-d is followed by the JSON input; for example, '{"raid_level": "raid5"}'.
<SVC_ip_address> is the IP address of the IBM SAN Volume Controller to which you are sending requests.
<target> is the target object of commands, which includes any object IDs, names, and parameters.
Authentication
Aside from data encryption, the HTTPS server requires authentication of a valid user name and password for each API session. Use two authentication header fields to specify your credentials: X-Auth-Username and X-Auth-Password.
Initial authentication requires that you POST the authentication target (/auth) with the user name and password. The RESTful API server returns a hexadecimal token. A single session lasts a maximum of two active hours or 30 inactive minutes, whichever occurs first.
When your session ends because of inactivity, or if you reach the maximum time that is allotted, error code 403 indicates the loss of authorization. Use the /auth command target to reauthenticate with the user name and password.
The following example shows the correct procedure for authenticating. You authenticate by first producing an authentication token and then use that token in all future commands until the session ends.
For example, the following command passes the authentication command to IBM SAN Volume Controller node IP address 192.168.10.20 at port 7443:
curl -k -X POST -H 'Content-Type: application/json' -H 'X-Auth-Username: superuser' -H 'X-Auth-Password: passw0rd' https://192.168.10.20:7443/rest/auth
 
Note: Make sure that you format the request correctly by using spaces after each colon in each header; otherwise, the command fails.
This request yields an authentication token, which can be used for all subsequent commands; for example:
{"token": "38823f60c758dca26f3eaac0ffee42aadc4664964905a6f058ae2ec92e0f0b63"}
Example command
Most actions must be taken only after authentication. The following example of creating an array demonstrates the use of the previously generated token in place of the authentication headers that are used in the authentication process.
curl -k -X POST -H 'Content-Type: application/json' -H 'X-Auth-Token: 38823f60c758dca26f3eaac0ffee42aadc4664964905a6f058ae2ec92e0f0b63' -d '{"level": "raid5", "drive": "6:7:8:9:10", "raid6grp"}' https://192.168.10.20:7443/rest/mkarray
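After you authenticate, any other command target can be called in the same way. The following sketch queries basic system information by reusing the token from the previous example:
curl -k -X POST -H 'Content-Type: application/json' -H 'X-Auth-Token: 38823f60c758dca26f3eaac0ffee42aadc4664964905a6f058ae2ec92e0f0b63' https://192.168.10.20:7443/rest/lssystem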
For more information about the RESTful API, see the following resources:
10.13 Documenting IBM SAN Volume Controller and SAN environment
This section focuses on the challenge of automating the documentation that is needed for an IBM SAN Volume Controller solution. Consider the following points:
Several methods and tools are available to automate the task of creating and updating the documentation, so much of this task can be handled by your IT infrastructure.
Planning is key to maintaining sustained and organized growth. Accurate documentation of your storage environment is the blueprint with which you plan your approach to short-term and long-term storage growth.
Your storage documentation must be conveniently available and easy to consult when needed. For example, you might need to determine how to replace your core SAN directors with newer ones, or how to fix the disk path problems of a single server. The relevant documentation might consist of a few spreadsheets and a diagram.
Remember to include photographs in the documentation, where suitable.
 
Storing documentation: Avoid storing IBM SAN Volume Controller and SAN environment documentation only in the SAN. If your organization has a Disaster Recovery (DR) plan, include this storage documentation in it. Follow its guidelines about how to update and store this data. If no DR plan exists and you have the suitable security authorization, it might be helpful to store an updated copy offsite.
In theory, this IBM SAN Volume Controller and SAN environment documentation is written at a level that is sufficient for any system administrator who has average skills in the products to understand. Make a copy that includes all your configuration information.
Use the copy to create a functionally equivalent copy of the environment by using similar hardware without any configuration, off-the-shelf media, and configuration backup files. You might need the copy if you ever face a DR scenario, which is also why it is so important to run periodic DR tests.
Create the first version of this documentation (“as-built documentation”) as you install your solution. If you completed forms to help plan the installation of your IBM SAN Volume Controller solution, use these forms to help you document how your IBM SAN Volume Controller solution was first configured. Minimum documentation is needed for an IBM SAN Volume Controller solution. Because you might have more business requirements that require other data to be tracked, remember that the following sections do not address every situation.
10.13.1 Naming conventions
Whether you are creating your IBM SAN Volume Controller and SAN environment documentation, or you are updating what is in place, first evaluate whether you have a good naming convention in place. With a good naming convention, you can quickly and uniquely identify the components of your IBM SAN Volume Controller and SAN environment. System administrators can then determine whether a name belongs to a volume, storage pool, MDisk, host, or HBA by looking at it.
Because error messages often point to the device that generated an error, a good naming convention quickly highlights where to start investigating when an error occurs. Typical IBM SAN Volume Controller and SAN component names limit the number and type of characters you can use. For example, IBM SAN Volume Controller names are limited to 63 characters, which makes creating a naming convention a bit easier.
Many names in IBM SAN Volume Controller and SAN environment can be modified online. Therefore, you do not need to worry about planning outages to implement your new naming convention. The naming examples that are used in the following sections are effective in most cases, but might not be fully adequate for your environment or needs. The naming convention to use is your choice, but you must implement it in the entire environment.
Enclosures, nodes, and external storage controllers
IBM SAN Volume Controller names its internal nodes nodeX, where X is a sequential decimal number. These numbers range from 2 - 8 in a four-node IBM SAN Volume Controller clustered system.
If multiple external controllers are attached to your IBM SAN Volume Controller solution, these controllers are detected as controllerX; therefore, you might need to change the name so that it includes, for example, the vendor name, the model, or its serial number. Therefore, if you receive an error message that points to controllerX, you do not need to log in to IBM SAN Volume Controller to know which storage controller to check.
 
Note: An IBM SAN Volume Controller detects external controllers that are based on their worldwide node name (WWNN). If you have an external storage controller that has one WWNN for each worldwide port name (WWPN), this configuration might lead to many controllerX names pointing to the same physical box. In this case, prepare a naming convention to cover this situation.
MDisks and storage pools
When an IBM SAN Volume Controller detects new MDisks, it names them by default as mdiskXX, where XX is a sequential number. You should change this default name to something more meaningful. MDisks are arrays (DRAID) from internal storage or volumes from an external storage system.
Ultimately, it comes down to personal preference and what works in your environment. The main “convention” that you must follow is to avoid the use of special characters in names, apart from the underscore, the hyphen, and the period (which are permitted), and spaces (which can make scripting difficult).
For example, you can change it to include the following information:
For internal MDisks, refer to the IBM SAN Volume Controller system or cluster name.
A reference to the external storage controller to which it belongs to (such as its serial number or last digits).
The extpool, array, or RAID group that it belongs to in the storage controller.
The LUN number or name it has in the storage controller.
Consider the following examples of MDisk names with this convention:
FS9200CL01-MD03, where FS9200CL01 is the system or cluster name, and MD03 is the MDisk name.
23K45_A7V10, where 23K45 is the serial number, 7 is the array, and 10 is the volume.
75VXYZ1_02_0206, where 75VXYZ1 is the serial number, 02 is the extpool, and 0206 is the LUN.
Storage pools have several different possibilities. One possibility is to include the storage controller, type of back-end disks if external, RAID type, and sequential digits. If you use dedicated pools for specific applications or servers, another possibility is to use them instead.
Consider the following examples:
FS9200-POOL01, where FS9200 is the system or cluster name, and POOL01 is the pool.
P05XYZ1_3GR5, where the pool is Pool 05 from serial 75VXYZ1, with 300 GB FC DDM LUNs in RAID 5.
P16XYZ1_EX01, where the pool is Pool 16 from serial 75VXYZ1 and is dedicated to Exchange Mail servers.
XIV01_F9H02_ET, where the pool contains disks from the XIV named XIV01 and the FlashSystem 900 F9H02, which are both managed by Easy Tier.
Volumes
Volume names should include the following information:
The host or cluster to which the volume is mapped.
A single letter that indicates its usage by the host, as shown in the following examples:
 – B: For a boot disk, or R for a rootvg disk (if the server boots from SAN)
 – D: For a regular data disk
 – Q: For a cluster quorum disk (do not confuse with IBM SAN Volume Controller quorum disks)
 – L: For a database log disk
 – T: For a database table disk
A few sequential digits, for uniqueness.
A naming standard for VMware datastores (in these examples, sessions is the datastore name):
 – esx01-sessions-001: For a datastore composed of a single volume
 – esx01-sessions-001a and esx01-sessions-001b: For a datastore composed of 2 volumes
For example, ERPNY01-T03 indicates a volume that is mapped to server ERPNY01 and database table disk 03.
Hosts
In today’s environment, administrators deal with large networks, the internet, and cloud computing. Use good server naming conventions so that they can quickly identify a server and determine the following information:
Where it is (to know how to access it).
What kind it is (to determine the vendor and support group in charge).
What it does (to engage the proper application support and notify its owner).
Its importance (to determine the severity if problems occur).
Changing a server’s name in IBM SAN Volume Controller is as simple as changing any other IBM SAN Volume Controller object name. However, changing the name on the operating system of a server might have implications for application configuration or DNS and might require a server reboot. Therefore, you might want to prepare a detailed plan if you decide to rename several servers in your network.
The following example is for a server naming convention of LLAATRFFNN where:
LL is the location, which might designate a city, data center, building floor, or room.
AA is a major application; for example, billing, ERP, and Data Warehouse.
T is the type; for example, UNIX, Windows, and VMware.
R is the role; for example, Production, Test, Q&A, and Development.
FF is the function; for example, DB server, application server, web server, and file server.
NN is numeric.
SAN aliases and zones
SAN aliases often must reflect only the device and port that is associated to it. Including information about where one specific device port is physically attached on the SAN might lead to inconsistencies if you make a change or perform maintenance and then forget to update the alias. Create one alias for each device port WWPN in your SAN and use these aliases in your zoning configuration.
Consider the following examples:
AIX_NYBIXTDB02_FC2: Interface fcs2 of AIX server NYBIXTDB02.
LIN-POKBIXAP01-FC1: Interface fcs1 of Linux Server POKBIXAP01.
WIN_EXCHSRV01_HBA1: Interface HBA1 of physical Windows server EXCHSRV01.
ESX_NYVMCLUSTER01_VMHBA2: Interface vmhba2 of ESX server NYVMCLUSTER01.
IBM-NYFS9200-N1P1_HOST: Port 1 of Node 1 from FS9200 Cluster NYFS9200 dedicated for hosts.
IBM-NYFS9200-N1P5_INTRA: Port 5 of Node 1 from FS9200 Cluster NYFS9200 dedicated to intracluster traffic.
IBM-NYFS9200-N1P7_REPL: Port 7 of Node 1 from FS9200 Cluster NYFS9200 dedicated to replication.
Be mindful of the IBM SAN Volume Controller port aliases. There are mappings between the last digits of the port WWPN and the node FC port.
IBM_D88870_75XY131_I0301: DS8870 serial number75XY131, port I0301.
TS4500-TD06: TS4500 tape library, tape drive 06.
EMC_VNX7500_01_SPA2: EMC VNX7500 hostname VNX7500_01, SP A, port 2.
If your SAN does not support aliases, for example, in heterogeneous fabrics with switches in some interoperation modes, use WWPNs in your zones. However, remember to update every zone that uses a WWPN if you ever change it.
Have your SAN zone name reflect the devices in the SAN it includes (normally in a one-to-one relationship) as shown in the following examples:
SERVERALIAS_T1_FS9200CLUSTERNAME (from a server to the IBM FlashSystem 9200, where you use T1 as an identifier to zones that uses, for example, node ports P1 on Fabric A, and P2 on Fabric B).
SERVERALIAS_T2_FS9200CLUSTERNAME (from a server to the IBM FlashSystem 9200, where you use T2 as an identifier to zones that uses, for example, node ports P3 on Fabric A, and P4 on Fabric B).
IBM_DS8870_75XY131_FS9200CLUSTERNAME (zone between an external back-end storage and the IBM FlashSystem 9200).
NYC_FS9200_POK_FS9200_REPLICATION (for Remote Copy services).
10.13.2 SAN fabric documentation
The most basic piece of SAN documentation is a SAN diagram. It is likely to be one of the first pieces of information you need if you ever seek support from your SAN switches vendor. Also, a good spreadsheet with ports and zoning information eases the task of searching for detailed information, which, if included in the diagram, makes the diagram easier to use.
Brocade SAN Health
The Brocade SAN Health Diagnostics Capture tool is a no-cost, automated tool that can help you maintain this documentation. SAN Health consists of a data collection tool that logs in to the SAN switches that you specify and collects data by using standard SAN switch commands. The tool then creates a compressed file with the collected data, which you send to Brocade for automated processing by secure web upload or email.
After some time (typically a few hours), you receive an email with instructions about how to download the report. The report includes a Visio diagram of your SAN and an organized Microsoft Excel spreadsheet that contains all your SAN information. For more information and to download the tool, see this web page.
The first time that you use the SAN Health Diagnostics Capture tool, explore the options that are provided to learn how to create a well-organized and useful diagram.
Figure 10-20 on page 487 shows an example of a poorly formatted diagram.
Figure 10-20 A poorly formatted SAN diagram
Figure 10-21 shows a tab of the SAN Health Options window in which you can choose the format of SAN diagram that best suits your needs. Depending on the topology and size of your SAN fabrics, you might want to manipulate the options in the Diagram Format or Report Format tabs.
Figure 10-21 Brocade SAN Health Options window
SAN Health supports switches from manufacturers other than Brocade, such as Cisco. Both the data collection tool download and the processing of files are available at no cost. You can download Microsoft Visio and Excel viewers at no cost from the Microsoft website.
Another tool, which is known as SAN Health Professional, is also available for download at no cost. With this tool, you can audit the reports in detail by using advanced search functions and inventory tracking. You can configure the SAN Health Diagnostics Capture tool as a Windows scheduled task.
This tool is available for download at this web page.
 
Tip: Regardless of the method that is used, generate a fresh report at least once a month or after any major changes are made. Keep previous versions so that you can track the evolution of your SAN.
IBM Spectrum Control reporting
If you have IBM Spectrum Control running in your environment, you can use it to generate reports on your SAN. For more information about how to configure and schedule IBM Spectrum Control reports, see this IBM Documentation web page.
Also, see Chapter 9, “Implementing a storage monitoring system” on page 373, for more information about how to configure and set up Spectrum Control.
Ensure that the reports that you generate include all of the information that you need. Schedule the reports frequently enough that you can backtrack any changes that you make.
10.13.3 IBM SAN Volume Controller documentation
You can back up the configuration data for an IBM SAN Volume Controller system after preliminary tasks are completed. Configuration data for the system provides information about your system and the objects that are defined in it. It also contains the configuration data of arrays, pools, volumes, and so on. The backup does not contain any data from the volumes themselves.
Before you back up your configuration data, the following prerequisites must be met:
No independent operations that change the configuration for the system can be running while the backup command is running.
No object name can begin with an underscore character (_).
 
Note: The system automatically creates a backup of the configuration data each day at 1 AM. This backup is known as a cron backup and is written to /dumps/svc.config.cron.xml_<serial#> on the configuration node.
Complete the following steps to generate a manual backup at any time:
1. Run the svcconfig backup command to back up your configuration. The command displays messages similar to the messages that are shown in Example 10-9.
Example 10-9 Sample svcconfig backup command output
IBM_2145:IBM Redbook SVC:superuser>svcconfig backup
..................................................................................
..................................................................................
............................................................................
CMMVC6155I SVCCONFIG processing completed successfully
The svcconfig backup command creates three files that provide information about the backup process and the configuration. These files are created in the /tmp directory and copied to the /dumps directory of the configuration node. You can use the lsdumps command to list them. Table 10-3 lists the three files that are created by the backup process.
Table 10-3 Files created by the backup process
File name                             Description
svc.config.backup.xml_<serial#>       Contains your configuration data.
svc.config.backup.sh_<serial#>        Contains the names of the commands that were issued to create the backup of the system.
svc.config.backup.log_<serial#>       Contains details about the backup, including any reported errors or warnings.
2. Check that the svcconfig backup command completes successfully and examine the command output for any warnings or errors. The following output is an example of the message that is displayed when the backup process is successful:
CMMVC6155I SVCCONFIG processing completed successfully
3. If the process fails, resolve the errors and run the command again.
4. Keep backup copies of the files outside the system to protect them against a system hardware failure. With Microsoft Windows, use the PuTTY pscp utility. With UNIX or Linux, you can use the standard scp utility. By using the -unsafe option, you can use a wildcard to download all the svc.config.backup files with a single command. Example 10-10 shows the output of the pscp command; a small automation sketch follows the example.
Example 10-10 Saving the configuration backup files to your workstation
C:>
pscp -unsafe superuser@<cluster_ip>:/dumps/svc.config.backup.* C:
Using keyboard-interactive authentication.
Password:
svc.config.backup.log_78E | 33 kB | 33.6 kB/s | ETA: 00:00:00 | 100%
svc.config.backup.sh_78E0 | 13 kB | 13.9 kB/s | ETA: 00:00:00 | 100%
svc.config.backup.xml_78E | 312 kB | 62.5 kB/s | ETA: 00:00:00 | 100%
C:>
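On UNIX or Linux, this step can be scripted and scheduled with cron so that an up-to-date copy is always available outside the system. The following minimal sketch assumes key-based SSH access for the superuser account; the cluster address and target directory are placeholders.

#!/bin/sh
# Sketch: trigger a configuration backup and copy the resulting files into a
# date-stamped directory outside the system. Address and paths are placeholders.
CLUSTER=superuser@cluster_ip
TARGET=/backup/svcconfig/$(date +%Y%m%d)

mkdir -p "$TARGET"
ssh "$CLUSTER" 'svcconfig backup'
scp "$CLUSTER:/dumps/svc.config.backup.*" "$TARGET/"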
The configuration backup file is in XML format and can be inserted as an object into your IBM SAN Volume Controller documentation spreadsheet. The configuration backup file can be quite large because, for example, it contains information about each internal storage drive that is installed in the system.
 
Note: Directly importing the file into your IBM SAN Volume Controller documentation spreadsheet might make it unreadable.
Also consider collecting the output of specific commands. At a minimum, collect the output of the following commands:
svcinfo lsfabric
svcinfo lssystem
svcinfo lsmdisk
svcinfo lsmdiskgrp
svcinfo lsvdisk
svcinfo lshost
svcinfo lshostvdiskmap
 
Note: Most of these CLI commands work without the svcinfo prefix; however, some commands do not work with only the short name and require the svcinfo prefix.
Import the commands into the master spreadsheet, preferably with the output from each command on a separate sheet.
One way to automate either task is to first create a batch file (Windows), shell script (UNIX or Linux), or playbook (Ansible) that collects and stores this information (a minimal collection sketch follows the practices below). Then, use spreadsheet macros to import the collected data into your IBM SAN Volume Controller documentation spreadsheet.
When you are gathering IBM SAN Volume Controller information, consider the following preferred practices:
If you are collecting the output of specific commands, use the -delim option of these commands to make their output delimited by a character other than tab, such as comma, colon, or exclamation mark. You can import the temporary files into your spreadsheet in comma-separated values (CSV) format, specifying the same delimiter.
 
Note: It is important to use a delimiter that is not part of the command output. Commas can appear in the output when a field contains a list, and colons appear in special fields such as IPv6 addresses, WWPNs, or iSCSI names.
If you are collecting the output of specific commands, save the output to temporary files. To make your spreadsheet macros simpler, you might want to preprocess the temporary files and remove any “garbage” or unwanted lines or columns. With UNIX or Linux, you can use commands, such as grep, sed, and awk. Freeware software is available for Windows with the same commands, or you can use any batch text editor tool.
The objective is to fully automate this procedure so you can schedule it to run automatically on a regular basis. Make the resulting spreadsheet easy to consult and have it contain only the information that you use frequently. The automated collection and storage of configuration and support data (which is typically more extensive and difficult to use) is described in 10.13.7, “Automated support data collection” on page 493.
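The following minimal sketch combines these practices. It assumes key-based SSH access and uses placeholder values for the cluster address and output directory; adapt the command list and delimiter to your needs.

#!/bin/sh
# Sketch: collect the minimum set of svcinfo command outputs as comma-delimited
# files that can be imported into the documentation spreadsheet, one per sheet.
CLUSTER=superuser@cluster_ip
OUTDIR=/doc/svc/$(date +%Y%m%d)

mkdir -p "$OUTDIR"
for CMD in lsfabric lssystem lsmdisk lsmdiskgrp lsvdisk lshost lshostvdiskmap; do
    ssh "$CLUSTER" "svcinfo $CMD -delim ," > "$OUTDIR/$CMD.csv"
done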
10.13.4 Storage documentation
Generate documentation of your back-end storage controllers after they are configured, and update it whenever these controllers receive hardware or code updates. Because this documentation changes only at those times, there is little point in automating it. The same applies to the IBM SAN Volume Controller internal drives and enclosures.
Any portion of your external storage controllers that is used outside the IBM SAN Volume Controller solution might have its configuration changed frequently. In this case, see your back-end storage controller documentation for more information about how to gather and store the information that you need.
Fully allocate all of the available space in any optional external storage controllers that you use as extra back-end storage for the IBM SAN Volume Controller solution. This way, you can perform all your disk storage management tasks by using the IBM SAN Volume Controller user interface.
10.13.5 Technical support information
If you must open a technical support incident for your storage and SAN components, create and keep available a spreadsheet with all relevant information for all storage administrators. This spreadsheet includes the following information:
Hardware:
 – Vendor, machine and model number, serial number (example: IBM 2145-SV2 S/N 7812345)
 – Configuration, if applicable
 – Current code level
Physical location:
 – Data center, including the complete street address and phone number
 – Equipment physical location (room number, floor, tile location, and rack number)
 – Vendor’s security access information or procedure, if applicable
 – Onsite person’s contact name and phone or page number
Support contract:
 – Vendor contact phone numbers and website
 – Customer’s contact name and phone or page number
 – User ID to the support website, if applicable
 – Do not store the password in the spreadsheet under any circumstances
 – Support contract number and expiration date
By keeping this data on a spreadsheet, storage administrators have all the information that they need to complete a web support request form or to provide to a vendor’s call support representative. Typically, you are asked first for a brief description of the problem and then asked later for a detailed description and support data collection.
10.13.6 Tracking incident and change tickets
If your organization uses an incident and change management and tracking tool (such as IBM Tivoli Service Request Manager®), you or the storage administration team might need to develop proficiency in its use for the following reasons:
If your storage and SAN equipment are not configured to send SNMP traps to this incident management tool, manually open incidents whenever an error is detected.
The IBM SAN Volume Controller can be managed by using the IBM Storage Insights (SI) tool, which is available at no charge to owners of IBM storage systems. With the SI tool, you can monitor all of the IBM storage devices that are registered in SI.
Disk storage allocation and deallocation and SAN zoning configuration modifications should be handled under properly submitted and approved change requests.
If you are handling a problem yourself, or calling your vendor’s technical support, you might need to produce a list of the changes that you recently implemented in your SAN or that occurred since the documentation reports were last produced or updated.
When you use incident and change management tracking tools, adhere to the following guidelines for IBM SAN Volume Controller and SAN Storage Administration:
Whenever possible, configure your storage and SAN equipment to send SNMP traps to the incident monitoring tool so that an incident ticket is opened automatically and the suitable alert notifications are sent (see the sketch after this list). If you do not use a monitoring tool in your environment, you might want to configure email alerts that are sent automatically to the mobile phones or pagers of the storage administrators on duty or on call.
Discuss within your organization the risk classification that a storage allocation or deallocation change request should have. These activities are typically safe and nondisruptive to other services and applications when they are handled properly.
However, they have the potential to cause collateral damage if a human error or an unexpected failure occurs during implementation. Your organization might decide to assume more costs with overtime and limit such activities to off-business hours, weekends, or maintenance windows if they assess that the risks to other critical applications are too high.
Use templates for your most common change requests, such as storage allocation or SAN zoning modification, to facilitate and speed up their submission.
Do not open change requests in advance to replace failed, redundant, hot-pluggable parts, such as disk drive modules (DDMs) in storage controllers with hot spares, or SFPs in SAN switches or servers with path redundancy. Typically, these fixes do not change anything in your SAN storage topology or configuration, nor do they cause any more service disruption or degradation than you had when the part failed. Handle these fixes within the associated incident ticket because it might take longer to replace the part if you must submit, schedule, and approve a non-emergency change request.
An exception is if you must interrupt more servers or applications to replace the part. In this case, you must schedule the activity and coordinate support groups. Use good judgment and avoid unnecessary exposure and delays.
Keep handy the procedures to generate reports of the latest incidents and implemented changes in your SAN Storage environment. Typically, you do not need to periodically generate these reports because your organization probably already has a Problem and Change Management group that runs such reports for trend analysis purposes.
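As a hedged illustration of the first guideline, the following commands show how an SNMP notification server might be defined on the system so that error events raise traps in the monitoring tool. The IP address and community string are placeholders; confirm the exact parameters in the CLI reference for your code level.

svctask mksnmpserver -ip 192.0.2.50 -community public
svcinfo lssnmpserver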
10.13.7 Automated support data collection
In addition to the easier-to-use documentation of your IBM SAN Volume Controller and SAN storage environment, collect and retain for a period the configuration files and technical support data collections for all your SAN equipment.
For IBM SAN Volume Controller, this information includes snap data. For other equipment, see the related documentation for more information about how to gather and store the support data that you might need.
You can create procedures that automatically create and store this data on scheduled dates, delete old data, or transfer the data to tape.
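As a rough sketch only, such a procedure might resemble the following script. It assumes that the svc_snap command is available through the CLI at your code level and that key-based SSH access is configured; the cluster address, target directory, and 90-day retention are placeholders, so verify the snap procedure for your code level before relying on it.

#!/bin/sh
# Sketch: generate a snap, copy the snap files off the system, and prune old copies.
CLUSTER=superuser@cluster_ip
TARGET=/support/svc_snaps

mkdir -p "$TARGET"
ssh "$CLUSTER" 'svc_snap'                     # collect a standard snap in /dumps
scp "$CLUSTER:/dumps/snap.*" "$TARGET/"       # copy the snap files off the system
find "$TARGET" -type f -mtime +90 -delete     # delete local copies older than 90 days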
IBM Storage Insights can also be used to create support tickets and then attach the snap data to the record from within the SI GUI. For more information, see Chapter 11, “Troubleshooting and diagnostics” on page 495.
10.13.8 Subscribing to IBM SAN Volume Controller support
Subscribing to IBM SAN Volume Controller support is perhaps the most overlooked practice in IT administration, and yet it is the most efficient way to stay ahead of problems. With this subscription, you receive notifications about potential threats before they reach you and cause severe service outages.
For more information about subscribing to this support and receiving support alerts and notifications for your products, see this IBM Support web page. (Create an IBM ID if you do not have one.)
In a similar way, subscribe to notifications from each vendor of your storage and SAN equipment, not only IBM. You can often quickly determine whether an alert or notification is applicable to your SAN storage. Therefore, open alerts and notifications when you receive them and keep them in a dedicated folder of your mailbox.
Sign up and tailor the requests and alerts that you want to receive. For example, enter IBM SAN Volume Controller in the Product lookup text box and then click Subscribe to subscribe to SAN Volume Controller notifications, as shown in Figure 10-22.
Figure 10-22 Creating a subscription to IBM SAN Volume Controller notifications
 