Introduction and system overview
This chapter defines the concept of storage virtualization and provides an overview of its application in addressing the challenges of modern storage environments. It also provides an overview of each of the products that make up the IBM SAN Volume Controller family.
This chapter includes the following topics:
1.1 Storage virtualization terminology
Storage virtualization is a term that is used extensively throughout the storage industry. It can be applied to various technologies and underlying capabilities. In reality, most storage devices technically can claim to be virtualized in one form or another. Therefore, this chapter starts by defining the concept of storage virtualization as it is used in this book.
We describe storage virtualization in the following way:
Storage virtualization is a technology that makes one set of resources resemble another set of resources, preferably with more desirable characteristics.
Storage virtualization is a logical representation of resources that is not constrained by physical limitations and hides part of the complexity. It also adds or integrates new functions with services, and can be nested or applied to multiple layers of a system.
The virtualization model consists of the following layers:
Application: The user of the storage domain.
Storage domain:
 – File, record, and namespace virtualization and file and record subsystem
 – Block virtualization
 – Block subsystem
Applications typically read and write data as vectors of bytes or records. However, storage presents data as vectors of blocks of a constant size (512 bytes per block or, on newer devices, 4096 bytes per block).
The file, record, and namespace virtualization and file and record subsystem layers convert records or files that are required by applications to vectors of blocks, which are the language of the block virtualization layer. The block virtualization layer maps requests of the higher layers to physical storage blocks, which are provided by storage devices in the block subsystem.
Each of the layers in the storage domain abstracts away complexities of the lower layers and hides them behind an easy-to-use, standard interface that is presented to upper layers. The resultant decoupling of logical storage space representation and its characteristics that are visible to servers (storage consumers) from underlying complexities and intricacies of storage devices is a key concept of storage virtualization.
The focus of this publication is block-level virtualization at the block virtualization layer, which is implemented by IBM as IBM Spectrum Virtualize Software that is running on IBM SAN Volume Controller and the IBM FlashSystem family. The IBM SAN Volume Controller is implemented as a clustered appliance in the storage network layer. The IBM FlashSystems are deployed as modular storage systems that can virtualize their internally and externally attached storage.
IBM Spectrum Virtualize uses the Small Computer System Interface (SCSI) protocol to communicate with its clients and presents storage space as SCSI logical units (LUs), which are identified by SCSI logical unit numbers (LUNs).
 
Note: Although LUs and LUNs are different entities, the term LUN in practice is often used to refer to a logical disk, that is, an LU.
Although most applications do not directly access storage but work with files or records, the operating system (OS) of a host must convert these abstractions to the language of storage, that is, vectors of storage blocks that are identified by logical block addresses (LBAs) within an LU.
Inside IBM Spectrum Virtualize, each of the externally visible LUs is internally represented by a volume, which is an amount of storage that is taken out of a storage pool. Storage pools are made of managed disks (MDisks): LUs that are presented to the storage system by external virtualized storage, or arrays that consist of internal drives. LUs that are presented to IBM Spectrum Virtualize by external storage usually correspond to RAID arrays that are configured on that storage.
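The following minimal CLI sketch illustrates how these objects relate: MDisks are discovered, grouped into a storage pool, and a volume is carved from the pool's extents. The object names (Pool0, vol0, mdisk0, mdisk1) and sizes are hypothetical, and exact command options can vary by code level, so treat this as a sketch rather than a definitive procedure:

detectmdisk                                              # discover LUs presented by back-end storage as MDisk candidates
lsmdisk                                                  # list the MDisks and their current modes
mkmdiskgrp -name Pool0 -ext 1024 -mdisk mdisk0:mdisk1    # create a storage pool with a 1 GiB extent size
mkvdisk -mdiskgrp Pool0 -iogrp 0 -size 100 -unit gb -name vol0   # create a 100 GiB volume from the pool
lsvdiskextent vol0                                       # show which MDisks contribute extents to the volume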
The hierarchy of objects, from a file system block down to a physical block on a physical drive, is shown in Figure 1-1.
Figure 1-1 Block-level virtualization overview
With storage virtualization, you can manage the mapping between logical blocks within an LU that is presented to a host and blocks on physical drives. This mapping can be as simple or as complicated as required by a use case. A logical block can be mapped to one physical block or, for increased availability, to multiple blocks that are physically stored on different physical storage systems and in different geographical locations.
Importantly, the mapping can be dynamic: with IBM Easy Tier®, IBM Spectrum Virtualize can automatically change the underlying storage to which groups of blocks (extents) are mapped to better match a host’s performance requirements with the capabilities of the underlying storage systems.
IBM Spectrum Virtualize gives a storage administrator a wide range of options to modify volume characteristics: from volume resize to mirroring, creating a point-in-time (PiT) copy with IBM FlashCopy®, and migrating data across physical storage systems. Importantly, all the functions that are presented to the storage users are independent from the characteristics of the physical devices that are used to store data. This decoupling of the storage feature set from the underlying hardware and ability to present a single, uniform interface to storage users that masks underlying system complexity is a powerful argument for adopting storage virtualization with IBM Spectrum Virtualize.
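The following hedged CLI sketch illustrates a few of these volume operations: an online resize, adding a mirrored copy, creating and starting a FlashCopy mapping, and a pool-to-pool migration. These are independent examples rather than a single sequence; the volume and pool names are hypothetical, the FlashCopy target volume is assumed to exist and be the same size as the source, and options can vary by release:

expandvdisksize -size 10 -unit gb vol0     # grow the volume by 10 GiB online
addvdiskcopy -mdiskgrp Pool1 vol0          # add a second, mirrored copy of the volume in another pool
mkfcmap -source vol0 -target vol0_pit      # define a point-in-time FlashCopy mapping
startfcmap -prep fcmap0                    # prepare and start the mapping that mkfcmap created
migratevdisk -vdisk vol0 -mdiskgrp Pool1   # migrate the volume to another pool, transparently to the host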
Storage virtualization is implemented on many layers. Figure 1-1 on page 3 shows an example where a file system block is mirrored by the host’s OS (left side of the figure) by using features of the logical volume manager (LVM) or the IBM Spectrum Virtualize system at the storage pool level (as shown on the right side of Figure 1-1 on page 3). Although the result is similar (the data block is written to two different arrays), the effort that is required for per-host configuration is disproportionately larger than for a centralized solution with organization-wide storage virtualization that is done on a dedicated system and managed from a single GUI.
IBM Spectrum Virtualize includes the following key features:
Simplified storage management by providing a single management interface for multiple storage systems and a consistent user interface for provisioning heterogeneous storage.
Online volume migration. IBM Spectrum Virtualize enables moving the data from one set of physical drives to another set in a way that is not apparent to the storage users and without over-straining the storage infrastructure. The migration can be done within a specific storage system (from one set of disks to another set) or across storage systems. Either way, the host that uses the storage is not aware of the operation, and no downtime for applications is needed.
Enterprise-level Copy Services functions. Performing Copy Services functions within IBM Spectrum Virtualize removes dependencies on the capabilities and interoperability of the virtualized storage subsystems. Therefore, it enables the source and target copies to be on any two virtualized storage subsystems.
Improved storage space usage because of the pooling of resources across virtualized storage systems.
Opportunity to improve system performance as a result of volume striping across multiple virtualized arrays or controllers, and the benefits of cache that is provided by IBM Spectrum Virtualize hardware.
Improved data security by using data-at-rest encryption.
Data replication, including replication to cloud storage by using advanced copy services for data migration and backup solutions.
Data reduction techniques, such as thin provisioning, deduplication, and compression for space efficiency, which allow you to store more data on the storage that you already own (see the example after the following note). IBM Spectrum Virtualize can enable significant savings, increase the effective capacity of storage systems up to five times, and decrease the floor space, power, and cooling that are required by the storage system.
Note: IBM Real-time Compression (RtC) is available only on earlier generation engines. The newer IBM SAN Volume Controller engines (SV2 and SA2) do not support RtC; however, they support software compression through Data Reduction Pools (DRPs).
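As a minimal sketch of DRP usage (hypothetical names; the flags assume a system at Version 8.4 and can vary by code level), a Data Reduction Pool and a space-efficient volume might be created as follows:

mkmdiskgrp -name DRPool0 -ext 1024 -datareduction yes -mdisk mdisk2:mdisk3          # create a Data Reduction Pool
mkvolume -name vol_dr0 -pool DRPool0 -size 500 -unit gb -compressed -deduplicated   # thin, compressed, deduplicated volume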
Summary
Storage virtualization is a fundamental technology that enables the realization of flexible and reliable storage solutions. It helps enterprises to better align their IT architecture with business requirements, simplify storage administration, and facilitate their IT departments' efforts to meet business demands.
IBM Spectrum Virtualize running on IBM SAN Volume Controller and IBM FlashSystem family is a mature, 10th-generation virtualization solution that uses open standards and complies with the SNIA storage model. All products use in-band block virtualization engines that move the control logic (including advanced storage functions) from a multitude of individual storage devices to a centralized entity in the storage network.
IBM Spectrum Virtualize can improve the usage of your storage resources, simplify storage management, and improve the availability of business applications.
1.2 Latest changes and enhancements
IBM Spectrum Virtualize 8.4 provides more features and updates to the IBM Spectrum Virtualize family of products, of which IBM SAN Volume Controller is part. The following major software changes in Version 8.4 apply to IBM SAN Volume Controller:
Data Reduction Pool (DRP) improvements:
 – DRP allows for more flexibility, such as multi-tenancy.
 – FlashCopy with redirect-on-write support uses DRP’s internal deduplication referencing capabilities to reduce overhead by creating references instead of copying the data. Redirect-on-write (RoW) is an alternative to the existing copy-on-write (CoW) capability.
 
Note: At the time of this writing, this capability can be used only for deduplicated volumes that are not in mirroring relationships and that are within the same pool and I/O group. The mode selection (RoW or CoW) is automatic and is based on these conditions.
 – Comprestimator is always on, which allows the system to sample each volume at regular intervals. It provides the ability to display the compressibility of the data in the GUI and IBM Storage Insights at any time.
 – RAID Reconstruct Read increases reliability and availability by reducing the chances of DRP going offline because of fixable array issues. It uses RAID capabilities: DRP requests reconstruction of a specific data block when it detects potential corruption.
Expansion of mirrored vDisks (also known as volumes) allows the vDisk capacity to be expanded or reduced online without requiring an offline format and sync. This change improves the availability of the volume for use because the new capacity is available immediately.
Three-site replication with IBM HyperSwap® support provides improved availability for data in three-site implementations. This improvement expands on the Disaster Recovery (DR) capabilities that are inherent in this topology.
 
Note: Three-site replication by using Metro Mirror was supported on version 8.3.1 only in limited installations by way of the RPQ process. With 8.4.0, this implementation is generally available.
Host attachment support with FC-NVMe in HyperSwap systems.
DNS support for LDAP and NTP with full DNS length (that is, 256 characters).
Updates to maximum configuration limits, which double the FlashCopy mapping limit from 5,000 to 10,000 and increase the HyperSwap volume limit from 1,250 to 2,000.
1.3 IBM SAN Volume Controller architecture
This section explains the major concepts underlying IBM SAN Volume Controller. It also describes the architectural overview and the terminologies that are used in a virtualized storage environment. Finally, it introduces the software and hardware components and the other functions that are available with Version 8.4.
1.3.1 IBM SAN Volume Controller architectural overview
IBM SAN Volume Controller is a SAN block aggregation virtualization appliance that is designed for attachment to various host computer systems.
The following major approaches are used today for the implementation of block-level aggregation and virtualization:
Symmetric: In-band appliance
Virtualization splits the storage that is presented by the storage systems into smaller chunks that are known as extents. These extents are then concatenated by using various policies to make virtual disks (vDisks or volumes). With symmetric virtualization, host systems can be isolated from the physical storage. Advanced functions, such as data migration, can run without reconfiguring the host.
With symmetric virtualization, the virtualization engine is the central configuration point for the SAN. The virtualization engine directly controls access to the storage and to the data that is written to the storage. As a result, locking functions that provide data integrity and advanced functions (such as cache and Copy Services) can be run in the virtualization engine. Therefore, the virtualization engine is a central point of control for device and advanced function management.
Symmetric virtualization can have disadvantages. The main disadvantage is scalability, which can lead to poor performance because all input/output (I/O) must flow through the virtualization engine.
To solve this problem, you can use an n-way cluster of virtualization engines that has failover capability.
You can scale the extra processor power, cache memory, and adapter bandwidth to achieve the level of performance that you want. More memory and processing power are needed to run advanced services, such as Copy Services and caching. IBM SAN Volume Controller uses symmetric virtualization. Single virtualization engines, which are known as nodes, are combined to create clusters. Each cluster can contain 2 - 8 nodes.
Asymmetric: Out-of-band or controller-based
With asymmetric virtualization, the virtualization engine is outside the data path and performs a metadata-style service. The metadata server contains all of the mapping and the locking tables, and the storage devices contain only data. In asymmetric virtual storage networks, the data flow is separated from the control flow.
A separate network or SAN link is used for control purposes. Because the control flow is separated from the data flow, I/O operations can use the full bandwidth of the SAN.
Asymmetric virtualization can have the following disadvantages:
 – Data is at risk to increased security exposures, and the control network must be protected with a firewall.
 – Metadata can become complicated when files are distributed across several devices.
 – Each host that accesses the SAN must know how to access and interpret the metadata. Therefore, specific device drivers or agent software must be running on each of these hosts.
 – The metadata server cannot run advanced functions such as caching or Copy Services because it only “knows” about the metadata and not about the data itself.
Figure 1-2 shows variations of the two virtualization approaches.
Figure 1-2 Overview of block-level virtualization architectures
Although these approaches provide essentially the same cornerstones of virtualization, interesting side-effects can occur.
The controller-based approach has high functionality, but it falls short in terms of scalability and upgradeability. Because of the nature of its design, no true decoupling occurs with this approach, which becomes an issue over the lifecycle of the solution when the controller must be replaced. Data migration raises challenging questions, such as how to reconnect the servers to the new controller, and how to reconnect them online without any effect on your applications.
With this approach, you replace a controller and implicitly replace your entire virtualization solution. In addition to replacing the hardware, other actions (such as updating or repurchasing the licenses for the virtualization feature, and advanced copy functions) might be necessary.
With a SAN or fabric-based appliance solution that is based on a scale-out cluster architecture, lifecycle management tasks, such as adding or replacing new disk subsystems or migrating data between them, are simple.
Servers and applications remain online, data migration occurs transparently on the virtualization platform, and licenses for virtualization and copy services require no update. There are no other costs when disk subsystems are replaced.
Only the fabric-based appliance solution provides an independent and scalable virtualization platform with enterprise-class Copy Services that is open to future interfaces and protocols. By using the fabric-based appliance solution, you can choose the disk subsystems that best fit your requirements, and you are not locked into specific SAN hardware.
For these reasons, IBM chose the SAN-based appliance approach with inline block aggregation for the implementation of storage virtualization with IBM Spectrum Virtualize.
IBM SAN Volume Controller includes the following key characteristics:
It is highly scalable, which provides an easy growth path of up to eight nodes (nodes are added in pairs because of the clustering function).
It is SAN interface-independent. It supports Fibre Channel (FC), FC-NVMe, iWARP, RoCE, Fibre Channel over Ethernet (FCoE), and internet Small Computer Systems Interface (iSCSI). It also is open for future enhancements.
It is host-independent for fixed block-based Open Systems environments.
It is external storage system independent, which provides a continuous and ongoing process to qualify more types of storage systems.
Some node models can use internal drives (flash drives) or directly attached disks in expansion enclosures.
On the SAN storage that is provided by the disk subsystems, IBM SAN Volume Controller offers the following services:
Creates a single pool of storage.
Provides LU virtualization.
Manages logical volumes.
Mirrors logical volumes.
IBM SAN Volume Controller running IBM Spectrum Virtualize V8.4 also provides these functions:
Large scalable cache.
Copy Services.
IBM FlashCopy (PiT copy) function, including thin-provisioned FlashCopy to make multiple targets affordable.
IBM Transparent Cloud Tiering (TCT) function that enables IBM SAN Volume Controller to interact with cloud service providers (CSPs).
Metro Mirror (MM) (synchronous copy).
Global Mirror (GM) (asynchronous copy).
Data migration.
Space efficiency (thin provisioning, compression, and deduplication).
Easy Tier to automatically migrate data between storage types of different performance that is based on disk workload.
Encryption of externally attached storage.
Supporting IBM HyperSwap.
Supporting VMware vSphere Virtual Volumes (VVOLs) and Microsoft Offloaded Data Transfer (ODX).
Direct attachment of hosts.
Hot spare nodes, with a standby function for one or more nodes.
Containerization connectivity with Container Storage Interface (CSI), which enables supported storage to be used as persistent storage in container environments.
Hybrid Multicloud functionality with IBM Spectrum Virtualize for Public Cloud.
1.3.2 IBM Spectrum Virtualize
IBM Spectrum Virtualize is a software-defined storage solution and a key member of the IBM Spectrum Storage portfolio. Built with IBM Spectrum Virtualize software, IBM SAN Volume Controller simplifies infrastructure and eliminates differences in management, function, and even hybrid multicloud support.
For more information, see the IBM Spectrum Storage portfolio web page.
 
Naming: With the introduction of the IBM Spectrum Storage family, the software that runs on IBM SAN Volume Controller and on IBM FlashSystem products is called IBM Spectrum Virtualize. The name of the underlying hardware platform remains intact.
IBM SAN Volume Controller systems help organizations achieve better data economics by supporting new workloads that are critical to their success. They can handle the massive volumes of data from mobile and social applications, enable rapid and flexible cloud services deployments, and deliver the performance and scalability that is needed to gain insights from the latest analytics technologies.
IBM Spectrum Virtualize is the core software engine of the entire family of IBM FlashSystem products. The contents of this book are intentionally related to the deployment considerations of IBM SAN Volume Controller.
Benefits of IBM Spectrum Virtualize
IBM Spectrum Virtualize delivers leading benefits that improve storage infrastructures in many ways, including:
Cost reduction of storing data by increasing utilization and accelerating applications to speed business insights. To achieve this, the solution:
 – Uses data reduction technologies to increase the amount of data you can store in the same space
 – Enables rapid deployment of cloud storage for DR along with the ability to store copies of local data
 – Moves data to the most suitable type of storage based on policies you define by using IBM Spectrum Control to optimize storage
 – Improves storage performance so you can get more done with your data
Data protection from theft or inappropriate disclosure while enabling a high-availability strategy that includes protection for data and application mobility and DR. To achieve this, the solution:
 – Uses software-based encryption to improve data security
 – Provides fully duplexed copies of data and automatic switch over across data centers to improve data availability
 – Eliminates storage downtime with nondisruptive movement of data from one type of storage to another
A data strategy that is independent of your choice of infrastructure and delivers tightly integrated functionality and consistent management across heterogeneous storage. To achieve this, the solution:
 – Integrates with virtualization tools, such as VMware vCenter to improve agility with automated provisioning of storage and easy deployment of new storage technologies
 – Enables supported storage to be deployed with Kubernetes and Docker container environments, including Red Hat OpenShift
 – Consolidates storage regardless of hardware vendor for simplified management, consistent functionality, and greater efficiency
 – Supports common capabilities across storage types, and provides flexibility in storage acquisition by allowing a mix of vendors in the storage infrastructure
Note: These benefits are not a complete list of features and functions that are available with IBM Spectrum Virtualize software; it is only a subset of its total capabilities.
1.3.3 IBM SAN Volume Controller topology
IBM SAN Volume Controller manages external storage by using one or more pairs of hardware nodes. This configuration is referred to as a clustered system. These nodes are normally attached to the SAN fabric, together with storage systems and host systems. The SAN fabric is zoned to enable the IBM SAN Volume Controller to communicate with the external storage systems and hosts.
Within this software release, IBM SAN Volume Controller also supports Internet Protocol networks. This feature enables the hosts and storage systems to communicate with IBM SAN Volume Controller to build a storage virtualization solution.
Typically, the hosts cannot see or operate on the same physical storage (LUN) from the storage system that is assigned to IBM SAN Volume Controller. If the same LUNs are not shared, storage systems can be shared between the IBM SAN Volume Controller and direct host access.
The zoning capabilities of the SAN switch must be used to create distinct zones to ensure that this rule is enforced. SAN fabrics can include standard FC, FCoE, iSCSI over Ethernet, or possible future types.
Figure 1-3 shows a conceptual diagram of a storage system that uses IBM SAN Volume Controller. It also shows several hosts that are connected to a SAN fabric or local area network (LAN). In practical implementations that have HA requirements (most of the target clients for IBM SAN Volume Controller), the SAN fabric cloud represents a redundant SAN. A redundant SAN consists of a fault-tolerant arrangement of two or more counterpart SANs, which provide alternative paths for each SAN-attached device.
Figure 1-3 IBM SAN Volume Controller conceptual and topology overview
Both scenarios (the use of a single network and the use of two physically separate networks) are supported for iSCSI-based and LAN-based access networks to IBM SAN Volume Controller. Redundant paths to volumes can be provided in both scenarios.
For simplicity, Figure 1-3 shows only one SAN fabric and two zones: host and storage. In a real environment, it is a best practice to use two redundant SAN fabrics. IBM SAN Volume Controller can be connected to up to four fabrics.
A clustered system of IBM SAN Volume Controller nodes that are connected to the same fabric presents logical disks or volumes to the hosts. These volumes are created from managed LUNs or MDisks that are presented by the storage systems.
The following distinct zones are shown in the fabric:
A host zone, in which the hosts can see and address the IBM SAN Volume Controller nodes
A storage zone, in which the IBM SAN Volume Controller nodes can see and address the MDisks or LUNs that are presented by the storage systems
As explained in 1.3.1, “IBM SAN Volume Controller architectural overview” on page 6, hosts are not permitted to operate on the RAID LUNs directly. All data transfer happens through the IBM SAN Volume Controller nodes. This flow is referred to as symmetric virtualization.
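After the host and storage zones are in place, the back-end LUs are discovered and the hosts are registered on the system. The following sketch shows the typical CLI steps; the WWPNs and names are hypothetical, and the syntax can vary by release:

detectmdisk                                                        # discover the back-end LUs that are zoned to the system (storage zone)
mkhost -name esxhost01 -fcwwpn 2100000E1E30E597:2100000E1E30E598   # define a host object by its FC WWPNs (host zone)
mkvdiskhostmap -host esxhost01 vol0                                # map an existing volume to the host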
For iSCSI-based access, the use of two networks and separating iSCSI traffic within the networks by using a dedicated virtual local area network (VLAN) path for storage traffic prevents any IP interface, switch, or target port failure from compromising the iSCSI connectivity across servers and storage controllers.
1.4 IBM SAN Volume Controller models
The following IBM SAN Volume Controller engine models are supported at the IBM Spectrum Virtualize V8.4 code level:
IBM SAN Volume Controller Model SV2
IBM SAN Volume Controller Model SA2
IBM SAN Volume Controller Model SV1
IBM SAN Volume Controller Model DH8
1.4.1 IBM SAN Volume Controller SV2 and SA2
Figure 1-4 shows the front view of the IBM SAN Volume Controller SV2 and SA2.
Figure 1-4 IBM SAN Volume Controller SV2 and SA2 front view
Figure 1-5 shows the rear view of the IBM SAN Volume Controller SV2 / SA2.
Figure 1-5 IBM SAN Volume Controller SV2 and SA2 rear view
Figure 1-6 shows the internal hardware components of an IBM SAN Volume Controller SV2 and SA2 node canister. To the left is the front of the canister where fan modules and battery backup are located, followed by two Cascade Lake CPUs and Dual Inline Memory Module (DIMM) slots, and PCIe risers for adapters on the right.
Figure 1-6 Internal hardware components
Figure 1-7 shows the internal architecture of the IBM SAN Volume Controller SV2 and SA2 models. You can see that the PCIe switch is still present, but has no outbound connections because these models do not support any internal drives. The PCIe switch is used for internal monitoring purposes within the IBM SAN Volume Controller enclosure.
Figure 1-7 IBM SAN Volume Controller SV2 and SA2 internal architecture
 
Note: IBM SAN Volume Controller SV2 and SA2 do not support any type of expansion enclosures.
 
1.4.2 IBM SAN Volume Controller SV1
Figure 1-8 shows the front view of the IBM SAN Volume Controller SV1.
Figure 1-8 IBM SAN Volume Controller SV1 front view
1.4.3 IBM SAN Volume Controller model comparisons
All IBM SAN Volume Controller models are delivered in a 2U 19-inch rack-mounted enclosure. At the time of this writing, three current models of the IBM SAN Volume Controller are available, as listed in Table 1-1.
 
More information: For the most up-to-date information about features, benefits, and specifications of the IBM SAN Volume Controller models, see this web page.
The information in this book is valid at the time of this writing and covers IBM Spectrum Virtualize V8.4. However, as IBM SAN Volume Controller matures, expect to see new features and enhanced specifications.
Table 1-1 IBM SAN Volume Controller base models
Feature | 2145/2147-SV1 (1) | 2145/2147-SV2 | 2145/2147-SA2
Processor | Two Intel Xeon E5 v4 Series, 8 cores, 3.2 GHz | Two Intel Cascade Lake 5218 Series (Gold), 16 cores, 2.30 GHz | Two Intel Cascade Lake 4208 Series (Silver), 8 cores, 2.10 GHz
Base cache memory | 64 GB | 128 GB | 128 GB
I/O ports and management | Three 10 Gb Ethernet ports for 10 Gb iSCSI connectivity and system management | Four 10 Gb Ethernet ports for 10 Gb iSCSI connectivity and system management | Four 10 Gb Ethernet ports for 10 Gb iSCSI connectivity and system management
Technician port | Single 1 Gb Ethernet | Single 1 Gb Ethernet | Single 1 Gb Ethernet
Maximum host interface adapter slots | 4 | 3 | 3
USB ports | 4 | 2 | 2
SAS chains | 2 | N/A | N/A
Maximum number of dense drawers per SAS chain | 4 | N/A | N/A
Integrated battery units | 2 | 1 | 1
Power supplies and cooling units | 2 | 2 | 2
1 Model 2147 is identical to 2145, but with an included enterprise support option from IBM.
The following optional features are available for IBM SAN Volume Controller SV1:
A 256 GB cache upgrade fully unlocked with code Version 8.2 or later
A 4-port 16 Gb FC adapter for 16 Gb FC connectivity
A 4-port 10 Gb Ethernet adapter for 10 Gb iSCSI/FCoE connectivity
Compression accelerator card for RtC
A 4-port 12 Gb SAS expansion enclosure attachment card
 
Important: IBM SAN Volume Controller model 2145/2147-SV1 can contain a 16 Gb FC or a 10 Gb Ethernet adapter, but only one 10 Gbps Ethernet adapter is supported.
The following optional features are available for IBM SAN Volume Controller SV2 and SA2:
A 768 GB cache upgrade
A 4-port 16 Gb FC/FC over NVMe adapter for 16 Gb FC connectivity
A 4-port 32 Gb FC/FC over NVMe adapter for 32 Gb FC connectivity
A 2-port 25 Gb iSCSI/iSER/RDMA over Converged Ethernet (RoCE)
A 2-port 25 Gb iSCSI/iSER/internet Wide-area RDMA Protocol (iWARP)
 
Note: The 25 Gb adapters are NVMe capable; however, to support NVMe, a software dependency exists (at the time of this writing); therefore, NVMe/NVMeoF is not supported on these cards.
The SV2 and SA2 systems have dual CPU sockets and three adapter slots along with four 10-GbE RJ45 ports on board.
 
Note: IBM SAN Volume Controller models SA2 and SV2 do not support FCoE.
The comparison of current and previous models of IBM SAN Volume Controller is shown in Table 1-2. Expansion enclosures are not included in the list.
Table 1-2 Historical overview of IBM SAN Volume Controller models
Model | Cache (GB) | FC (Gbps) | iSCSI (Gbps) | Hardware base | Announced
2145-DH8 | 32 - 64 | 8 and 16 | 1, optional 10 | x3550 M4 | 06 May 2014
2145-SV1 | 64 - 256 | 16 | 10 | Xeon E5 v4 | 23 August 2016
2147-SV1 | 64 - 256 | 16 | 10 | Xeon E5 v4 | 23 August 2016
2145-SV2 | 128 - 768 | 16 and 32 | 25, 50, and 100 | Intel Xeon Cascade Lake | 06 March 2020
2147-SV2 | 128 - 768 | 16 and 32 | 25, 50, and 100 | Intel Xeon Cascade Lake | 06 March 2020
2145-SA2 | 128 - 768 | 16 and 32 | 25, 50, and 100 | Intel Xeon Cascade Lake | 06 March 2020
2147-SA2 | 128 - 768 | 16 and 32 | 25, 50, and 100 | Intel Xeon Cascade Lake | 06 March 2020
The IBM SAN Volume Controller SV1 expansion enclosure consists of an enclosure and drives. Each enclosure contains two canisters that can be replaced and maintained independently. IBM SAN Volume Controller SV1 supports three models of expansion enclosures: 12F, 24F, and the 92F dense drawer.
 
Note: IBM SAN Volume Controller SV2 and SA2 do not support any type of SAS expansion enclosures.
Expansion Enclosure model 12F features two expansion canisters and holds up to 12 3.5-inch SAS drives in a 2U 19-inch rack mount enclosure.
Expansion Enclosure model 24F supports up to 24 internal flash drives, 2.5-inch SAS drives, or a combination of such drives. Expansion Enclosure 24F also features two expansion canisters in a 2U 19-inch rack-mounted enclosure.
Expansion Enclosure model 92F supports up to 92 3.5-inch drives in a 5U 19-inch rack-mounted enclosure. This model is known as dense expansion drawers, or dense drawers.
1.5 IBM SAN Volume Controller components
IBM SAN Volume Controller provides block-level aggregation and volume management for attached disk storage. In simpler terms, IBM SAN Volume Controller manages several back-end storage controllers or locally attached disks.
IBM SAN Volume Controller maps the physical storage within those controllers or storage systems into logical disk images, or volumes, that can be seen by application servers and workstations in the SAN. It logically sits between hosts and storage systems. It presents itself to hosts as the storage provider (target) and to storage systems as one large host (initiator).
The SAN is zoned such that the application servers cannot “see” the back-end storage or controller. This configuration prevents any possible conflict between IBM SAN Volume Controller and the application servers that are trying to manage the back-end storage.
The IBM SAN Volume Controller is based on the components that are described next.
1.5.1 Nodes
Each IBM SAN Volume Controller hardware unit is called a node. Each node is an individual server in an IBM SAN Volume Controller clustered system on which the IBM Spectrum Virtualize software runs. The node provides the virtualization for a set of volumes, cache, and copy services functions. The IBM SAN Volume Controller nodes are deployed in pairs (I/O groups), and one or more pairs constitute a clustered system, or system. A system consists of a minimum of one pair and a maximum of four pairs of nodes.
One of the nodes within the system is known as the configuration node. The configuration node manages the configuration activity for the system. If this node fails, the system chooses a new node to become the configuration node.
Because the active nodes are installed in pairs, each node provides a failover function to its partner node if a node fails.
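You can check which node currently holds the configuration node role by listing the nodes of the system, as in the following sketch (the config_node field in the output identifies it):

lsnode -delim ,                           # list the nodes; the config_node field shows the configuration node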
1.5.2 I/O groups
Each pair of IBM SAN Volume Controller nodes is also referred to as an I/O group. An IBM SAN Volume Controller clustered system can have 1 - 4 I/O groups.
A specific volume is always presented to a host server by a single I/O group of the system. The I/O group can be changed.
When a host server performs I/O to one of its volumes, all the I/Os for a specific volume are directed to one specific I/O group in the system. Under normal conditions, the I/Os for that specific volume are always processed by the same node within the I/O group. This node is referred to as the preferred node for this specific volume.
Both nodes of an I/O group act as the preferred node for their own specific subset of the total number of volumes that the I/O group presents to the host servers. However, both nodes also act as failover nodes for their respective partner node within the I/O group. Therefore, a node takes over the I/O workload from its partner node when required.
In an IBM SAN Volume Controller-based environment, the I/O handling for a volume can switch between the two nodes of the I/O group. Therefore, it is a best practice to connect servers to two different fabrics through different FC host bus adapters (HBAs) and to use multipath drivers for redundancy.
The IBM SAN Volume Controller I/O groups are connected to the SAN so that all application servers that access volumes from an I/O group can reach that group. Up to 512 host server objects can be defined per I/O group. The host server objects can access volumes that are provided by this specific I/O group.
If required, host servers can be mapped to more than one I/O group within the IBM SAN Volume Controller system. Therefore, they can access volumes from separate I/O groups. You can move volumes between I/O groups to redistribute the load between the I/O groups. Modifying the I/O group that services the volume can be done concurrently with I/O operations if the host supports nondisruptive volume moves.
Moving a volume also requires a rescan at the host level to ensure that the multipathing driver is notified that the allocation of the preferred node changed and that the ports by which the volume is accessed changed. This modification can be done when one pair of nodes becomes overused.
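A nondisruptive move of a volume to another I/O group, or a change of its preferred node, can be performed with the movevdisk command, as in the following hedged sketch (names are hypothetical; verify host support for nondisruptive volume moves first):

movevdisk -iogrp io_grp1 vol0             # move vol0 to I/O group io_grp1
movevdisk -node node2 vol0                # alternatively, change only the preferred node within the current I/O group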
1.5.3 System
The system or clustered system consists of 1 - 4 I/O groups. Specific configuration limitations are then set for the entire system. For example, the maximum number of volumes that is supported per system is 10,000, and the maximum managed capacity that is supported is approximately 28 PiB (32 PB) per system.
All configuration, monitoring, and service tasks are performed at the system level. Configuration settings are replicated to all nodes in the system. To facilitate these tasks, a management IP address is set for the system.
A process is provided to back up the system configuration data on to storage so that it can be restored if there is a disaster. This method does not back up application data. Only the
IBM SAN Volume Controller system configuration information is backed up.
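The configuration backup can be triggered from the CLI, as shown in the following sketch. The backup file name and location are typical but can differ by release, so treat them as assumptions:

svcconfig backup                                                           # back up the system configuration metadata (not application data)
scp superuser@cluster_ip:/dumps/svc.config.backup.xml_* /local/backup/     # copy the resulting backup file off the system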
For remote data mirroring, two or more systems must form a partnership before relationships between mirrored volumes are created.
For more information about the maximum configurations that apply to the system, I/O group, and nodes, search for “Configuration Limits and Restrictions for IBM System Storage IBM SAN Volume Controller” at IBM Support home page.
1.5.4 Expansion enclosures
Expansion enclosures are rack-mounted hardware that contains several components of the system: canisters, drives, and power supplies. Enclosures can be used to extend the capacity of the system. They are supported only on IBM SAN Volume Controller 2145 or 2147-SV1 nodes. For other models of the system, you must use external storage systems to provide capacity for data. The term enclosure is also used to describe the hardware and other parts that are plugged into the enclosure.
Expansion enclosures can be added to the IBM SAN Volume Controller 2145 or 2147-SV1 controller nodes to expand the available capacity of the system. (Other controller models do not support expansion enclosures.) Each system can have a maximum of four I/O groups, with two chains of expansion enclosures that are attached to each I/O group.
On each SAS chain, the systems can support up to a SAS chain weight of 10. Each 2145 / 2147-92F Expansion Enclosure adds a value of 2.5 to the SAS chain weight. Each 2145 / 2147-12F or 2145 / 2147-24F Expansion Enclosure adds a value of 1 to the SAS chain weight. For example, each of the following expansion enclosure configurations has a total SAS chain weight of 10 (the arithmetic is shown after the list):
Four 2145 or 2147-92F enclosures per SAS chain
Ten 2145 or 2147-12F enclosures per SAS chain
Two 2145 or 2147-92F enclosures and five 2145 or 2147-24F enclosures per SAS chain
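The chain weight arithmetic behind those example configurations is additive, as the following check shows:

# SAS chain weight must not exceed 10 per chain (92F = 2.5; 12F or 24F = 1.0)
#   4 x 92F             -> 4 x 2.5            = 10
#   10 x 12F            -> 10 x 1.0           = 10
#   2 x 92F + 5 x 24F   -> 2 x 2.5 + 5 x 1.0  = 10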
An expansion enclosure houses the following additional hardware: power supply units (PSUs), canisters, and drives. Enclosure objects report the connectivity of the enclosure. Each expansion enclosure is assigned a unique ID. It is possible to change the enclosure ID later.
Figure 1-9 on page 19 shows the front view of the 2145 or 2147 model 12F Expansion Enclosure.
Figure 1-9 IBM SAN Volume Controller Expansion Enclosure front view 2145 / 2147 model 12F
The drives are positioned in four columns of three horizontally mounted drive assemblies. The drive slots are numbered 1 - 12, starting at the upper left and moving left to right, top to bottom.
Figure 1-10 shows the front view of the 2145 or 2147 model 24F Expansion Enclosures.
Figure 1-10 IBM SAN Volume Controller Expansion Enclosure front view 2145 / 2147 model 24F
The drives are positioned in one row of 24 vertically mounted drive assemblies. The drive slots are numbered 1 - 24, starting from the left. A vertical center drive bay molding is between slots 8 and 9 and another between slots 16 and 17.
Note: Any references to internal drives or internal drive functions in this book do not apply to IBM SAN Volume Controller Models SV2 and SA2 because these models do not support internal drives or any type of expansion enclosure. These functions might apply to externally virtualized systems, and therefore be available.
Dense expansion drawers
Dense expansion drawers, also known as dense drawers, are optional disk expansion enclosures that are 5U rack-mounted. Each chassis features two expansion canisters, two power supplies, two expander modules, and a total of four fan modules.
Each dense drawer can hold up to 92 drives that are positioned in four rows of 14 and another three rows of 12 mounted drive assemblies. The two Secondary Expander Modules (SEMs) are centrally located in the chassis. One SEM addresses 54 drive ports, and the other addresses 38 drive ports. Dense drawers can support many different drive sizes and types.
Each canister in the dense drawer chassis features two SAS ports that are numbered 1 and 2. The use of SAS port 1 is mandatory because the expansion enclosure must be attached to an IBM SAN Volume Controller node or another expansion enclosure. SAS connector 2 is optional because it is used to attach to more expansion enclosures.
 
Note: IBM SAN Volume Controller SV2 and SA2 do not support any type of expansion enclosures.
Figure 1-11 shows a dense expansion drawer.
Figure 1-11 Dense expansion drawer
1.5.5 Flash drives
Flash drives, specifically single-level cell (SLC) or multi-level cell (MLC) NAND flash-based drives, can be used to overcome a growing problem that is known as the memory bottleneck or storage bottleneck.
Storage bottleneck problem
The memory or storage bottleneck describes the steadily growing gap between the time that is required for a CPU to access data that is in its cache memory (typically in nanoseconds) and data that is on external storage (typically in milliseconds).
Although CPUs and cache and memory devices continually improve their performance, mechanical disks that are used as external storage generally do not improve their performance.
Figure 1-12 shows these access time differences.
Figure 1-12 The memory or storage bottleneck
The actual times that are shown are not that important, but a noticeable difference exists between accessing data that is in cache and data that is on an external disk.
In the example that is shown in Figure 1-12 on page 20, we added a scale that gives you an idea of how long it takes to access the data in a scenario where a single CPU cycle takes
1 second. This scale shows the importance of future storage technologies closing or reducing the gap between access times for data that is stored in cache or memory versus access times for data that is stored on an external medium.
Since magnetic disks were first introduced by IBM in 1956 (the IBM 305 RAMAC, for Random Access Method of Accounting and Control), they have shown remarkable progress regarding capacity growth, form factor and size reduction, price savings (cost per GB), and reliability.
However, the number of I/Os that a disk can handle and the response time that it takes to process a single I/O did not improve at the same rate, although they certainly did improve. In actual environments, you can expect from today’s enterprise-class FC or SAS disks up to 200 input/output operations per second (IOPS) per disk, with an average response time (latency) of approximately 6 ms per I/O.
Table 1-3 shows a comparison of drive types and IOPS.
Table 1-3 Comparison of drive types to IOPS
Drive type | IOPS
NL-SAS | 100
SAS 10,000 revolutions per minute (RPM) | 150
SAS 15,000 RPM | 250
Flash | > 500,000 read; 300,000 write
Today’s spinning disks continue to advance in capacity, up to several terabytes, form factor/footprint (8.89 cm (3.5 inches), 6.35 cm (2.5 inches), and 4.57 cm (1.8 inches)), and price (cost per gigabyte), but they are not getting much faster.
The limiting factor is the rotational speed of the disk (approximately 15,000 RPM), which defines the time that is required to access a specific data block on a rotating device. Small improvements likely will occur in the future. However, a significant step, such as doubling the rotational speed (if technically even possible), inevitably has an associated increase in power usage and price that is an inhibitor.
Flash drive solution
Flash drives can provide a solution for this dilemma, and no rotating parts means improved robustness and lower power usage. A remarkable improvement in I/O performance and a massive reduction in the average I/O response times (latency) are the compelling reasons to use flash drives in today’s storage subsystems.
Enterprise-class flash drives typically deliver 500,000 read and 300,000 write IOPS with typical latencies of 50 µs for reads and 800 µs for writes. Their form factors of 4.57 cm
(1.8 inches) / 6.35 cm (2.5 inches) / 8.89 cm (3.5 inches) and their interfaces (FC / SAS / Serial Advanced Technology Attachment (SATA)) make them easy to integrate into existing disk shelves. The IOPS metrics improve when flash drives are consolidated in storage arrays (flash arrays). In this case, the read and write IOPS are seen in millions for specific 4 KB data blocks.
Flash-drive market
The flash drive storage market is rapidly evolving. The key differentiator among today’s flash drive products is not the storage medium but the logic in the disks’ internal controllers. The top priorities in today’s controller development are optimally handling what is referred to as wear leveling, which defines the controller’s capability to ensure a device’s durability, and closing the remarkable gap between read and write I/O performance. Today’s flash drive technology is only the first step into the world of high-performance persistent semiconductor storage.
IBM FlashCore Module and NVMe drives
Figure 1-13 shows an IBM FlashCore® Module (FCM) (NVMe) with a capacity of 19.2 TB that is built by using 64-layer TLC flash memory and an Everspin MRAM cache into a U.2 form factor.
Figure 1-13 FlashCore Module (NVMe)
FCM modules (NVMe) are designed for high parallelism and optimized for 3D TLC and updated FPGAs. IBM also enhanced the FCM modules by adding a read cache to reduce latency on highly compressed pages, and four-plane programming to lower the overall power during writes. FCM modules offer hardware-assisted compression up to 3:1, and are FIPS 140-2 compliant.
FCM modules carry the IBM patented Variable Stripe RAID (VSR) at the FCM level, and use distributed RAID (DRAID) to protect data at the system level. VSR and DRAID together optimize RAID rebuilds by offloading rebuilds to DRAID and offer protection against FCM failures.
FCMs are available in the following capacity sizes: 4.8 TB, 9.6 TB, 19.2 TB, or 38.4 TB.
 
Note: At the time of this writing, FCM drives are available on the IBM FlashSystem family only. IBM SAN Volume Controller SV2 and SA2 do not support any internal drive types.
Storage-class memory
Storage-class memory (SCM) promises a massive improvement in performance (IOPS), areal density, cost, and energy efficiency compared to today’s flash-drive technology. IBM Research® is actively engaged in these new technologies.
For more information about nanoscale devices, see this web page.
 
For a comprehensive overview of the flash drive technology in a subset of the well-known Storage Networking Industry Association (SNIA) Technical Tutorials, see this web page.
When these technologies become a reality, they will fundamentally change the architecture of today’s storage infrastructures. Figure 1-14 shows the different types of storage technologies versus the latency.
Figure 1-14 Storage technologies versus latency for Intel SCM drives
1.5.6 MDisks
The SAN Volume Controller system and its I/O groups view the storage that is presented to them by the back-end storage system as several disks or LUNs, which are known as MDisks. Because IBM SAN Volume Controller does not attempt to provide recovery from physical disk failures within the back-end storage system, an MDisk often is provisioned from a RAID array.
These MDisks are placed into storage pools where they are divided into several extents. The application servers do not “see” the MDisks at all. Rather, they see logical disks, which are known as VDisks or volumes. These disks are presented by the IBM SAN Volume Controller I/O groups through the SAN or LAN to the servers.
For information about the system limits and restrictions, search for “Configuration Limits and Restrictions for IBM System Storage IBM SAN Volume Controller” at this web page.
When an MDisk is presented to the IBM SAN Volume Controller, it can be in one of the following modes:
Unmanaged MDisk
An MDisk is reported as unmanaged when it is not a member of any storage pool. An unmanaged MDisk is not associated with any volumes and has no metadata that is stored on it. IBM SAN Volume Controller does not write to an MDisk that is in unmanaged mode except when it attempts to change the mode of the MDisk to one of the other modes. IBM SAN Volume Controller can see the resource, but the resource is not assigned to a storage pool.
Managed MDisk
Managed mode MDisks are always members of a storage pool, and they contribute extents to the storage pool. Volumes (if not operated in image mode) are created from these extents. MDisks that are operating in managed mode might have metadata extents that are allocated from them and can be used as quorum disks. This mode is the most common and normal mode for an MDisk.
Image mode MDisk
Image mode provides a direct block-for-block conversion from the MDisk to the volume by using virtualization. This mode is provided to satisfy the following major usage scenarios (a CLI sketch follows the list):
 – Image mode enables the virtualization of MDisks that contain data that was written directly and not through IBM SAN Volume Controller. Rather, it was created by a direct-connected host.
This mode enables a client to insert IBM SAN Volume Controller into the data path of a storage volume or LUN with minimal downtime. For more information about the data migration process, see Chapter 8, “Storage migration” on page 465.
 – Image mode enables a volume that is managed by an IBM SAN Volume Controller to be used with the native copy services function that is provided by the underlying RAID controller. To avoid the loss of data integrity when the IBM SAN Volume Controller is used in this way, it is important that you disable the IBM SAN Volume Controller cache for the volume.
 – The IBM SAN Volume Controller can migrate to image mode, which enables the IBM SAN Volume Controller to export volumes and access them directly from a host without the IBM SAN Volume Controller in the path.
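A minimal sketch of creating an image mode volume from an unmanaged MDisk follows. The MDisk, pool, and volume names are hypothetical, and the MDisk must not yet belong to any pool:

mkvdisk -mdiskgrp Pool_image -iogrp 0 -vtype image -mdisk mdisk7 -name legacy_vol0   # block-for-block image mode volume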
Each MDisk that is presented from an external disk controller has an online path count, which is the number of nodes that have access to that MDisk. The maximum count is the maximum number of paths that was detected by the system at any point. The current count is what the system sees now. A current value that is less than the maximum can indicate that SAN fabric paths were lost.
Tier
It is likely that the MDisks (LUNs) that are presented to the IBM SAN Volume Controller system have different characteristics because of the disk or technology type on which they are placed. The following tier options are available:
tier0_flash
tier1_flash
tier_enterprise
tier_nearline
tier_scm
The default value for a newly discovered unmanaged MDisk is enterprise. You can change this value by running the chmdisk command.
The tier of external managed disks is not detected automatically and is set to enterprise. If the external managed disk is made up of flash drives or nearline Serial Attached SCSI (SAS) drives and you want to use Easy Tier, you must specify the tier when adding the managed disk to the storage pool or run the chmdisk command to modify the tier attribute.
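For example, the tier of an external MDisk can be adjusted with the chmdisk command, as in the following sketch (the MDisk name is hypothetical):

chmdisk -tier tier1_flash mdisk5          # mark an external MDisk as flash so that Easy Tier places data accordingly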
1.5.7 Cache
The primary benefit of storage cache is to improve I/O response time. Reads and writes to a magnetic disk drive experience seek time and latency time at the drive level, which can result in 1 ms - 10 ms of response time (for an enterprise-class disk).
The IBM SAN Volume Controller Model SV1 features 64 GB of memory with options for 256 GB of memory in a 2U 19-inch rack mount enclosure.
The IBM SAN Volume Controller Models SA2 and SV2 feature 128 GB of memory with options for 768 GB of memory in a 2U 19-inch rack mount enclosure.
The IBM SAN Volume Controller provides a flexible cache model. The node’s memory can be used as read or write cache.
Cache is allocated in 4 kibibyte (KiB) segments. A segment holds part of one track. A track is the unit of locking and destaging granularity in the cache. The cache virtual track size is
32 KiB (eight segments). A track might be only partially populated with valid pages. The
IBM SAN Volume Controller combines writes up to a 256 KiB track size if the writes are in the same tracks before destaging. For example, if 4 KiB is written into a track and another 4 KiB is written to another location in the same track, the two writes can be destaged together.
Therefore, the blocks that are written from the IBM SAN Volume Controller to the disk subsystem can be any size of 512 bytes - 256 KiB. The large cache and advanced cache management algorithms enable it to improve the performance of many types of underlying disk technologies.
The IBM SAN Volume Controller capability to manage the destaging operations that are incurred by writes in the background, while still maintaining full data integrity, helps it achieve good performance.
The cache is separated into two layers: Upper cache and lower cache. Figure 1-15 shows the separation of the upper and lower cache.
Figure 1-15 Separation of upper and lower cache
The upper cache delivers the following functions, which enable the IBM SAN Volume Controller to streamline data write performance:
Provides fast write response times to the host by being as high up in the I/O stack as possible.
Provides partitioning.
The lower cache delivers the following additional functions:
Ensures that the write cache between two nodes is in sync.
Partitions the cache to ensure that a slow back end cannot use the entire cache.
Uses a destaging algorithm that adapts to the amount of data and the back-end performance.
Provides read caching and prefetching.
Combined, the two levels of cache also deliver the following functions:
Pins data when the LUN goes offline.
Provides enhanced statistics for IBM Spectrum Control and IBM Storage Insights.
Provides trace for debugging.
Reports medium errors.
Resynchronizes cache correctly and provides the atomic write function.
Ensures that other partitions continue operation when one partition becomes 100% full of pinned data.
Supports fast-write (two-way and one-way), flush-through, and write-through.
Integrates with T3 recovery procedures.
Supports two-way operation.
Supports none, read-only, and read/write as user-exposed caching policies.
Supports flush-when-idle.
Supports expanding cache as more memory becomes available to the platform.
Supports credit throttling to avoid I/O skew and offer fairness/balanced I/O between the two nodes of the I/O group.
Enables switching of the preferred node without needing to move volumes between I/O groups.
Depending on the size, age, and technology level of the disk storage system, the total available cache in the IBM SAN Volume Controller nodes can be larger, smaller, or about the same as the cache that is associated with the disk storage.
Because hits to the cache can occur in the IBM SAN Volume Controller or the back-end storage system level of the overall system, the system as a whole can take advantage of the larger amount of cache wherever the cache is available.
In addition, regardless of their relative capacities, both levels of cache tend to play an important role in enabling sequentially organized data to flow smoothly through the system. The IBM SAN Volume Controller cannot increase the throughput potential of the underlying disks in all cases because this increase depends on the underlying storage technology and the degree to which the workload exhibits hotspots or sensitivity to cache size or cache algorithms.
However, the write cache is still limited to a maximum of 12 GB, and the compression cache to a maximum of 34 GB. The remaining installed cache is used as read cache (including allocation for features like FlashCopy, GM, or MM). Data Reduction Pools share memory with the main I/O process.
1.5.8 Quorum disk
A quorum disk is an MDisk or a managed drive that contains a reserved area that is used exclusively for system management. A system automatically assigns quorum disk candidates. Quorum disks are used when there is a problem in the SAN fabric or when nodes are shut down, which leaves half of the nodes remaining in the system. This type of problem causes a loss of communication between the nodes that remain in the system and those that do not remain.
The nodes are split into groups where the remaining nodes in each group can communicate with each other, but not with the other group of nodes that were formerly part of the system. In this situation, some nodes must stop operating and processing I/O requests from hosts to preserve data integrity while maintaining data access. If a group contains less than half the nodes that were active in the system, the nodes in that group stop operating and processing I/O requests from hosts.
It is possible for a system to split into two groups with each group containing half the original number of nodes in the system. A quorum disk determines which group of nodes stops operating and processing I/O requests. In this tiebreaker situation, the first group of nodes that accesses the quorum disk is marked as the owner of the quorum disk. As a result, the owner continues to operate as the system and handles all I/O requests.
If the other group of nodes cannot access the quorum disk or discover that the quorum disk is owned by another group of nodes, it stops operating as the system and does not handle I/O requests. A system can have only one active quorum disk that is used for a tiebreaker situation. However, the system uses three quorum disks to record a backup of the system configuration data that is used if there is a disaster. The system automatically selects one active quorum disk from these three disks.
The other quorum disk candidates provide redundancy if the active quorum disk fails before a system is partitioned. To avoid the possibility of losing all of the quorum disk candidates with a single failure, assign quorum disk candidates on multiple storage systems.
 
Quorum disk requirements: To be considered eligible as a quorum disk, a LUN must meet the following criteria:
It is presented by a storage system that supports IBM SAN Volume Controller quorum disks.
It is manually enabled as a quorum disk candidate by running the chcontroller -allowquorum yes command.
It is in managed mode (not image mode).
It includes sufficient free extents to hold the system state information and the stored configuration metadata.
It is visible to all of the nodes in the system.
If possible, the IBM SAN Volume Controller places the quorum candidates on separate storage systems. However, after the quorum disk is selected, no attempt is made to ensure that the other quorum candidates are presented through separate storage systems.
Verifying quorum disk placement and adjusting it so that candidates are on separate storage systems (when possible) reduces the dependency on a single storage system and can increase quorum disk availability.
You can list the quorum disk candidates and the active quorum disk in a system by running the lsquorum command.
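For example, the following CLI sketch lists the quorum disk candidates and manually assigns an MDisk as a quorum candidate. The MDisk ID and quorum index are placeholders, and the exact output columns vary by code level:
lsquorum                 # list the quorum disk candidates and the active quorum disk
chquorum -mdisk 5 2      # assign MDisk 5 as the quorum disk candidate with quorum index 2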
When the set of quorum disk candidates is chosen, it is fixed. However, a new quorum disk candidate can be chosen under one of the following conditions:
When the administrator requests that a specific MDisk becomes a quorum disk by running the chquorum command.
When an MDisk that is a quorum disk is deleted from a storage pool.
When an MDisk that is a quorum disk changes to image mode.
An offline MDisk is not replaced as a quorum disk candidate.
For DR purposes, a system must be regarded as a single entity so that the system and the quorum disk can be collocated.
Special considerations are required for the placement of the active quorum disk for a stretched, split cluster or split I/O group configurations. For more information, see IBM Documentation.
 
Important: Running an IBM SAN Volume Controller system without a quorum disk can seriously affect your operation. A lack of available quorum disks for storing metadata prevents any migration operation.
Mirrored volumes can be taken offline if no quorum disk is available. This behavior occurs because the synchronization status for mirrored volumes is recorded on the quorum disk.
During the normal operation of the system, the nodes communicate with each other. If a node is idle for a few seconds, a heartbeat signal is sent to ensure connectivity with the system. If a node fails for any reason, the workload that is intended for the node is taken over by another node until the failed node is restarted and readmitted into the system (which happens automatically).
If the Licensed Internal Code on a node becomes corrupted, which results in a failure, the workload is transferred to another node. The code on the failed node is repaired, and the node is readmitted into the system (which is an automatic process).
IP quorum configuration
In a stretched configuration or HyperSwap configuration, you must use a third, independent site to house quorum devices. To use a quorum disk as the quorum device, this third site must use FC or IP connectivity together with an external storage system. In a local environment, no extra hardware or networking, such as FC or SAS-attached storage, is required beyond what is normally provisioned within a system.
To use an IP-based quorum application as the quorum device for the third site, no FC connectivity is used. Java applications are run on hosts at the third site. However, there are strict requirements on the IP network, and some disadvantages with using IP quorum applications.
Unlike quorum disks, all IP quorum applications must be reconfigured and redeployed to hosts when certain aspects of the system configuration change. These aspects include adding or removing a node from the system, or when node service IP addresses are changed.
For stable quorum resolutions, an IP network must provide the following requirements:
Connectivity from the hosts to the service IP addresses of all nodes. If IP quorum is configured incorrectly, the network must also deal with the possible security implications of exposing the service IP addresses because this connectivity can also be used to access the service GUI.
Port 1260 is used by IP quorum applications to communicate from the hosts to all nodes.
The maximum round-trip delay must not exceed 80 ms, which means 40 ms each direction.
A minimum bandwidth of 2 MBps for node-to-quorum traffic.
Even with IP quorum applications at the third site, quorum disks at site one and site two are required because they are used to store metadata. To provide quorum resolution, run the mkquorumapp command or use the GUI in Settings → Systems → IP Quorum to generate a Java application that is then copied to and run on a host at a third site. The maximum number of applications that can be deployed is five.
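As a minimal sketch of the deployment flow (the host path and Java invocation are illustrative; follow the instructions that are generated for your code level), the sequence typically looks like this:
mkquorumapp                 # generate the IP quorum Java application (ip_quorum.jar) on the system
# copy ip_quorum.jar to a host at the third site, then start it there:
java -jar ip_quorum.jar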
1.5.9 Storage pool
A storage pool is a collection of MDisks that provides the pool of storage from which volumes are provisioned. A single system can manage up to 1024 storage pools. The size of these pools can be changed (expanded or shrunk) at run time by adding or removing MDisks without taking the storage pool or the volumes offline.
At any point, an MDisk can be a member in one storage pool only, except for image mode volumes.
Figure 1-16 shows the relationships of the IBM SAN Volume Controller entities to each other.
Figure 1-16 Overview of an IBM SAN Volume Controller clustered system with an I/O group
The capacity of each MDisk in the storage pool is divided into extents. The extent size is selected by the administrator when the storage pool is created and cannot be changed later. The extent size can be 16 MiB - 8192 MiB.
It is a best practice to use the same extent size for all storage pools in a system. This approach is a prerequisite for supporting volume migration between two storage pools. If the storage pool extent sizes are not the same, you must use volume mirroring to copy volumes between pools.
The IBM SAN Volume Controller limits the number of extents in a system to 2^22 (approximately 4 million). Because the number of addressable extents is limited, the total capacity of an IBM SAN Volume Controller system depends on the extent size that is chosen by the IBM SAN Volume Controller administrator.
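For example, with 1 GiB extents the roughly 4 million addressable extents correspond to about 4 PiB of managed capacity, and with 8192 MiB extents to about 32 PiB. The following CLI sketch creates a pool with a 1024 MiB extent size; the pool and MDisk names are placeholders:
mkmdiskgrp -name Pool0 -ext 1024 -mdisk mdisk0:mdisk1:mdisk2   # create a storage pool with 1024 MiB extents
lsmdiskgrp Pool0                                               # verify the extent size and pool capacity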
1.5.10 Volumes
Volumes are logical disks that are presented to the host or application servers by the IBM SAN Volume Controller.
The following types of volumes are available in terms of extents management:
Striped
A striped volume is allocated one extent in turn from each MDisk in the storage pool. This process continues until the space that is required for the volume is satisfied.
It is also possible to supply a list of MDisks to use.
Figure 1-17 shows how a striped volume is allocated, assuming that 10 extents are required.
Figure 1-17 Striped volume
Sequential
A sequential volume is where the extents are allocated sequentially from one MDisk to the next MDisk (see Figure 1-18).
Figure 1-18 Sequential volume
Image mode
Image mode volumes (see Figure 1-19 on page 32) are special volumes that have a direct relationship with one MDisk. The most common use case of image volumes is a data migration from your old (typically non-virtualized) storage to the IBM SAN Volume Controller based virtualized infrastructure.
Figure 1-19 Image mode volume
When the image mode volume is created, a direct mapping is made between extents that are on the MDisk and the extents that are on the volume. The LBA x on the MDisk is the same as the LBA x on the volume, which ensures that the data on the MDisk is preserved as it is brought into the clustered system.
Some virtualization functions are not available for image mode volumes, so it is often useful to migrate the volume into a new storage pool. After the migration completes, the MDisk becomes a managed MDisk.
If you add an MDisk that contains existing data to a storage pool as a managed MDisk, that data is lost. Therefore, before adding such MDisks to a storage pool, always import them by creating image mode volumes from them.
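The following mkvdisk commands sketch how each volume type is created. The pool, MDisk, and volume names are placeholders; for the image mode volume, the MDisk must be unmanaged and the volume capacity is taken from it:
mkvdisk -mdiskgrp Pool0 -iogrp 0 -size 100 -unit gb -vtype striped -name vol_striped
mkvdisk -mdiskgrp Pool0 -iogrp 0 -size 100 -unit gb -vtype seq -mdisk mdisk4 -name vol_sequential
mkvdisk -mdiskgrp Pool_Migration -iogrp 0 -vtype image -mdisk mdisk7 -name vol_image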
1.5.11 Easy Tier
Easy Tier is a performance function that automatically migrates the extents of a volume from one MDisk storage tier to another MDisk storage tier. Since V7.3, the IBM SAN Volume Controller code supports a three-tier implementation.
Easy Tier monitors the host I/O activity and latency on the extents of all volumes with the Easy Tier function that is turned on in a multitier storage pool over a 24-hour period. Then, it creates an extent migration plan that is based on this activity, and then dynamically moves high-activity or hot extents to a higher disk tier within the storage pool. It also moves extents whose activity dropped off or cooled down from the high-tier MDisks back to a lower-tiered MDisk.
Easy Tier supports storage-class memory (SCM) drives with a new tier that is called tier_scm.
 
Turning on or off Easy Tier: The Easy Tier function can be turned on or off at the storage pool level and the volume level.
The automatic load-balancing function is enabled by default on each volume and cannot be turned off by using the GUI. This load-balancing feature is not considered an Easy Tier function, although it uses the same principles.
The management GUI supports monitoring Easy Tier data movement in graphical reports. The data in these reports helps you understand how Easy Tier manages data between the different tiers of storage, how tiers within pools are used, and the workloads among the different tiers. Charts for data movement, tier composition, and workload skew comparison can be downloaded as comma-separated value (CSV) files.
You can also offload the statistics file from the IBM SAN Volume Controller nodes and use the IBM Storage Tier Advisor Tool (STAT) to create a summary report. STAT can be downloaded for no initial cost from this web page.
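As a sketch of the pool-level and volume-level controls that are described in the note above (object names are placeholders, and the valid settings can vary by code level):
chmdiskgrp -easytier auto Pool0   # let Easy Tier manage data placement for the pool
chvdisk -easytier off vol01       # exclude a single volume from Easy Tier data movement
lsmdiskgrp Pool0                  # the easy_tier and easy_tier_status fields show the effective state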
1.5.12 Hosts
A host is a logical object that represents a list of worldwide port names (WWPNs), NVMe qualified names (NQNs), or iSCSI or iSER names that identify the interfaces that the host system uses to communicate with the IBM SAN Volume Controller. Fibre Channel connections or Fibre Channel over Ethernet use WWPNs to identify host interfaces to the system. iSCSI or iSER names can be iSCSI qualified names (IQNs) or extended unique identifiers (EUIs). NQNs are used to identify hosts that use FC-NVMe connections.
Volumes can be mapped to a host to enable access to a set of volumes.
Node failover can be handled without having a multipath driver that is installed on the iSCSI server. An iSCSI-attached server can reconnect after a node failover to the original target IP address, which is now presented by the partner node. To protect the server against link failures in the network or HBA failures, a multipath driver must be used.
N_Port ID Virtualization (NPIV) is a method for virtualizing a physical Fibre Channel port that is used for host I/O. When NPIV is enabled, ports do not come up until they are ready to service I/O, which improves host behavior around node unpends. In addition, path failures that occur because of an offline node are masked from host multipathing.
Host cluster
A host cluster is a group of logical host objects that can be managed together. For example, you can create a volume mapping that is shared by every host in the host cluster. Host objects that represent hosts can be grouped in a host cluster and share access to volumes. New volumes can also be mapped to a host cluster, which simultaneously maps that volume to all hosts that are defined in the host cluster.
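The following CLI sketch defines a host, maps a volume to it, and then groups hosts into a host cluster with a shared mapping. The WWPNs and object names are placeholders, and the host cluster commands require a code level that supports host clusters:
mkhost -name esx01 -fcwwpn 2100000E1E30ACFC:2100000E1E30ACFD   # define a host by its FC WWPNs
mkvdiskhostmap -host esx01 vol01                               # map a volume to that host
mkhostcluster -name esxcluster01                               # create a host cluster
addhostclustermember -host esx01 esxcluster01                  # add the host to the cluster
mkvolumehostclustermap -hostcluster esxcluster01 vol02         # map a volume to every host in the cluster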
1.5.13 Array
An array is an ordered configuration, or group, of physical devices (drives) that is used to define logical volumes or devices. An array is a type of MDisk that is made up of disk drives; these drives are members of the array. A Redundant Array of Independent Disks (RAID) is a method of configuring member drives to create high availability (HA) and high-performance systems. The system supports nondistributed and distributed array configurations.
In nondistributed arrays, entire drives are defined as “hot-spare” drives. Hot-spare drives are idle and do not process I/O for the system until a drive failure occurs. When a member drive fails, the system automatically replaces the failed drive with a hot-spare drive. The system then resynchronizes the array to restore its redundancy. In contrast, all member drives within a distributed array have a rebuild area that is reserved for drive failures. All the drives in the array can process I/O data, which provides faster rebuild times when a drive fails. The RAID level provides different degrees of redundancy and performance; it also determines the number of members in the array.
1.5.14 Encryption
The IBM SAN Volume Controller provides optional encryption of data at rest, which protects against the potential exposure of sensitive user data and user metadata that is stored on discarded, lost, or stolen storage devices. Encryption of system data and system metadata is not required, so system data and metadata are not encrypted.
Planning for encryption involves purchasing a licensed function and then activating and enabling the function on the system.
To encrypt data that is stored on drives, the nodes capable of encryption must be licensed and configured to use encryption. When encryption is activated and enabled on the system, valid encryption keys must be present on the system when the system unlocks the drives or the user generates a new key.
Hardware encryption was introduced in IBM Spectrum Virtualize V7.4, and a software encryption option was added in Version 7.6. Encryption keys can be managed by an external key management system, such as the IBM Security™ Key Lifecycle Manager (SKLM), or stored on USB flash drives that are attached to a minimum of one of the nodes. Since Version 8.1, IBM Spectrum Virtualize supports a combination of external and USB key repositories.
SKLM is an IBM solution that provides the infrastructure and processes to locally create, distribute, back up, and manage the lifecycle of encryption keys and certificates. Before activating and enabling encryption, you must determine the method of accessing key information during times when the system requires an encryption key to be present.
When SKLM is used as a key manager for the IBM SAN Volume Controller encryption, you can run into a deadlock situation if the key servers are running on encrypted storage that is provided by the IBM SAN Volume Controller. To avoid a deadlock situation, ensure that the IBM SAN Volume Controller can communicate with an encryption server to get the unlock key after a power-on or restart scenario. Up to four SKLM servers are supported.
Data encryption is protected by the Advanced Encryption Standard (AES) algorithm that uses a 256-bit symmetric encryption key in XTS mode, as defined in the Institute of Electrical and Electronics Engineers (IEEE) 1619-2007 standard as XTS-AES-256. That data encryption key is itself protected by a 256-bit AES key wrap when stored in non-volatile form.
1.5.15 iSCSI and iSCSI Extensions over RDMA
iSCSI is an alternative means of attaching hosts and external storage controllers to the IBM SAN Volume Controller.
The iSCSI function is a software function that is provided by the IBM Spectrum Virtualize software, not hardware. In Version 7.7, IBM introduced software capabilities to enable the underlying virtualized storage to attach to IBM SAN Volume Controller by using the iSCSI protocol.
The iSCSI protocol enables the transport of SCSI commands and data over an IP network (TCP/IP), which is based on IP routers and Ethernet switches. iSCSI is a block-level protocol that encapsulates SCSI commands. Therefore, it uses an existing IP network rather than FC infrastructure.
The major functions of iSCSI include encapsulation and the reliable delivery of command descriptor block (CDB) transactions between initiators and targets through the IP network, especially over a potentially unreliable IP network.
Every iSCSI node in the network must have an iSCSI name and address:
An iSCSI name is a location-independent, permanent identifier for an iSCSI node. An iSCSI node has one iSCSI name, which stays constant for the life of the node. The terms initiator name and target name also refer to an iSCSI name.
An iSCSI address specifies the iSCSI name of an iSCSI node and a location of that node. The address consists of a host name or IP address, a TCP port number (for the target), and the iSCSI name of the node. An iSCSI node can have any number of addresses, which can change at any time, particularly if they are assigned by way of Dynamic Host Configuration Protocol (DHCP). An IBM SAN Volume Controller node represents an iSCSI node and provides statically allocated IP addresses.
IBM SAN Volume Controller models SV2 and SA2 support 25 Gbps Ethernet adapters that provide both iSCSI and iSCSI Extensions over RDMA (iSER) connections.
iSER is a network protocol that extends the iSCSI protocol to use RDMA. You can implement RDMA-based connections that use Ethernet networking structures and connections without upgrading current hardware. Currently, the system supports RDMA-based connections with RDMA over Converged Ethernet (RoCE) or Internet-Wide Area RDMA Protocol (iWARP).
For host attachment, these 25 Gbps adapters support iSCSI and RDMA-based connections; however, for external storage systems, only iSCSI connections are supported through these adapters. When the 25 Gbps adapter is installed on nodes in the system, RDMA technology can be used for node-to-node communications.
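The following sketch shows how an Ethernet port might be prepared for iSCSI host attachment and how an iSCSI host is then defined. The IP addresses, port ID, and IQN are placeholders:
cfgportip -node node1 -ip 10.10.10.11 -mask 255.255.255.0 -gw 10.10.10.1 1   # assign an iSCSI IP address to Ethernet port 1
mkhost -name linuxhost01 -iscsiname iqn.1994-05.com.redhat:linuxhost01       # define an iSCSI-attached host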
1.5.16 IBM Real-time Compression
IBM Real-time Compression (RtC) is an attractive solution to address the increasing requirements for data storage, power, cooling, and floor space. When applied, RtC can save storage space so that more data can be stored and fewer storage enclosures are required to store a data set.
 
Note: IBM SAN Volume Controller models SV2 and SA2 do not support RtC software compression. They support only the newer DRP software compression.
RtC provides the following benefits:
Compression for active primary data. RtC can be used with active primary data.
Compression for replicated/mirrored data. Remote volume copies can be compressed in addition to the volumes at the primary storage tier. This process also reduces storage requirements in MM and GM destination volumes.
No changes to the existing environment are required. RtC is part of the storage system.
Overall savings in operational expenses. More data is stored and fewer storage expansion enclosures are required. Reducing rack space has the following benefits:
 – Reduced power and cooling requirements. More data is stored in a system, requiring less power and cooling per gigabyte or used capacity.
 – Reduced software licensing for more functions in the system. More data is stored per enclosure, which reduces the overall spending on licensing.
Disk space savings are immediate. The space reduction occurs when the host writes the data. This process is unlike other compression solutions in which some or all the reduction is realized only after a post-process compression batch job is run.
When compression is applied, it is a best practice to monitor the overall performance and CPU utilization. Compression can be enabled without disruption to the environment and can be used while storage processes are running.
1.5.17 Data Reduction Pools
Data Reduction Pools (DRPs) represent a significant enhancement to the storage pool concept, which until now was primarily a simple virtualization layer that performs lookups between virtual and physical extents.
DRP is a new type of storage pool, implementing techniques such as thin-provisioning, compression, and deduplication to reduce the amount of physical capacity that is required to store data. Savings in storage capacity requirements translate into the reduction of the cost of storing the data.
With DRPs, you can automatically de-allocate and reclaim the capacity of thin-provisioned volumes that contain deleted data and enable this reclaimed capacity to be reused by other volumes. Data reduction also provides better performance for compressed volumes because of the implementation of the new log-structured array.
Deduplication
Data deduplication is one of the methods of reducing storage needs by eliminating redundant copies of data. Data reduction is a way to decrease the storage disk infrastructure that is required, optimize the usage of existing storage disks, and improve data recovery infrastructure efficiency. Existing data or new data is standardized into chunks that are examined for redundancy. If data duplicates are detected, then pointers are shifted to reference a single copy of the chunk, and the duplicate data sets are then released.
To estimate potential capacity savings that data reduction can provide on the system, use the Data Reduction Estimation Tool (DRET). The tool scans target workloads on all attached storage arrays, consolidates these results, and generates an estimate of potential data reduction savings for the entire system.
DRET is available at IBM Fix Central.
1.5.18 IP replication
IP replication was introduced in Version 7.2, and it enables data replication between IBM Spectrum Virtualize family members. IP replication uses the IP-based ports of the cluster nodes.
The IP replication function is transparent to servers and applications, just as traditional FC-based mirroring is. All remote mirroring modes (MM, GM, and GMCV) are supported.
The configuration of the system is straightforward, and IBM Spectrum Virtualize family systems normally “find” each other in the network and can be selected from the GUI.
IP replication includes Bridgeworks SANSlide network optimization technology, and it is available at no additional charge. Remember, remote mirror is a chargeable option, but the price does not change with IP replication. Existing remote mirror users have access to the function at no additional charge.
IP connections that are used for replication can have long latency (the time to transmit a signal from one end to the other), which can be caused by distance or by many “hops” between switches and other appliances in the network. Traditional replication solutions transmit data, wait for a response, and then transmit more data, which can result in network utilization as low as 20% (based on IBM measurements). This effect worsens as latency increases.
Bridgeworks SANSlide technology, which is integrated with the IBM Spectrum Virtualize family, requires no separate appliances and incurs no extra cost and configuration steps. It uses artificial intelligence (AI) technology to transmit multiple data streams in parallel, adjusting automatically to changing network environments and workloads.
SANSlide improves network bandwidth utilization up to 3x. Therefore, customers can deploy a less costly network infrastructure, or take advantage of faster data transfer to speed replication cycles, improve remote data currency, and enjoy faster recovery.
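Creating an IP partnership is a single command that is run on each of the two systems; the following is a sketch in which the remote cluster IP address, bandwidth, and copy rate are placeholders:
mkippartnership -type ipv4 -clusterip 192.168.10.20 -linkbandwidthmbits 100 -backgroundcopyrate 50
lspartnership    # the partnership shows as fully_configured after both systems are configured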
1.5.19 IBM Spectrum Virtualize copy services
IBM Spectrum Virtualize supports the following copy services functions:
Remote copy (synchronous or asynchronous)
FlashCopy (PiT copy) and Transparent Cloud Tiering (TCT)
HyperSwap
Copy services functions are implemented within a single IBM SAN Volume Controller, or between multiple members of the IBM Spectrum Virtualize family.
The copy services layer sits above and operates independently of the function or characteristics of the underlying disk subsystems that are used to provide storage resources to an IBM SAN Volume Controller.
1.5.20 Synchronous or asynchronous Remote Copy
The general application of remote copy seeks to maintain two copies of data. Often, the two copies are separated by distance, but not always. The remote copy can be maintained in synchronous or asynchronous mode. In IBM Spectrum Virtualize, Metro Mirror and Global Mirror are the IBM branded terms for synchronous remote copy and asynchronous remote copy, respectively.
Synchronous remote copy ensures that updates are committed at both the primary and the secondary volumes before the application considers the updates complete. Therefore, the secondary volume is fully up to date if it is needed in a failover. However, the application is fully exposed to the latency and bandwidth limitations of the communication link to the secondary volume. In a truly remote situation, this extra latency can have a significant adverse effect on application performance.
Special configuration guidelines exist for SAN fabrics and IP networks that are used for data replication. Consider the distance and available bandwidth of the intersite links.
A function of Global Mirror for low bandwidth was introduced in IBM Spectrum Virtualize 6.3. It uses change volumes that are associated with the primary and secondary volumes. These point-in-time copies are used to record changes to the remote copy volumes; a FlashCopy mapping exists between each primary volume and its change volume, and between each secondary volume and its change volume. This function is called Global Mirror with change volumes (cycling mode).
Figure 1-20 shows an example of this function where you can see the relationship between volumes and change volumes.
Figure 1-20 Global Mirror with change volumes
In asynchronous remote copy, the write is acknowledged to the application as complete before the write is committed at the secondary volume. Therefore, on a failover, specific updates (data) might be missing at the secondary volume. The application must have an external mechanism for recovering the missing updates, if possible. This mechanism can involve user intervention. Recovery on the secondary site involves starting the application on this recent backup, and then rolling forward or backward to the most recent commit point.
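The following CLI sketch shows how MM, GM, and GM with change volumes relationships might be created between two partnered systems. The volume, change volume, relationship, and remote system names are placeholders, and the exact parameters vary by code level:
mkrcrelationship -master vol01 -aux vol01_dr -cluster remote_system -name mm_rel01           # Metro Mirror (synchronous)
mkrcrelationship -master vol02 -aux vol02_dr -cluster remote_system -global -name gm_rel01   # Global Mirror (asynchronous)
chrcrelationship -masterchange vol02_cv gm_rel01   # associate a change volume with the master volume
chrcrelationship -cyclingmode multi gm_rel01       # enable Global Mirror with change volumes (cycling mode)
startrcrelationship mm_rel01                       # begin the initial synchronization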
1.5.21 FlashCopy and Transparent Cloud Tiering
FlashCopy and Transparent Cloud Tiering (TCT) are used to make a copy of a source volume on a target volume. After the copy operation starts, the original content of the target volume is lost, and the target volume has the contents of the source volume as they existed at a single Point in Time (PiT). Although the copy operation takes time, the resulting data at the target appears as though the copy was made instantaneously.
FlashCopy
FlashCopy is sometimes described as an instance of a time-zero (T0) copy or a Point in Time (PiT) copy technology.
FlashCopy can be performed on multiple source and target volumes. FlashCopy enables management operations to be coordinated so that a common single PiT is chosen for copying target volumes from their respective source volumes.
With IBM Spectrum Virtualize, multiple target volumes can undergo FlashCopy from the same source volume. This capability can be used to create images from separate PiTs for the source volume, and to create multiple images from a source volume at a common PiT.
Reverse FlashCopy enables target volumes to become restore points for the source volume without breaking the FlashCopy relationship, and without waiting for the original copy operation to complete. IBM Spectrum Virtualize supports multiple targets and multiple rollback points.
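A basic FlashCopy mapping can be created and started as shown in the following sketch. The volume and mapping names are placeholders, and the target volume must already exist and be at least as large as the source:
mkfcmap -source vol01 -target vol01_copy -name fcmap01 -copyrate 50   # define the PiT relationship
startfcmap -prep fcmap01                                              # prepare (flush cache) and trigger the T0 copy
lsfcmap fcmap01                                                       # monitor the background copy progress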
Most clients aim to integrate the FlashCopy feature for PiT copies and quick recovery of their applications and databases. An IBM solution for this goal is provided by IBM Spectrum Protect.
Transparent Cloud Tiering
IBM Spectrum Virtualize Transparent Cloud Tiering (TCT) is an alternative solution for data protection, backup, and restore that interfaces to Cloud Services Providers (CSPs), such as IBM Cloud®. The TCT function helps organizations to reduce costs that are related to power and cooling when offsite data protection is required to send sensitive data out of the main site.
TCT uses IBM FlashCopy techniques that provide full and incremental snapshots of several volumes. Snapshots are encrypted and compressed before being uploaded to the cloud. Reverse operations are also supported within that function. When a set of data is transferred out to cloud, the volume snapshot is stored as object storage.
IBM Cloud Object Storage uses an innovative and cost-effective approach to store large amounts of unstructured data, and delivers mechanisms to provide security services, HA, and reliability.
The management GUI provides an easy-to-use initial setup, advanced security settings, and audit logs that record all backup and restore operations to the cloud.
To learn more about IBM Cloud Object Storage, see this web page.
HyperSwap
The IBM HyperSwap function is an HA feature that provides dual-site access to a volume. When you configure a system with a HyperSwap topology, the system configuration is split between two sites for data recovery, migration, or HA use cases.
When a HyperSwap topology is configured, each node, external storage system, and host in the system configuration must be assigned to one of the sites in the topology. Both nodes of an I/O group must be at the same site. This site must be the same site as the external storage systems that provide the managed disks to that I/O group. When managed disks are added to storage pools, their site attributes must match. This requirement ensures that each copy in a HyperSwap volume is fully independent and is at a distinct site.
When the system is configured between two sites, HyperSwap volumes have a copy at one site and a copy at another site. Data that is written to the volume is automatically sent to both copies. If one site is no longer available, the other site can provide access to the volume. If you are using ownership groups to manage access to HyperSwap volumes, both volume copies and users who access them must be assigned to the same ownership group.
A 2-site HyperSwap configuration can be extended to a third site for DR by using the IBM Spectrum Virtualize 3-Site Orchestrator.
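Configuring a HyperSwap topology and creating a HyperSwap volume can be sketched as follows. The site, node, pool, and size values are placeholders, every node, controller, and host must be assigned to a site before the topology is changed, and the exact syntax varies by code level:
chnode -site site1 node1           # assign each node to a site (repeat for all nodes, controllers, and hosts)
chnode -site site2 node2
chsystem -topology hyperswap       # switch the system to the HyperSwap topology
mkvolume -pool Pool_site1:Pool_site2 -size 100 -unit gb -name hs_vol01   # create a volume with a copy at each site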
1.6 Business continuity
In simple terms, a clustered system or system is a collection of servers that together provide a set of resources to a client. The key point is that the client has no knowledge of the underlying physical hardware of the system. The client is isolated and protected from changes to the physical hardware. This arrangement offers many benefits including, most significantly, HA.
Resources on the clustered system act as HA versions of unclustered resources. If a node (an individual computer) in the system is unavailable or too busy to respond to a request for a resource, the request is passed transparently to another node that can process the request. The clients are “unaware” of the exact locations of the resources that they use.
The IBM SAN Volume Controller is a collection of up to eight nodes, which are added in pairs that are known as I/O groups. These nodes are managed as a set (system), and they present a single point of control to the administrator for configuration and service activity.
The eight-node limit for an IBM SAN Volume Controller system is a limitation that is imposed by the Licensed Internal Code, and not a limit of the underlying architecture. Larger system configurations might be available in the future.
Although the IBM SAN Volume Controller code is based on a purpose-optimized Linux kernel, the clustered system feature is not based on Linux clustering code. The clustered system software within the IBM SAN Volume Controller (that is, the event manager cluster framework) is based on the outcome of the COMPASS research project. It is the key element that isolates the IBM SAN Volume Controller application from the underlying hardware nodes.
The clustered system software makes the code portable. It provides the means to keep the single instances of the IBM SAN Volume Controller code that are running on separate systems’ nodes in sync. Therefore, restarting nodes during a code upgrade, adding nodes, removing nodes from a system, or failing nodes cannot affect IBM SAN Volume Controller availability.
All active nodes of a system must know that they are members of the system. This knowledge is especially important in situations where it is key to have a solid mechanism to decide which nodes form the active system, such as the split-brain scenario where single nodes lose contact with other nodes. A worst case scenario is a system that splits into two separate systems.
Within an IBM SAN Volume Controller system, the voting set and a quorum disk are responsible for the integrity of the system. If nodes are added to a system, they are added to the voting set. If nodes are removed, they are removed quickly from the voting set. Over time, the voting set and the nodes in the system can change so that the system migrates onto a separate set of nodes from the set on which it started.
The IBM SAN Volume Controller clustered system implements a dynamic quorum. Following a loss of nodes, if the system can continue to operate, it adjusts the quorum requirement so that further node failure can be tolerated.
The node with the lowest Node Unique ID in a system becomes the boss node for the group of nodes. It determines (from the quorum rules) whether the nodes can operate as the system. This node also presents a maximum of two cluster IP addresses on one or both of its Ethernet ports to enable access for system management.
1.6.1 Business continuity with Stretched Clusters
Within standard implementations of the IBM SAN Volume Controller, all the I/O group nodes are physically installed in the same location. To supply the different HA needs that customers have, the stretched system configuration was introduced. In this configuration, each node (from the same I/O group) on the system is physically on a different site. When implemented with mirroring technologies, such as volume mirroring or copy services, these configurations can be used to maintain access to data on the system if there are power failures or site-wide outages.
Stretched Clusters are considered HA solutions because both sites work as instances of the production environment (there is no standby location). Combined with application and infrastructure layers of redundancy, Stretched Clusters can provide enough protection for data that requires availability and resiliency.
In a Stretched Cluster configuration, nodes within an I/O group can be separated by a distance of up to 10 km (6.2 miles) by using specific configurations. You can use FC inter-switch links (ISLs) in the paths between nodes of the same I/O group. In this case, nodes can be separated by a distance of up to 300 km (186.4 miles); however, potential performance impacts can result.
1.6.2 Business continuity with Enhanced Stretched Cluster
Enhanced Stretched Cluster (ESC) further improves Stretched Cluster configurations with the site awareness concept for nodes, hosts, and external storage systems. It also provides a feature that enables you to manage rolling disaster scenarios effectively.
The site awareness concept enables more efficiency for host I/O traffic through the SAN, and an easier host path management.
When an IP-based quorum application is used as the quorum device for the third site, no FC connectivity is required. Java applications run on hosts at the third site.
 
Note: Stretched Cluster and ESC features are supported for IBM SAN Volume Controller only. They are not supported for the IBM FlashSystem family of products.
For more information and implementation guidelines about deploying Stretched Cluster or ESC, see IBM Spectrum Virtualize and SAN Volume Controller Enhanced Stretched Cluster with VMware, SG24-8211.
1.6.3 Business continuity with HyperSwap
The HyperSwap HA feature in the IBM Spectrum Virtualize software enables business continuity during hardware failure, power failure, connectivity failure, or disasters, such as fire or flooding. The HyperSwap feature is available on the IBM SAN Volume Controller and IBM FlashSystem products that are running IBM Spectrum Virtualize software.
The HyperSwap feature provides HA volumes that are accessible through two sites at up to 300 km apart. A fully independent copy of the data is maintained at each site. When data is written by hosts at either site, both copies are synchronously updated before the write operation is completed. The HyperSwap feature automatically optimizes itself to minimize data that is transmitted between sites and to minimize host read and write latency.
HyperSwap includes the following key features:
Works with IBM SAN Volume Controller and IBM FlashSystem products that are running IBM Spectrum Virtualize software.
Uses intra-cluster synchronous remote copy (MM) capabilities along with change volume and access I/O group technologies.
Makes a host’s volumes accessible across two I/O groups in a clustered system by using the MM relationship in the background. They look like a single volume to the host.
Works with the standard multipathing drivers that are available on various host types, with no extra host support required to access the HA volume.
For more information about HyperSwap implementation use cases and guidelines, see the following publications:
IBM Storwize V7000, Spectrum Virtualize, HyperSwap, and VMware Implementation, SG24-8317
High Availability for Oracle Database with IBM PowerHA SystemMirror and IBM Spectrum Virtualize HyperSwap, REDP-5459
IBM Spectrum Virtualize HyperSwap SAN Implementation and Design Best Practices, REDP-5597
1.6.4 Business continuity with three-site replication
Solutions have been available for some time that use the IBM SAN Volume Controller stretched cluster to provide HA across two sites, combined with Global Mirror or Global Mirror with Change Volumes to replicate to a third site. However, this type of implementation requires manual intervention and custom scripts, which increases management complexity.
Another way to implement a three-site replication solution became available in limited deployments with code version 8.3.1. Data is replicated from the primary site to two alternative sites, and the two remaining sites are aware of the data differences between themselves. This configuration ensures that if a disaster occurs at any one of the sites, the remaining two sites can establish a consistent_synchronized remote copy relationship between themselves with minimal data transfer; that is, within the expected RPO.
Spectrum Virtualize version 8.4 expands the three-site replication model to include HyperSwap, which improves data availability options in three-site implementations. Systems that are configured in a three-site topology have high DR capabilities, but a disaster might take the data offline until the system can be failed over to an alternative site. HyperSwap allows active-active configurations to maintain data availability, eliminating the need to fail over if communications are disrupted. This approach provides a more robust environment, allowing up to 100% uptime for data, and the recovery options that are inherent to DR solutions.
To better assist with three-site replication solutions, IBM Spectrum Virtualize 3-Site Orchestrator coordinates replication of data for DR and HA scenarios between systems.
IBM Spectrum Virtualize 3-Site Orchestrator is a command-line based application that runs on a separate Linux host that configures and manages supported replication configurations on IBM Spectrum Virtualize products.
Figure 1-21 shows the two supported topologies for the coordinated three-site replication solutions.
Figure 1-21 “Star” and “Cascade” modes in a three-site solution
1.6.5 Automatic hot spare nodes
In earlier stages of IBM SAN Volume Controller development, the scripted warm standby procedure enabled administrators to configure spare nodes in a cluster by using the concurrent hardware upgrade capability of transferring WWPNs between nodes. Starting in version 8.2, the system can automatically use a spare node to replace a failed node in a cluster or to keep the whole system online during maintenance tasks, such as software upgrades. These extra nodes are called hot spare nodes.
Up to four hot spare nodes can be added to a single cluster. When a hot spare node is used to replace a node, the system attempts to find a spare node that matches the configuration of the replaced node perfectly. However, if a perfect match does not exist, the system continues the configuration check until matching criteria are found. The following criteria are used by the system to determine suitable hot spare nodes:
Criteria that requires an exact match
 – Memory capacity
 – Fibre Channel port ID
 – Compression support
 – Site
Criteria that are recommended to match, but can be different
 – Hardware type
 – CPU count
 – Number of Fibre Channel ports
If the recommended criteria do not match exactly, the system relaxes them until the minimal configuration is found. For example, if the Fibre Channel ports do not match exactly but all the other required criteria match, the hot spare node can still be used. The minimal configuration that the system can use as a hot spare node includes identical memory, site, Fibre Channel port ID, and, if applicable, compression settings.
If the nodes on the system support and are licensed to use encryption, the hot-spare node must also support and be licensed to use encryption.
The hot spare node essentially becomes another node in the cluster, but it does nothing under normal conditions. Only when it is needed does it use the N_Port ID Virtualization (NPIV) feature of the IBM Spectrum Virtualize virtualized storage ports to take over the job of the failed node by moving the NPIV WWPNs from the failed node first to the surviving partner node in the I/O group and then over to the hot spare node.
Approximately 1 minute passes intentionally before a cluster swaps in a node to avoid any thrashing when a node fails. In addition, the system must be sure that the node definitely failed and is not (for example) restarting. The cache flushes while only one node is in the I/O group; the full cache is restored when the spare swaps in.
This entire process is transparent to the applications; however, the host systems notice a momentary path loss for each transition. The persistence of the NPIV WWPNs lessens the multipathing effort on the host considerably during path recovery.
 
Note: A warm start of active node (code assert or restart) does not cause the hot spare to swap in because the restarted node becomes available within 1 minute.
The other use case for hot spare nodes is during a software upgrade. Normally, the only impact during an upgrade is slightly degraded performance: while the node that is upgrading is down, its partner in the I/O group operates in write-through mode and handles both nodes’ workload. To work around this issue, the cluster uses a spare in place of the node that is upgrading. The cache does not need to go into write-through mode, and the period of degraded performance from running on a single node in the I/O group is significantly reduced.
After the upgraded node returns, it is swapped back so that you roll through the nodes as normal, but without any failover and failback at the multipathing layer. This process is handled by the NPIV ports, so the upgrades should be seamless for administrators working in large enterprise IBM SAN Volume Controller deployments.
 
Note: After the cluster commits new code, it also automatically upgrades hot spares to match the cluster code level.
This feature is available to IBM SAN Volume Controller only. Although FlashSystem systems can use NPIV and realize the general failover benefits, no hot spare canister or split I/O group option is available for the enclosure-based systems.
1.7 Management and support tools
The IBM Spectrum Virtualize system can be managed through the included management software that runs on the IBM SAN Volume Controller hardware.
1.7.1 IBM Assist On-site and Remote Support Assistance
With the IBM Assist On-site tool, a member of the IBM Support team can view your desktop and share control of your server to provide you with a solution. This tool is a remote desktop-sharing solution that is offered through the IBM website. With it, the IBM System Services Representative (IBM SSR) can remotely view your system to troubleshoot a problem.
You can maintain a chat session with the IBM SSR so that you can monitor this activity and either understand how to fix the problem yourself or enable them to fix it for you.
For more information, see IBM remote assistance: Assist On-site.
When you access the website, you sign in and enter a code that the IBM SSR provides to you. This code is unique to each IBM Assist On-site session. A plug-in is downloaded to connect you and your IBM SSR to the remote service session. The IBM Assist On-site tool contains several layers of security to protect your applications and your computers. The plug-in is removed after the next restart.
You can also use security features to restrict access by the IBM SSR. Your IBM SSR can provide you with more detailed instructions for using the tool.
Embedded in the IBM SAN Volume Controller V8.4 code is a software toolset that is called Remote Support Client. It establishes a network connection over a secured channel with the Remote Support Server in the IBM network. The Remote Support Server provides predictive analysis of the IBM SAN Volume Controller status and assists administrators with troubleshooting and fix activities. Remote Support Assistance is available at no extra charge, and no extra license is needed.
1.7.2 Event notifications
IBM SAN Volume Controller can use SNMP traps, syslog messages, and a Call Home email to notify you and the IBM Support Center when significant events are detected. Any combination of these notification methods can be used simultaneously.
Notifications are normally sent immediately after an event is raised. Each event that IBM SAN Volume Controller detects is assigned a notification type of Error, Warning, or Information. You can configure the IBM SAN Volume Controller to send each type of notification to specific recipients.
Simple Network Management Protocol traps
SNMP is a standard protocol for managing networks and exchanging messages.
IBM Spectrum Virtualize can send SNMP messages that notify personnel about an event. You can use an SNMP manager to view the SNMP messages that IBM Spectrum Virtualize sends. You can use the management GUI or the CLI to configure and modify your SNMP settings.
You can use the MIB file for SNMP to configure a network management program to receive SNMP messages that are sent by IBM Spectrum Virtualize.
Syslog messages
The syslog protocol is a standard protocol for forwarding log messages from a sender to a receiver on an IP network. The IP network can be either Internet Protocol Version 4 (IPv4) or Internet Protocol Version 6 (IPv6).
IBM SAN Volume Controller can send syslog messages that notify personnel about an event. The event messages can be sent in either expanded or concise format. You can use a syslog manager to view the syslog messages that IBM SAN Volume Controller sends.
IBM Spectrum Virtualize uses UDP to transmit the syslog message. You can use the management GUI or the CLI to configure and modify your syslog settings.
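Both notification targets can be defined from the CLI, as in the following sketch; the IP addresses and the community string are placeholders:
mksnmpserver -ip 192.168.1.50 -community public   # add an SNMP trap destination
mksyslogserver -ip 192.168.1.51                   # add a syslog destination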
Call Home
Call Home notification improves the response time for issues on the system. Call Home notifications send diagnostic data to support personnel, who can quickly determine solutions for problems that can disrupt operations on the system.
Call Home email
This feature transmits operational and error-related data to you and IBM through a Simple Mail Transfer Protocol (SMTP) server connection in the form of an event notification email. You can use the Call Home function if you have a maintenance contract with IBM or if the IBM SAN Volume Controller is within the warranty period.
To send email, you must configure at least one SMTP server. You can specify as many as five more SMTP servers for backup purposes. The SMTP server must accept the relaying of email from the IBM SAN Volume Controller clustered system IP address. Then, you can use the management GUI or the CLI to configure the email settings, including contact information and email recipients. Set the reply address to a valid email address.
Send a test email to check that all connections and infrastructure are set up correctly. You can disable the Call Home function at any time by using the management GUI or CLI.
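A minimal Call Home email configuration from the CLI might look like the following sketch. The SMTP server address, contact details, and recipient address are placeholders:
mkemailserver -ip 192.168.1.60 -port 25                                                         # define the SMTP relay
chemail -reply storage.admin@example.com -contact "Jane Doe" -primary 5550100 -location "DC1"   # set contact information
mkemailuser -address storage.team@example.com -usertype local -error on -warning on             # add a local recipient
testemail -all                                                                                  # send a test notification to all recipients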
Cloud Call Home
Call Home with cloud services sends notifications directly to a centralized file repository that contains troubleshooting information that is gathered from customers. Support personnel can access this repository and be assigned issues automatically as problem reports. This method of transmitting notifications from the system to support removes the need for customers to create problem reports manually.
Call Home with cloud services also eliminates the possibility of email filters dropping notifications to and from support, which can delay the resolution of problems on the system. Call Home with cloud services uses Representational State Transfer (RESTful) APIs, which are a standard for transmitting data through web services.
For new system installations, Call Home with cloud services is configured as the default method to transmit notifications to support. When you update the system software, Call Home with cloud services is also set up automatically. You must ensure that network settings are configured to allow connections to the support center. This method sends notifications to only the predefined support center.
To use Call Home with cloud services, ensure that all of the nodes on the system have internet access, and that a valid service IP is configured on each node on the system. In addition to these network requirements, you must configure suitable routing to the support center through a domain name service (DNS) or by updating your firewall configuration so it includes connections to the support center.
After a DNS server is configured, update your network firewall settings to allow outbound traffic to esupport.ibm.com on port 443.
If you are not using DNS but have a firewall to protect your internal network from outside traffic, you must enable certain IP addresses and ports to establish a connection to the support center. Ensure that your network firewall allows outbound traffic to the following IP addresses on port 443:
129.42.56.189
129.42.54.189
129.42.60.189
 
You can configure either of these methods or configure both for redundancy. DNS is the preferred method because it ensures that the system can still connect to the support center if the underlying IP addresses to the support center change.
1.8 Useful IBM SAN Volume Controller web links
For more information about the IBM SAN Volume Controller-related topics, see the following web pages:
IBM Documentation
 