Failover Solutions

In this chapter, the topic of failover solutions purposely follows the backup and recovery topics. Backup and recovery are the foundation of any well-run data center and are aimed at protecting and restoring business data. Failover solutions protect the application processes and guard against failures in the data center, so they are typically implemented only after the more basic operations and processes for data management are under IT administrative control.

If implemented correctly, failover solutions can speed up or automate recovery from specific failure scenarios, such as hardware or software failures, helping maintain high levels of availability. If implemented and managed poorly, however, failover solutions can cause more downtime than a stand-alone server with a few redundant components.

This section presents hardware HA clustering solutions with SAP, for both Unix and Windows NT/2000 configurations, with a focus on protecting the single points of failure within an SAP system. A database failover solution using a standby database is also presented. In addition, solutions to reach fault-tolerant levels of SAP application availability, including process and system mirroring, are presented toward the end of this chapter. In this book, the terms “HA clustering” and “failover” are interchangeable. Clustering, however, can also be used for performance, which is not discussed.

Vulnerable mySAP.com Components

The mySAP.com environment has many components, some of which are single points of failure in the access path to the business data. It is important to identify which of the components need protection with failover solutions.

To see the software SPOFs, a simple example of a mySAP.com-based e-commerce solution can be used. This can consist of the SAP BBP or Business-to-Business Procurement application with a catalog database, along with an SAP R/3 back-end system, perhaps with mySAP.com for Retail functionality and the SAP Online Store. The mySAP.com Internet or Workplace Middleware are needed for accessing the business applications via a web browser and for presenting the SAP Online Store web site. The mySAP.com Workplace Server is needed for central user administration. Figure 5-5 shows three different software layers for this e-commerce example and their critical, nonredundant components in dashed lines.

Figure 5-5. mySAP.com Components and SPOFs


For the mySAP.com Internet Middleware layer, the web servers and ITS WGate components can be made redundant through replication, so they do not require any failover or clustering solutions. Load balancing devices can be used to direct the web traffic to the available web servers, which is described in more detail in Chapter 12.

The ITS AGate, however, is slightly different. Multiple AGates can be used, but for high-availability purposes it is vital to place the AGates on at least two different physical servers. Because each AGate holds user session contexts in its server's memory, an additional or higher level of availability could be considered beyond simply using two servers.

The mySAP.com Workplace Server may offer some form of its own replication, so it is shown twice. It could also be configured only once and reside in a failover configuration.

The mySAP.com Business Applications are typically built on the SAP Basis kernel architecture, so they each have application work processes (dialog, batch, update). Their single-point-of-failure components are the Enqueue and Message Servers, as well as each of their databases. These are the classic SAP system components typically included in hardware HA clustering configurations.

Clustering Basics for SAP

Clustering products can improve the availability of an SAP system, providing fast and automatic recovery of some types of failures. When implemented properly, there are some useful benefits to using clustering with SAP applications. Specifically, achieving a high level of availability with a limited but well-trained IT staff is possible. Many failures can be recovered from automatically, for example, and some upgrade scenarios become possible to do online by performing upgrades on the idle node(s) first.

A clustering solution does NOT guarantee application availability for all types of failures, however, nor does it provide more performance in the case of an SAP cluster. For example, logical errors within the database, such as corrupt or missing tables or data, are not protected against. HA clustering solutions are designed only to restart the application(s), not to increase their performance. It's important to identify when clustering may be effective, and for what situations additional investments must be made to improve the availability of the application and the data.

Clustering solutions are not standard and do require a nontrivial effort to install, configure, and maintain. In addition, troubleshooting in a clustering environment adds another level of complexity. Given this complexity, when the software integration is suspect, SAP's troubleshooting procedure may be to request that the SAP application be removed from the cluster and returned to a nonfailover status. This may also be required in some upgrade situations.

What Is a Cluster for SAP?

A simple cluster solution for SAP uses two or more server nodes and a shared storage system to protect the software and hardware SPOFs. In case of a failure, the clustering software solution tries to restart the SAP or database applications on the remaining server nodes.

Clustering uses software heartbeats to detect failed applications or servers. In case of a server hardware failure, it employs a shared nothing clustering architecture that automatically transfers ownership of resources (such as disk drives and IP addresses) from a failed server to a surviving server. It then restarts the failed server's workload on the surviving server. (Another way to describe this is that only one server may have access or a lock on a shared resource, and the clustering software manages which server has access to which cluster resources.) All of this—from detection to restart—typically takes under a minute for basic resources, such as disks and IP addresses. If an individual application fails (but the server does not), the failover software will typically try to restart the application on the same server; if that fails, it then moves the application's resources and restarts it on the other server.
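To make the detection-and-restart sequence concrete, the following sketch models a heartbeat monitor and failover decision in Python. It is a conceptual illustration only, not the logic of any particular clustering product; the timeout value and node handling are assumptions.

  import time

  HEARTBEAT_TIMEOUT = 10   # seconds without a heartbeat before a node is declared failed (assumed value)

  class Node:
      def __init__(self, name):
          self.name = name
          self.last_heartbeat = time.time()
          self.resources = []           # resource groups (disks, IP addresses, services) owned by this node

      def alive(self, now):
          return (now - self.last_heartbeat) < HEARTBEAT_TIMEOUT

  def monitor_pass(nodes):
      """One pass of a simplified cluster manager: declare silent nodes failed and
      transfer ownership of their resources to a surviving node, then restart them there."""
      now = time.time()
      survivors = [n for n in nodes if n.alive(now)]
      for node in nodes:
          if node.alive(now) or not node.resources or not survivors:
              continue
          target = survivors[0]
          print(f"{node.name} failed; moving {node.resources} to {target.name}")
          target.resources.extend(node.resources)   # take over disks, IP addresses, services
          node.resources = []

A real cluster manager additionally handles local restarts of a failed application on the same node, quorum checks, and resource dependencies (for example, a shared disk must be mounted before the database service that uses it is started).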

A clustering solution protects against hardware SPOFs, which are the nonredundant components, such as system boards, CPUs, memory, and so on, within each server node. Additional hardware SPOFs include the power to each server, the network connections if not redundant, and a few others. Operating system errors on one node, for example due to bad patches or service packs, can also be protected against.

A clustering solution can also failover and restart the software SPOFs, which more specifically include the following for the SAP application:

  • SAP hostname and IP address resources

  • Shared disk with the /usr/sap directory structure for the Central Instance

  • SAP Central Instance application executables (Message, Enqueue, and others)

  • File share or NFS-links for the Central Instance profiles (SAPMNT)

  • File share for the SAP Transport Host, if used

The database can be considered as a separate application from a clustering management point of view. The specific SPOFs for the database are

  • The DB hostname and IP address resources;

  • The database application executables (running services, processes, agents, etc.);

  • The shared disk volumes for the logs and for the data files; and

  • The shared disk volumes for the executables (if not locally available).

These SPOFs are considered the resources to be managed in a cluster environment. These resources are grouped together in managed packages or resource groups.
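As a rough illustration of how these SPOFs map onto managed resource groups, the sketch below models the CI and DB packages as simple data structures. All names, paths, and addresses are invented for illustration and do not follow any particular cluster product's configuration syntax.

  from dataclasses import dataclass, field

  @dataclass
  class ResourceGroup:
      name: str
      virtual_hostname: str                              # relocatable hostname that moves with the group
      virtual_ip: str                                    # relocatable IP address
      shared_disks: list = field(default_factory=list)   # disk volumes only one node may own at a time
      services: list = field(default_factory=list)       # application processes to start on the owning node

  # SAP Central Instance package (illustrative values)
  ci_package = ResourceGroup(
      name="CI", virtual_hostname="sapci", virtual_ip="10.1.1.10",
      shared_disks=["/usr/sap"],
      services=["message server", "enqueue server", "SAPMNT file share"],
  )

  # Database package (illustrative values)
  db_package = ResourceGroup(
      name="DB", virtual_hostname="sapdb", virtual_ip="10.1.1.11",
      shared_disks=["/db/data", "/db/logs"],
      services=["database processes and listener"],
  )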

Typical Cluster Configuration

Shown in Figure 5-6 is a typical HA cluster configuration for an SAP system, with two managed packages or resource groups. One package is the database group of resources (DB), and the other is the SAP Central Instance group of resources (CI). In addition, the hardware configuration has two server nodes, networks, and a shared disk system along with redundant SCSI or Fibre Channel I/O paths.

Figure 5-6. Typical Cluster Configuration


The DB package contains all of the shared disks, such as the log and data disks as shown, as well as the DB hostname, IP address, and application services. The SAP CI package contains the shared disk assigned uniquely to SAP (/usr/sap), plus its own hostname, IP address, and application services (application server work processes, Message and Enqueue Servers, etc.). Sometimes both the DB and the CI resources are managed as one complete failover resource group or package. This is simply a matter of configuration and is considered as the DB/CI package in this chapter.

Cluster Lock or Quorum Disk

An additional disk is shown, called the cluster lock or quorum disk. Each cluster solution must manage the exclusive locks of shared resources by the server nodes to guarantee the consistency of operation. Each shared resource can only be accessed by one server, regardless of how many server nodes are in the cluster. For example, if two servers were to write to a shared disk at the same time and location, data integrity would be compromised. A quorum or lock-management database is used to keep track of these resource ownership assignments.

This special quorum database can be maintained on one physical disk, as is common in two-node cluster configurations, but as such it represents a single point of failure. A more sophisticated way to protect this quorum or lock-management database is to distribute it redundantly among all of the server nodes in the cluster, which multi-node clustering solutions typically do.

Networks

There are typically three separate networks: public, server, and private. The private network is used for the cluster communications, often referred to as the heartbeat link. The server network is used in client/server configurations when a dedicated network between the database and application servers is needed. The public network is for the end users' or client PCs.

Dual, redundant network links are recommended for the public and server networks. For the private cluster LAN, the heartbeat communication can be sent over the public and server networks as well, so only one private network link is needed.

Supported Clustering Software

On the Microsoft Windows NT/2000 platform, the Microsoft Cluster Server (MSCS) software product is supported by SAP for the core R/3 product, along with the mySAP.com Workplace system.

On the Unix platforms, the hardware vendors must develop, test, and support their own failover solutions with SAP. For example, on the HP-UX platform, MC/ServiceGuard is supported with extensions for SAP, with Linux support announced. For the IBM RS/6000 Unix systems, the failover solution is called High-Availability Cluster Multiprocessing, or HACMP. Sun deploys a third-party solution with SAP integration.

Additional Reading

For basic clustering concepts, most system vendors publish white papers and other documents describing their clustering solutions. Important additional reading is SAP's “BC SAP High Availability” document, which describes a general approach to the problem of making a system highly available. Additional HA-related documents from SAP can be found on the SAP Service Marketplace web site under the subject of “systemmanagement.”

Failover Software Groups or Packages

The integration of the SAP software into the clustering environment is a critical part of the failover solution. The SAP software resources managed as groups or packages in the cluster can be configured in different ways. The two most fundamental packages are those for the database (DB) and the SAP Central Instance (CI). Although the DB and CI groups can be managed independently, they can also be combined in one group or package (DB/CI).

The various mySAP.com Business Applications (SAP BW, SAP APO, SAP CRM, mySAP.com Workplace, etc.) each have a DB and a CI software SPOF. So, each would have a unique DB and CI cluster group or package to be managed.

In addition, a separate SAP application instance can be managed as one resource group or package (APP). This is different from the Central Instance because it only contains dialog, batch, and update processes, which are not SPOFs as long as there are other application server instances. An APP group or package is often used in an SAP cluster to more optimally make use of the available server hardware resources, but it is optional.

Sometimes the Test/QA and development systems are part of the HA cluster configuration in addition to the production SAP system. In larger implementation projects, it makes a lot of sense to have a separate HA cluster configuration for the development system because of the high cost of downtime with contracted application consultants. The Test/QA system is usually included in a cluster to make use of otherwise idle or extra hardware resources and provides a test environment for the cluster. This is optional, however, and the Test/QA instance is typically shut down to free up resources whenever a production system fails over to its server node.

Shown in Figure 5-7 are combinations of groups or packages supported in the SAP clustering environment. Cluster scenario (1a) is a two-package concept, where the DB and CI normally run on their own server nodes. During failure of any one of the server nodes or resource groups, the remaining server would then run both the DB and CI groups. In this case, the failover path is in either direction. Cluster scenario (1b) is a variation where the DB and CI resources are placed in one group or package and the failover path is in one direction only (one-package concept). The target host would be a standby system, either idle or running software that can be shut down immediately upon a failover of the DB/CI group to free up computing resources.

Figure 5-7. Supported Failover Packages or Groups


Cluster scenarios (1a) and (1b) are supported by the standard SAP integration with Microsoft Cluster Server version 1.0. MSCS in Windows 2000 Data Center supports up to four server nodes in a cluster; however, the SAP integration is still limited to a two-package (DB and CI) concept. Although needed for system consolidation, support for additional APP, DB, or CI groups or packages is not available within one SAP MSCS configuration at this time due to file share limitations in Windows NT/2000.

Advanced clustering solutions, such as those commonly found on Unix systems, support more failover scenarios or package configurations, such as those shown in the bottom of Figure 5-7. This is possible because the software control scripts for the cluster management are flexible enough to handle more types of packages or resource groups.

Cluster scenario (2a) is a multi-node server configuration, some of whose nodes are the critical ones to protect (DB and CI), along with a designated adoptive node in case the others fail. To preserve the overall system performance, the adoptive server node runs software that can be shut down, such as an application instance or a Test/QA (DB/CI) system, in case of a failover to it. Although less common, it can also be configured to remain idle and wait for a failure event. Each DB or CI could be a group or package for a separate production SAP R/3, SAP BW, mySAP.com WPS, or other system. The number of server nodes and individual packages supported in a multi-node cluster environment depends on the clustering software and hardware used. It is recommended, however, to use no more than eight server nodes in one SAP HA cluster to keep things manageable.

Cluster scenario (2b) is a variation of scenario (2a), except that multiple combined DB/CI packages are configured to fail over to one or more designated or adoptive nodes. Again, the number of DB/CI packages and server nodes supported is a limit of the clustering software and underlying hardware. For example, DB/CI package #1 could be for production R/3 system 1, DB/CI package #2 for R/3 system #2, DB/CI #3 for BW system #1, and so on.
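To picture scenarios (2a) and (2b), each package can be thought of as having an ordered list of candidate nodes: its normal node first, then its adoptive node(s). The sketch below uses invented node and package names and is only a conceptual model of such a failover-path table.

  # Illustrative failover paths: normal node first, adoptive node(s) after.
  failover_paths = {
      "R3_DB_CI_1": ["node1", "adoptive_node"],   # production R/3 system 1
      "R3_DB_CI_2": ["node2", "adoptive_node"],   # production R/3 system 2
      "BW_DB_CI_1": ["node3", "adoptive_node"],   # BW system 1
  }

  def next_node(package, failed_node):
      """Return the next candidate node for a package once its current node has failed."""
      candidates = failover_paths[package]
      idx = candidates.index(failed_node)
      return candidates[idx + 1] if idx + 1 < len(candidates) else None

  print(next_node("BW_DB_CI_1", "node3"))   # -> 'adoptive_node'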

What Happens with a Failover Event?

There are many types of failures possible in an SAP system with an HA cluster. Some of the failure events can be localized or are relatively minor; others bring the entire system to a halt. This section describes the impact of failures on the SAP system.

Application Servers: SAP Logon Load Balancing

Although the application servers were not shown in Figure 5-6, it is common to have application servers outside of the cluster configuration. Application servers do not necessarily need to be in a cluster, thanks to SAP's logon load balancing design. If one of the SAP application servers fails, the user is presented with the choice to reconnect. In doing so, the subsequent logon attempt will be rerouted to one of the remaining available application servers. Thus, SAP application servers have built-in high-availability features and do not generally need to be part of a cluster configuration. This also means, however, that a minimum of two application servers should always be used. If no loss of performance can be accepted, more application server processing power will be needed.
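Conceptually, logon load balancing can be modeled as picking an available, lightly loaded instance from a logon group, as in the hedged sketch below. The server names and load figures are invented, and the real SAP mechanism (based on the message server's load statistics) is more involved.

  def pick_application_server(logon_group):
      """Return the least-loaded application server that is still responding."""
      candidates = [s for s in logon_group if s["available"]]
      if not candidates:
          raise RuntimeError("no application server available in this logon group")
      return min(candidates, key=lambda s: s["load"])

  logon_group = [
      {"host": "appsrv1", "available": False, "load": 0.0},   # failed instance
      {"host": "appsrv2", "available": True,  "load": 0.62},
      {"host": "appsrv3", "available": True,  "load": 0.35},
  ]
  print(pick_application_server(logon_group)["host"])          # a reconnect lands on appsrv3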

The moment an application server fails, the users logged on to it will lose their sessions and any open transactions. If they don't have a paper copy (or other form of history) of the data to be reentered, such as with telephone sales, then some business data will be lost. Using multiple application servers is one possible way to minimize the amount of data lost in this kind of situation. Even protecting SAP R/3's Central Instance will not help prevent open transactions on failed application servers from being lost.

Loss of Database Host

The Central Instance and the application servers will automatically reconnect in case the database server fails. This is controlled by settings in the application server's profile, along with using the latest SAP kernel patches. The failover software is responsible for ensuring the new database host uses the same IP address as before. Most clustering solutions with SAP integration support this DBRECONNECT feature.

When the database fails, some transactions may not have been committed and need to be rolled back once the database is restarted. This rollback recovery time can take anywhere from minutes to hours, depending on how many uncommitted transactions are in the rollback tables. This is a function of the type of transactions being run and the number of concurrent users. Users cannot log on during this recovery time. Standard SAP transactions, without significant customization, will not run for hours without making several commit statements in between. However, it is possible for a customized transaction to run for hours without making a single SQL commit statement. If the database should fail before the completion of such an activity, all of the work must be rolled back, extending the database recovery time during which no users can log on. However, even on larger systems with hundreds or a thousand users, the database recovery time is normally less than 30 minutes.

Loss of Central Instance and Enqueue

The following things happen when the Central Instance with the Enqueue fails:

  • All open transactions, including batch jobs, are lost. This is because the enqueue table, which is a list of the exclusive write locks on portions of the database given to SAP update processes, is maintained only in memory of the CI cluster node.

  • All SAP users logged on to the CI lose their sessions and contexts, and must reconnect. Users attached to other application servers do not necessarily lose their sessions.

  • The SAP system remains unavailable to ensure that no user or update process can perform a write transaction to the database as long as the enqueue server is down. This is needed to ensure database consistency in the event of an enqueue failure.

  • After a failover, and after a successful restart of the enqueue server process, the SAP system provides a window of time to reset open transactions on the application servers. Normally all application servers reattach within this time, thus requiring no operator intervention or restarting of application servers.

In order to guarantee database consistency after a failover, all open transactions in the system must be aborted and rolled back before the enqueue server is restarted. Thus, all application servers with open transactions must be restarted or reset somehow, which the SAP system does automatically using a feature called TRANSACTION_RESET. When the CI returns after a failover, it resets the open transactions within all application servers to prevent any inadvertent writing into the database for locks that were granted before the CI failure. Because the application servers stay running and their buffers remain loaded, the performance of the system quickly returns to the same levels as before the failure.

Before the DBRECONNECT and TRANSACTION_RESET features were available, the SAP cluster integration control scripts were designed to restart all of the application servers to guarantee data integrity of the system and to reconnect to it.

Because the impact of the enqueue server and table loss is so significant, alternatives are available for supporting the mirroring of the enqueue service and table, which allows the SAP operations to continue after a failover without losing open transactions with locks for DB updates. More detail is provided at the end of this chapter.

Performance and Sizing Impact

The sizing and performance impact of a failover depends on which groups or packages are configured to remain running on the adoptive server node and the expected performance level during a failure situation.

In the standard SAP on MSCS configuration [scenario (1a) in Figure 5-7], for example, the CI fails over to the DB server node (or vice versa), with both running on the same server. Both consume memory and CPU resources, which need to be accounted for. In most cases, degraded performance is accepted in a failover situation. If performance degradation is not acceptable, each server node must then be oversized, with both CPU and memory, to handle the failover situation. During normal operations, each server node would essentially only run at 50% of its capacity, or less.

For Unix clustering solutions with SAP integration, the failover scenario can be configured so that the surviving or adoptive server node shuts down any existing applications before starting up the failed over groups or packages. In this way, the performance of the DB, CI, or DB/CI package will eventually be the same as before the failure event given the same amount of hardware resources.

TIP

Sizing Approach for SAP HA Clusters

When sizing a two-tier central system for an HA cluster, simply add another server of the same type, essentially doubling the CPU and memory capacity needed. This ensures there's no loss in performance during a failover event and that there's plenty of performance during normal operations.

For a three-tier client/server sizing, simply add another server of the same CPU and memory capacity needed for the database server. This helps ensure no loss of SAP performance potential during a failover event. However, if a decrease in performance is acceptable after a failover, one of the application servers can be used as the adoptive cluster node for a failed over DB or CI package.
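The sizing rules in this tip reduce to simple arithmetic, sketched below with illustrative capacity units; this is not an official SAP sizing method.

  def two_tier_cluster_sizing(required_capacity):
      """Two-tier central system: two identical nodes, each sized for the full load,
      so a failover causes no loss of performance."""
      return {"nodes": 2, "capacity_per_node": required_capacity}

  def three_tier_cluster_sizing(db_server_capacity, degraded_ok=False):
      """Three-tier system: add one node sized like the DB server, or, if degraded
      performance after a failover is acceptable, reuse an application server as the
      adoptive node instead of adding hardware."""
      if degraded_ok:
          return {"extra_nodes": 0, "note": "DB or CI package fails over onto an application server"}
      return {"extra_nodes": 1, "capacity_of_extra_node": db_server_capacity}

  print(two_tier_cluster_sizing(1000))      # e.g., 1000 capacity units needed in total
  print(three_tier_cluster_sizing(800))     # DB server sized at 800 units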


Hardware Notes for Clustering Configurations

For a cluster configuration to function as expected, the underlying hardware must be stable and reliable in complex environments. Hardware vendors spend significant efforts on testing and validating their disk, network, and server configurations used in HA clusters.

Disk Systems

The shared disk system is likely the most important investment to be made in a cluster. Not only does it store the valuable business data, but it is also a SPOF for the entire cluster because it is a shared resource. In a two-node cluster, if the shared disk system fails, the entire cluster fails.

Given the disk system SPOF, it is recommended to use only those with all internal components redundant, including the I/O paths and disk array controllers connecting to the server nodes. There should always be redundant I/O paths (SCSI or FC connections) from each server to the shared disk system(s), and redundant storage controllers with mirrored cache in each to maintain the high availability of the cluster and data integrity.

Multi-Initiator SCSI

The typical SCSI-based disk subsystem has one initiator (one SCSI controller or host bus adapter) and one or more targets (the disks). Shared disk systems for clustering are designed to support a multi-host (multi-initiator) environment. The shared disks must be able to support an environment where two or more server hosts have both read and write access to them, although the clustering software ensures this happens only one at a time. This requires more testing and verification than normal. Thus, it is recommended to use only disk storage systems supported or certified for clustering environments.

Network

The cluster communication between the nodes (heartbeat) is critical to the functioning of the entire cluster. Thus, the private LAN for cluster communications usually needs to use a specific type of network card for testing and support reasons. Hardware vendors have their own list of tested and approved cards for this purpose.

To stretch a cluster so that the server nodes are hundreds of meters apart, a networking solution based on fiber optic cables is needed. The IP addresses used in the cluster must be in the same local IP subnet. This requires either using FDDI cards for the local subnet, or using standard Ethernet cards along with FDDI concentrators to connect the two sites. To go even further apart requires creative solutions, such as IP tunneling.

Redundant network cards (and hubs or switches) should be used in each server cluster node for the public network segments. Auto port aggregation or standard adapter fault-tolerance can be used. Consider also a high-availability LAN design using switch meshing (described in Chapter 10). Load balancing for performance is not required for most SAP systems, because the application load is typically larger than the networking load. In addition, load balancing may steal CPU resources needed for the SAP application.

Power

Although there are many environmental conditions to consider that may impact the operation of an HA cluster, the power source is the most important. Two power circuits should always be used in a cluster environment to remove electrical power as a SPOF. In addition, consider using uninterruptible power supplies (UPSs) with pass-through functionality within the data center environment or locally for each cluster component (servers, disks, and networks). The pass-through feature on UPS devices is helpful when the UPS battery fails but the line power is still available.

If the server has only one power input receptacle, then a power failure forces a cluster group or package failover. If the shared storage system has only one power input receptacle (power plug or cable), then this is a SPOF in a two-node cluster. Shared disk storage systems, therefore, must have redundant power input.

Microsoft Cluster Server HA Configurations

Microsoft Cluster Server (MSCS) is a clustering or failover solution available with Windows NT 4.0 Enterprise Edition, Windows 2000 Advanced Server, or Windows 2000 Data Center. MSCS version 1.0 supported a two-node server cluster configuration.

SAP has two specific requirements before any system will be supported by SAP on Microsoft clustering:

  1. The server and I/O controller must be SAP-certified with the normal Windows NT/2000 platform certification process (nonclustering).

  2. The same server and the same I/O controller must be certified by Microsoft's cluster certification program (on the HCL list).

Support for the software integration is provided by SAP, but the hardware vendors are still responsible for verifying that the basic clustering configuration is certified and supported. The Microsoft Hardware Compatibility List (HCL) for clusters lists the various server and storage systems tested and supported with MSCS.

SAP Integration with MSCS

SAP's development team wrote the installation scripts and the necessary resource APIs for managing an SAP R/3 central instance in MSCS. By offering only one standard cluster scenario, the two-package DB and CI resource groups, SAP has kept the support simple. Less flexibility also means fewer potential configuration errors, as is sometimes the case with the Unix-based clusters and SAP. Refer back to Figure 5-6, which shows a typical configuration supported with SAP and Microsoft Cluster Server. Currently, only SAP R/3, mySAP.com Workplace, and SAP APO (including liveCache) have official installation support for MSCS, but other components are planned.

In order to use the MS Cluster Server product, Windows NT Server 4.0 Enterprise Edition or Windows 2000 Advanced Server or above must first be used. One cannot simply add the Cluster Server software component to an existing standard edition server.

It is also required for the cluster server to be in an NT 4.0 domain or, with Windows 2000, in Active Directory. A workgroup configuration will not allow MS Cluster Server to install. SAP highly recommends having a separate SAP NT domain for administrators anyway, so it can be leveraged. The cluster nodes should not be configured as Primary Domain Controllers (PDCs) or as Backup Domain Controllers (BDCs).

It is highly recommended, if not required, to have a trained MS Cluster Server consultant perform the installation. Ideally, this would be a trained SAP R/3 Windows NT/2000 basis consultant with MS Cluster Server experience.

The server configurations in the cluster are recommended to be identical for ease of administration. It is not required, however, to have the exact same amount of CPUs or memory in each cluster node.

The Cluster Quorum Disk

Microsoft Cluster Server version 1.0, in both Windows NT and Windows 2000, supported a cluster quorum database only on one physical disk, making it a cluster SPOF. A distributed cluster quorum database across multiple servers, such as that available in Unix clustering configurations, was not supported initially.

Recovering the MSCS SAP System

A disaster recovery of a single MSCS v1.0 configuration is a nontrivial act because of MSCS's dependencies on disk signatures. Disk signatures on the physical disks must match those in the Windows NT/2000 registry in order to function in the cluster. After a disaster, new disks will be used, and their new disk signatures won't match those from the restored Windows NT/2000 registry. There are tools and methods to recover this without data loss; however, it is a nonstandard process.

If SAP on MSCS is used in the production environment, the disaster recovery aspect must be well understood and prepared for. Make sure the consultant supporting the installation is familiar with the recovery techniques required. Microsoft hopes to improve the cluster backup and recovery process with a newer version of the clustering software.

Unix Clustering HA Configurations

Unix clustering solutions with SAP integration have a few more years of history and thus are more flexible and more capable of supporting advanced configurations.

One of the primary advantages of Unix clustering is the support of more server nodes and more software groups or packages configured within one HA cluster. For example, HP's MC/ServiceGuard supports HA clusters with up to 16 server nodes. A typical 16-node configuration would have a subset of disks connected to a subset of nodes and definitely requires Fibre Channel disk connectivity.

Although it may seem difficult to manage up to 16 cluster nodes in an SAP configuration, the ability to use multiple adoptive cluster nodes is valuable. Because application servers can be configured in the SAP cluster, they can act as backup adoptive nodes, thus ensuring a higher level of availability.

Redundant Array of Consolidated Servers (RACS)

It is possible to configure several consolidated SAP systems in a multi-node cluster. Multiple production server nodes are designed to fail over to a nonproduction adoptive node, as shown in Figure 5-8. The RACS concept can add flexibility in cluster configuration maintenance (and thus increase uptime) by supporting rolling OS and clustering software upgrades. Because the RACS concept uses an adoptive cluster node whose resources can be made available to the other production nodes, there is no loss in performance compared with a standard cluster where all nodes are active.

Figure 5-8. Redundant Array of Consolidated Systems—RACS


This RACS concept may be interesting to outsourcing service providers or enterprise customers who need to save space in their data centers. It is designed for organizations deploying multiple smaller or mid-sized SAP systems, such as multinational organizations that run independent SAP systems for different divisions or country business units in one consolidated enterprise data center. It is also a viable concept for supporting many smaller mySAP.com business application components.

The SAP systems that can be consolidated are any of those with clustering support, including the DB and CI components of all SAP R/3 systems, SAP BW, SAP APO (including liveCache), SAP CRM Server, mySAP.com Workplace, and others.

The limits of such an approach are the cluster solution used, as well as the shared disk system and its connectivity. At the time this was written, only Unix clustering solutions with SAP integration and enterprise Fibre Channel-based storage systems could handle such a scenario.

Clusters-in-a-Box: HW Partitioning

High-end servers can be partitioned into multiple smaller nodes consisting of 4, 8, 16, or more processors, and multiple partitions could participate in an HA cluster configuration. The server boxes themselves typically have some SPOF, even if minor. Thus, using only cluster nodes from one physical server presents some risk, which needs to be evaluated. For mission-critical SAP systems, it is recommended to configure the cluster with nodes from at least two independent servers.

Campus Clusters

A cost-effective way to stretch a cluster across a campus or larger site, up to 10km, is to use software RAID 1 for the shared disks. Figure 5-9 shows an example campus cluster configuration with the DB and CI package or resource groups, which is in production at several SAP customers.

Figure 5-9. Campus Cluster


The shared storage system is mirrored in software from within each cluster node's OS. Thus, the cluster quorum or lock disk is also mirrored. Dual, redundant Fibre Channel paths are used between the servers and the storage, and FDDI is used for the cluster IP networks so they can remain in the same IP subnet at the 10km distance.

Because this configuration depends on software RAID 1, a reliable file system is required. Presently, such a configuration is only available on Unix clusters. Windows NT or 2000 clusters with MSCS v1.0 cannot support software mirroring with the given file system. Microsoft has announced support for a Veritas file system with clustering, which would allow this campus cluster configuration.

TIP

Fibre Channel and Fiber Optic Cables for Campus Clusters

When using an FC arbitrated loop, there may be significant performance (throughput) decay over longer distances, as described in Chapter 4. Either keep the distance to a minimum or consider using FC switches. In addition, single-mode fiber optic cables are needed between the data centers. When installing the fiber optic cable, consider reserving some for future use in case more cluster configurations are needed.


This solution is cost effective because the shared disk systems can be standard mid-range systems, not enterprise storage systems, and only two server nodes are required. Because software RAID 1 is used over a large distance, it is important to use a fast storage system with a large cache along with an optimum database layout to avoid unnecessary I/O delays, helping keep the database response times low. The only drawback of this solution is that it cannot protect against the split-brain syndrome (explained later in this chapter).

Metro Clusters

The big brother configuration to the smaller campus cluster is the Metro Cluster, designed to span citywide or metropolitan distances (less than 60 km). This overcomes the issues with software RAID 1 by using enterprise storage systems and their remote copy functionality (essentially hardware-level RAID 1 over larger distances). A Metro Cluster is designed for an automatic failover in a disaster recovery environment (see Figure 5-10). It is a proven solution that offers one of the highest levels of availability a hardware-based clustering configuration can offer and is in production by many SAP customers.

Figure 5-10. Metro Cluster


An example solution would use the MC/ServiceGuard clustering software on HP-UX along with HP's SureStore E Disk Array XP series and its Continuous Access remote copy functionality using ESCON fiber optic links. EMC's Symmetrix enterprise storage systems are also supported in this configuration, along with the Symmetrix Remote Data Facility (SRDF) remote copy functionality. The important function of this solution is to automatically switch the remote DR storage system into read/write mode (write-protect turned OFF) so it can properly fail over the database in case of a primary site failure.

There are at least six server nodes configured in the Metro Cluster, although more are allowed. Two are in the primary data center, two in the disaster recovery data center, and two additional servers are needed in a third location to act as cluster arbitrators. The arbitrator servers are required because there is no centralized cluster lock disk or quorum disk when using a split-site cluster configuration. The arbitrator servers can be used for other tasks, because the arbitration load on the system is relatively low. The network used for the cluster contains a local TCP/IP subnet and thus is limited to metropolitan distances. The network to the third data center should use a different network provider than the link between the first two data centers.

Another reason for only supporting metropolitan distances is the need to synchronize the disk write-I/O commands. Only when both storage systems have written the I/O into their cache and acknowledged it is the I/O cycle complete. This would have a significant impact on performance if the storage systems were too far apart.
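A back-of-the-envelope calculation shows the distance sensitivity. The sketch below counts only light propagation in fiber (roughly 5 microseconds per kilometer each way) and ignores switch, protocol, and cache overheads, so real penalties are higher.

  US_PER_KM_ONE_WAY = 5.0   # approximate propagation delay in optical fiber, microseconds per km

  def sync_write_penalty_ms(distance_km):
      """Minimum added latency per synchronous remote-copy write from propagation delay alone."""
      return 2 * distance_km * US_PER_KM_ONE_WAY / 1000.0   # out and back, in milliseconds

  for km in (10, 60, 500):
      print(f"{km:4d} km adds at least {sync_write_penalty_ms(km):.2f} ms per synchronous write")
  # roughly 0.10 ms at 10 km, 0.60 ms at 60 km, 5.00 ms at 500 km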

Although the Metro Cluster configuration looks complicated at first, it is a proven solution with real benefits in disaster recovery situations. It can also be combined with the RACS concept, described previously in this chapter, for high-availability system consolidation.


Split Brain Syndrome

With geographically split data centers, the communication links between the cluster nodes may go down, yet the cluster nodes may remain functioning. In this case, each cluster node thinks the other is down because the cluster heartbeat isn't able to make contact with the remote server node(s). Thus, each attempts to take over the shared disk resource(s), which may result in a data integrity problem more difficult to recover from.

This situation is referred to as the split-brain syndrome and is an additional reason for requiring arbitration server nodes in a third data center to ensure membership consistency of the cluster. If there were only two data centers with voting rights for the cluster quorum, then each would have only a 50% vote, which is not a majority. Without a majority vote of which cluster nodes are up or down, automatic failover cannot function properly in some situations.
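A simple majority-vote model shows why the third location matters. In the sketch below each site contributes one voter, and a partition may keep or take over shared resources only if it can still reach a strict majority of all voters; the member names are invented.

  def has_quorum(reachable_members, all_members):
      """A partition may continue running the cluster only with a strict majority of voters."""
      return len(reachable_members) > len(all_members) / 2

  # Two sites without an arbitrator: a broken link leaves 1 vote against 1 vote, no majority,
  # so neither side may safely take over and automatic failover cannot proceed.
  print(has_quorum(["site_A"], ["site_A", "site_B"]))                       # False

  # With an arbitrator in a third data center, the side still reaching it holds 2 of 3 votes.
  voters = ["site_A", "site_B", "arbitrator_site_C"]
  print(has_quorum(["site_A", "arbitrator_site_C"], voters))                # True  -> may fail over
  print(has_quorum(["site_B"], voters))                                     # False -> must stay down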

The arbitration server(s) can be Unix workstations or servers running other applications. They simply must run a small background task that acts as an arbitrator or witness whenever cluster communication or failure events occur.

DR Clusters with Microsoft Cluster Server

The Metro Cluster configuration is not possible with the first release of Microsoft Cluster Server. However, a disaster recovery configuration can be made with the use of the enterprise storage remote copy functionality. The configuration, as shown in Figure 5-11, can use two cluster server nodes that are configured to use the primary storage system. The secondary storage system maintains a copy of the database volumes, but in an unshared, read-only mode. During normal operations it is not visible to the server nodes.

Figure 5-11. MSCS Disaster Recovery Configuration with SAP


If the primary storage system fails, the remote or standby storage system's disk volumes can be manually set to read/write or primary mode by an IT administrator. In case of a disaster of the primary data center, a manual failover is appropriate. In case of a loss of cluster communications (the split-brain syndrome), a manual failover is also required. This configuration's failover cannot be as fully automated as the Unix-based Metro Cluster configuration because of MSCS version 1.0's need for one physical quorum disk. Certain disaster situations would require manual intervention. However, some storage system vendors provide scripts or MSCS integration that can be used to automate the storage system failover (or mode switch) when only the primary storage system fails.

Continental Clusters

Clusters spanning distances greater than metropolitan areas can also be supported with SAP, which may be interesting for organizations that need a disaster recovery solution beyond the immediate geographic region. This helps protect against environmental events that affect an entire area, such as hurricanes, earthquakes, and other disasters.

This solution uses ESCON connections over the WAN to support making physical disk copies and thus can support both continental and intercontinental distances. The longer distances require the Continental Cluster solution to employ asynchronous disk I/O, otherwise the DB I/O request time would take too long. The I/O is acknowledged as soon as the local disk storage system successfully writes it into its cache, without waiting for the second disk system to acknowledge. This has a higher risk associated with it because the most recent I/O may not be consistent in case of a failure, but distances up to the network limits are supported.

Figure 5-12 shows a Continental Cluster configuration. It supports two clusters, one local to each data center. Each cluster has its own IP subnet and two power circuits. Both clusters are configured from an application integration perspective, but only the primary cluster has access to the production data. A cluster lock or quorum disk is supported, such as with MSCS configurations, but not for larger multi-node clusters.

Figure 5-12. Continental Cluster


The ESCON over WAN links are slower than pure ESCON or FC channels, so full data replication of entire disk volumes is not feasible in this configuration. Backup and tape restores are needed to initially synchronize large data volumes and for recovery. Thus, the failover is typically in one direction and so is designed for fewer, but real disaster scenarios. It is not an automatic recovery. Some manual intervention would be required, as is often acceptable in case of disaster.

Because the primary and remote cluster have their own IP subnets, the failover of the entire SAP environment, including application servers, is nontrivial. The entire SAP configuration would have to be configured to use the DR cluster network names and IP addresses in the event of a failover. It makes sense to have a few SAP application servers preconfigured for the DR environment located in the recovery data center.

Database Failover

Several of the clustering configurations presented so far have been based on making physical copies of the data. Database failover solutions, however, make logical copies of the data, bringing data protection to a different level. Failover to a standby or recovery database server can be effectively used as an alternative to HA clustering in SAP production environments. More commonly, however, it is used in addition to HA clustering.

Remote Standby Database Server

A cost-effective way to decrease the recovery time after a local disaster is to use a remote, standby database server (also called shadow database). This solution is based on sending the log files to a remote server that runs the database in recovery mode (see Figure 5-13). This solution can be used to recover from disasters, to provide recovery from logical errors, for fast restores (with a high-speed network), and for decoupled backups.

Figure 5-13. Remote (Shadow) DB Mirror Server


If needed, such as due to a disaster at the primary data center, the standby database can be recovered up to the last available log file and set online in read/write mode. This helps reduce the time needed to restore (no tape restore needed), and it can be used while the primary data center is being built up again.

This solution requires not only two copies of the database software but also two database servers and storage with identical copies of the data at the outset. This is usually achieved with a full backup and restore. The remote database server can be smaller than the production server, along with a more cost-effective disk layout, as long as lower performance during the recovery time is acceptable. In a disaster situation, the SAP application servers would need to be pointed to the new database host, requiring changes to or a special disaster recovery (DR) set of the SAP startup profiles. This does require manual intervention to set the DR database server out of recovery mode into online read/write mode. However, it does not require investment in clustering technology, so the overall administrative effort may be smaller.

Only the archived or inactive logs are sent to the remote system. In the case of a disaster to the primary database server, the recovery on the remote server can only be up to the last archived logs. However, if the open online logs can be recovered from the primary database server, then they can be applied to the standby database server for a more up-to-date recovery. Another consideration is that a full recovery requires synchronizing the entire database back to the primary database server, either via a fast network or SAN connection or with a tape restore.

One of the popular benefits of this solution is its protection against logical errors. Logical errors can destroy the integrity of the primary database, plus any of its instantly mirrored copies or data volumes. Logical errors can come from both hardware problems and from user or administrator actions and can be just as common as hardware or environment failures. A remote shadow database server can recover the logs in a time-delayed fashion, up to 24 hours later, if needed. Logical errors can thus be prevented if they are quickly uncovered during the time window before the shipped logs are recovered into or applied to the standby database. This solution can be used in addition to a failover or cluster configuration, which is better at protecting against hardware failures.

TIP

Logical Error Recovery

A remote standby database server is an effective way to quickly recover from logical errors. If this solution is not used, a restore from the last valid backup must be made.
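The time-delayed apply can be pictured as a loop that holds each shipped log for a configurable window before recovering it into the standby database. This is a conceptual sketch only; the function names and the 4-hour window are assumptions, not the behavior of any specific database or tool.

  import time

  APPLY_DELAY_SECONDS = 4 * 3600   # hold each shipped log for 4 hours before applying (assumed window)

  def delayed_log_apply(shipped_logs, apply_log, now=time.time):
      """Apply shipped archive logs to the standby database only after a safety delay,
      giving administrators a window to halt the apply if a logical error is discovered.

      shipped_logs: list of (log_name, arrival_timestamp) tuples, oldest first
      apply_log:    callable that recovers one log into the standby database
      """
      for log_name, arrived_at in shipped_logs:
          if now() - arrived_at >= APPLY_DELAY_SECONDS:
              apply_log(log_name)          # roll this log forward into the standby database
          else:
              break                        # younger logs keep waiting and are applied in order later

  # Example usage with a dummy apply function:
  logs = [("arch_0001.log", time.time() - 5 * 3600), ("arch_0002.log", time.time() - 600)]
  delayed_log_apply(logs, apply_log=lambda name: print(f"applying {name}"))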


Running the standby database server in recovery mode requires some utilization of a processor during normal, non-DR times. It's a good idea to reserve one CPU for this activity, while the other CPUs can be used for an SAP application instance, further leveraging the investment in hardware for DR purposes.

This solution is available in a basic form with the latest versions of the database platforms. For Oracle, this is called remote archiving. For Microsoft SQL Server, it is called log shipping. Informix also provides this functionality. One drawback worth mentioning is that structural changes to the database cannot be accounted for, such as adding or dropping tablespaces or data files. These types of changes are not typically recorded in the logs or archive mode, so they are not sent to the standby database. For several databases, a third-party tool is available from www.libelle.de that monitors the system or DB catalog files for structural database changes and provides more control over the time delay of the log recovery process.

Higher Levels of SAP Availability

To achieve higher levels of availability than a hardware clustering solution for SAP provides, additional technology is required to protect the SAP software SPOF—the Enqueue. For Windows NT/2000 systems, fault-tolerant, hardware mirroring solutions are available that can make sense to use in specific configurations. For Unix and Windows NT/2000 systems, a fault-tolerant process mirroring solution is available to protect the SAP SPOF (Enqueue), achieving continuous availability of the SAP application, even during kernel upgrade situations.

When the Enqueue Server and Table are protected from failure, the availability of the SAP system increases by ensuring its continuous operations. The benefits include the following:

  • User productivity is maximized by reducing the SAP system downtime.

  • Users logged on to the Central Instance are not required to reenter lost in-process or open transactions, reducing the risk of data-entry errors.

  • Long running batch jobs with locks on DB updates are not lost.

  • More flexibility to maintain the cluster HA configuration is provided.

  • Rolling SAP kernel upgrades can be made without system downtime.

Hardware Mirroring—Toward Fault-Tolerance

Clustering solutions require software integration with the SAP and database application, making them vulnerable to downtime due to software configuration consistency errors. A fault-tolerant, hardware mirroring solution is available that does not require the use of any software scripts or other integration with SAP, nor does it require any clustering software.

A fault-tolerant solution is a method for achieving very high levels of availability. It is characterized by redundancy in most of the hardware components, including each server's components (CPU, memory, system board, I/O subsystems, etc.) and the data center environment in which the server(s) are located. A fault-tolerant system has the ability to continue service in spite of a hardware or software failure.

HP's NetServer Assured Availability solution is an example of such a solution and is based on Marathon's Endurance hardware mirroring technology using Windows NT Enterprise Edition and Windows 2000. Because it requires no software integration with SAP or the database, only the standard SAP installation kits are required.

This solution, as shown in Figure 5-14, uses four server components—two Intel-based servers consisting of processor and memory (compute elements) and two Intel-based I/O servers (I/O elements). Both compute elements (CEs) process the same work and they appear as a single server. These are the fault-tolerant elements and are synchronized or in lock step with each other. Because of the technical difficulty of keeping multiple processors in sync across servers, only two processors in each CE were supported at the time this was written, although more were planned. If either CE fails, the remaining compute element will complete the work. When the failing element has been repaired, it is transparently reconfigured back into the system without disrupting its operation or clients. To reconfigure a CE node back requires copying the memory contents of the surviving CE node to the new one, which may take a few minutes during which no activity can take place (a frozen state). This, however, can be scheduled during less critical times.

Figure 5-14. Hardware Fault-Tolerance: HP NetServer Assured Availability Solution


Two copies of critical data are stored, one on each I/O processor, ensuring that if one I/O processor fails, a copy of the data can be obtained from the other I/O processor. The I/O processor can be any type of standard Intel-based server tested or supported in the offering. Hardware upgrades can occur without server disruption, and in the case of a disk failure, data can be restored to a replaced disk as a background operation. This solution can also be configured over split locations to provide processing and real-time I/O duplication between rooms and buildings located up to 500 meters apart.

This solution has no single points of failure. In addition, it has the ability to isolate a failure and continue to operate without the failed component causing any loss of transactions or data. Moreover, it uses off-the-shelf Intel-based servers running an unmodified version of the Windows NT/2000 operating system.

There are a few limitations with the initial versions of this solution. Because it is based on standard Windows NT or 2000 and IA-32 processors, there are some memory limits. Second, only specific servers are qualified with this solution, which have their own limitations in the number of processors and memory supported. Specifically, the compute elements initially only supported a small number of processors. The I/O processors are not fault-tolerant, so any applications running only on them will not fail over to the other I/O processor.

Usage with SAP

Although this solution has hardware configuration limitations, there are still some valid SAP with Microsoft Windows NT/2000 scenarios worthwhile to consider.

  • The CE can protect a stand-alone SAP central instance, whether for SAP R/3, mySAP.com Workplace Server, or any other application with an SAP kernel. It should be set up primarily to protect the enqueue and message services. The dialog, batch, or update activity should be kept to a minimum, otherwise the smaller compute elements may become a performance bottleneck. The database would need to be in a separate HA cluster. The I/O processors can also be used as application servers, because they don't need fault-tolerant protection.

  • The fault-tolerant CEs could be used to protect the ITS System's AGate function in the mySAP.com environment, if needed.

  • Once the compute elements support four or six processors and 8GB memory with Windows 2000 and AWE or 64-bit, it may be interesting to protect the SAP APO liveCache (a compute and memory intensive application). This level of hardware support, however, is not available currently.

Running an application that makes a lot of I/O requests, such as an SAP database, on the CEs is technically possible but does not make as much sense. It will be slower because of the extra communication needed between the compute elements and the I/O processors (where the database log and data disks are located).

Process Mirroring

Although a hardware mirroring solution provides fault-tolerant capability to many of the SAP single points of failure, a more flexible fault-tolerant method is to protect the Enqueue Server (and thus the Enqueue Table) with process mirroring technology. This method is more flexible because it does not depend on any specific hardware configurations—any supported servers can be used, regardless of the number of processors in each. In addition, recovery from failure is fully automatic.

The process mirroring solution for the SAP Enqueue Server depends on an application programming interface (API) in the SAP R/3 kernel codeveloped and tested by SAP and Hewlett-Packard. HP's Somersault solution was the first on the market to provide this Enqueue Server protection with the use of process mirroring.

Figure 5-15 shows the important components of the HP Somersault Mirrored Enqueue Server Protection Unit solution for SAP systems. The important new functionality in the SAP kernel is that each work process, from both the Central Instance and the application instances, can access the Enqueue Server via the new Enqueue Gateway process.

Figure 5-15. Enqueue Process Mirroring—HP Somersault


As shown, the Enqueue Server and Table are replicated on two servers that do not participate in a cluster, along with a Witness server that may be part of the cluster. Communication between an Enqueue Gateway and the Enqueue Server protection unit is controlled by the HP Somersault protocol. Communication between the work processes and an Enqueue Gateway uses fast interprocess communication based on domain sockets, as provided by SAP. An Enqueue lock is not successfully granted until both the primary and secondary Enqueue Servers have acknowledged it.
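The grant rule in the last sentence can be sketched as follows: a lock is granted only when every live replica of the Enqueue Table has recorded it, and if one replica fails, the surviving one alone carries the table. This Python model is purely conceptual; it represents neither the HP Somersault protocol nor the SAP enqueue interface.

  class EnqueueReplica:
      """One copy of the enqueue (lock) table."""
      def __init__(self, name):
          self.name = name
          self.locks = {}        # lock key -> owning work process
          self.alive = True

      def acquire(self, key, owner):
          if not self.alive:
              return False
          if key in self.locks and self.locks[key] != owner:
              return False       # already held by another owner
          self.locks[key] = owner
          return True

  def mirrored_enqueue_request(primary, secondary, key, owner):
      """Grant a lock only after every live replica has acknowledged it."""
      replicas = [r for r in (primary, secondary) if r.alive]
      if not replicas:
          raise RuntimeError("enqueue service unavailable")
      if all(r.acquire(key, owner) for r in replicas):
          return True
      # undo a half-granted lock so the replicas stay consistent
      for r in replicas:
          if r.locks.get(key) == owner:
              del r.locks[key]
      return False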

Only the Enqueue Server has been removed from the Central Instance (CI). The CI still exists with the other SAP SPOFs, including the stateless Message Server and the file shares. Thus, it must still be protected with the standard clustering solutions.

Database Protection

The total high-availability solution for SAP requires not only Enqueue protection but also protection against database failure. The traditional method of database protection is to use an HA cluster solution. To put things in perspective, a failover of the database is not instant because it requires some recovery time if any uncommitted transactions are still in the rollback files. The recovery time can be influenced by the programming or customization methodology, as well as by having lots of CPU, memory, and disk I/O resources available to recover. Comparatively, having lost the Enqueue Server and Table will likely require reentering lost transactions and resubmitting lost batch jobs. This may take significantly longer than the database recovery process and is more difficult to gauge or influence.

A potential way to avoid the database recovery time is to consider using a parallel database solution for HA or mirroring purposes. Parallel databases for SAP R/3 have not shown great benefit in the area of performance improvement, but it may be possible to limit their use to high-availability requirements (although not common). For example, Oracle Parallel Failsafe (OPFS) can be used to help reduce failover time. It is based on Oracle Parallel Server and uses two database instances that are up and running at the same time on different machines, being attached to the same data source. In normal operation, only one of the OPFS instances is active, the other idle. Thus, the system does not behave like a parallel database. If one instance fails, the other immediately takes over, and the failover is faster because a cold-start of the Oracle instance is not needed. This solution does not prevent the rollback of open transactions, however.

Example Configuration Landscape

Figure 5-16 shows an example system landscape with two SAP clusters and the two Enqueue protection servers. At the present time, the two servers used for the Enqueue replication or protection must use a nonrelocatable (nonvirtual) IP address and therefore must be located outside of the clusters. They can be configured to run as normal application servers to leverage the investment.

Figure 5-16. Enqueue Mirroring System Landscape with Two Clusters


The witness service simply runs on one of the cluster nodes and is configured to roll over to the adoptive cluster nodes in case its server node fails. Moreover, additional servers can be predesignated to activate an Enqueue Server replica for more protection. Table 5-7 describes what happens in case of failure of the various SAP components.

The Enqueue Server process-mirroring solution (based on HP Somersault) is integrated with SAP's CCMS. The configuration can be monitored, indicating which machines have been configured to run the Enqueue Server, as well as their status.

Table 5-7. Enqueue Process Mirroring Failure Scenarios
Failure of Application Server: Those users on the failed application server lose their sessions and must reconnect to a surviving application server. Their open transactions must be reentered. In this case, there is no global system impact, but some loss of transactions local to the failed application server may occur.
Failure of Database Host: All users and SAP instances remain running, but in-process batch jobs and dialog activity are halted until the database fully recovers. No data is lost.
Failure of Central Instance: The message server, virtual IP address, file shares, and other CI resources fail over to the adoptive cluster node and are restarted. Any users logged on to the CI must reconnect and reenter their last dialog transactions. No batch jobs or dialog transactions started on other application servers are lost, nor must they be restarted, because the Enqueue Server remains running.
Failure of Primary Enqueue Server: The Enqueue Table remains protected by the Secondary Enqueue Server, so the system remains running without any loss of transactions. If predesignated, another server will automatically attempt to bring up a new Enqueue replica.
Failure of Secondary Enqueue Server: The Enqueue Table remains protected by the Primary Enqueue Server, so the system remains running without any loss of transactions. If predesignated, another server will automatically attempt to bring up a new Enqueue replica.

This solution is available for SAP R/3 releases 4.0B and newer. Any other of the mySAP.com components that are based on the SAP R/3 4.6B or newer kernel can take advantage of this process-mirroring solution. This solution is available for HP-UX, Microsoft Windows 2000, and Linux.
