Chapter 8. High Availability

Today's enterprise applications have evolved from smaller proprietary systems into integrated applications that are expected to be always available. Most Microsoft Dynamics AX implementations use a multi-company setup in which business-critical hours differ between companies. During business-critical hours, the software must be capable of handling different kinds of loads. Outside these hours, nightly processes such as inventory replenishment are often running, and these too put a load on the system. Even companies based in a single location, such as hospitals or factories, often cannot afford much downtime because they run 24/7.

All of this results in the need for a system that is available at any time.

In this chapter, we will take a look at precisely that. Starting with a very simple setup, we will modify the architecture of Microsoft Dynamics AX so that it can handle higher loads while avoiding single points of failure.

The first goal of this chapter is for Microsoft Dynamics AX professionals to be able to recognize situations in which a high availability setup is desired. The second goal is for you to be able to configure Microsoft Dynamics AX for high availability. In this chapter, we will cover the following topics:

  • Introducing high availability: We will start by defining what high availability is and how it relates to redundancy and disaster recovery.
  • Application level load balancing: Starting with a very basic architecture, we will gradually add components until all Microsoft Dynamics AX components are in place to achieve high availability of the services.
  • Network load balancing: The final components that we will add to the architecture are network load balancers. These are vital components, especially if you want high availability of services, because they take care of load balancing WCF communication.

Introducing high availability

High availability (HA) means designing your Microsoft Dynamics AX components so that the system is up and running as close to 100 percent of the time as possible, at an acceptable level of performance.

For anyone working with Microsoft Dynamics AX, it is obvious that this is no easy feat. Installing fixes according to best practices requires restarting the Application Object Server (AOS), during which many components are unavailable. Rolling out one fix per month with 5 minutes of downtime, for example, means that the system is available roughly 99.99 percent of the time (5 minutes out of about 43,200 minutes in a 30-day month). This downtime, however, can be planned and should have little impact on the business.
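The downtime figure above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function name availability_percent and the 30-day month are assumptions made for this example and are not part of Microsoft Dynamics AX.

```python
def availability_percent(downtime_minutes: float, period_minutes: float) -> float:
    """Return the percentage of the period during which the system is up."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    if not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime_minutes must lie between 0 and period_minutes")
    return 100.0 * (1.0 - downtime_minutes / period_minutes)


# Assumption for this example: a month has 30 days.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes

# One planned fix per month with 5 minutes of downtime.
monthly = availability_percent(5, MINUTES_PER_MONTH)
print(f"{monthly:.3f}%")  # 99.988%, which rounds to 99.99 percent
```

The same function can be used to see how quickly unplanned downtime erodes this figure: a single one-hour outage in the same month would already bring availability down to about 99.85 percent.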

The system can also become unavailable for reasons that cannot be foreseen, such as the following:

  • Power outages
  • Server crashes
  • Hardware failures
  • Network outages
  • Security breaches

Fortunately, there are ways to deal with these problems, most of which might already be known to you, such as using a Redundant Array of Inexpensive Disks (RAID) to protect the system against disk failures or using an Uninterruptible Power Supply (UPS) to protect the system against power outages. Such automated systems can be complemented by defining procedures that need to be performed manually.

Adding redundancy

A chain is only as strong as its weakest link. In terms of system design, links are the components of your system. When adding redundancy, you strengthen these links by avoiding single points of failure.

Ironically, adding extra components might undermine your efforts to create a high availability environment. These components might just increase the number of points of failure, so consider each component carefully.
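The chain analogy and the caution about extra components can be made concrete with a simplified model. The sketch below assumes that components fail independently of each other, which is rarely fully true in practice, and the availability figures are made-up illustration values rather than measurements. The function names are hypothetical.

```python
from math import prod


def _check(values):
    """Validate that every availability is a fraction between 0 and 1."""
    values = list(values)
    if not values:
        raise ValueError("at least one component is required")
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("availabilities must be between 0 and 1")
    return values


def series_availability(components):
    """All components are required: the chain works only if every link works."""
    return prod(_check(components))


def redundant_availability(replicas):
    """Any one replica is enough: the group fails only if all replicas fail."""
    return 1.0 - prod(1.0 - a for a in _check(replicas))


# Three required components at 99 percent each: the chain is weaker than any link.
chain = series_availability([0.99, 0.99, 0.99])   # about 0.9703

# Two redundant replicas at 99 percent each: the link is strengthened.
pair = redundant_availability([0.99, 0.99])        # about 0.9999

# A single, non-redundant extra component (99.9 percent) placed in front of
# the redundant pair becomes a new single point of failure.
with_extra = series_availability([0.999, pair])    # about 0.9989
```

Under these assumptions, the extra component lowers the overall figure below that of the redundant pair alone, which is why each added component deserves careful consideration.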

Adding redundancy opens the door to load balancing. Adding multiple AOS instances to a cluster is an example of load balancing. This will balance the load over these different instances, adding to the performance of the system. If one of the AOS instances fails, the others will still be available, adding to the reliability of the system.
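The combination of load balancing and reliability described above can be sketched with a toy model. This is not how an AOS cluster actually distributes sessions; it is a plain round-robin illustration, and the class InstancePool, its methods, and the instance names are hypothetical.

```python
from itertools import cycle


class InstancePool:
    """Toy round-robin pool that skips instances marked as failed."""

    def __init__(self, names):
        self._names = list(names)
        if not self._names:
            raise ValueError("at least one instance is required")
        self._healthy = set(self._names)
        self._ring = cycle(self._names)

    def mark_down(self, name):
        """Record that an instance has failed."""
        self._healthy.discard(name)

    def mark_up(self, name):
        """Record that a known instance is available again."""
        if name in self._names:
            self._healthy.add(name)

    def next_instance(self):
        """Return the next healthy instance in round-robin order."""
        if not self._healthy:
            raise RuntimeError("no healthy instances available")
        # One full pass over the ring is guaranteed to reach a healthy instance.
        for _ in range(len(self._names)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


pool = InstancePool(["AOS1", "AOS2", "AOS3"])
before = [pool.next_instance() for _ in range(3)]  # load spread over all three

pool.mark_down("AOS2")                             # one instance fails
after = [pool.next_instance() for _ in range(4)]   # the others keep serving
```

While all instances are healthy, requests rotate across them. Once one is marked as failed, the remaining instances absorb its share, so the service stays available at the cost of a higher load per instance.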

Another example is SQL Server in an active/passive configuration. Only one SQL Server instance is active at a given time, but if the active instance fails, a failover occurs and the passive instance takes over.

Disaster recovery

Disaster recovery (DR) comes into play when HA fails. This could happen because of natural causes such as fire or flood, human errors, or errors that are introduced on purpose. After a disaster occurs, DR strives to restore the system to a previously acceptable state as soon as possible. This could mean doing simple things such as restoring backups and restarting services, but it could also mean moving all operations to a different physical location altogether. In the case of DR, the level of performance is less important, as its first priority is to restore the system to an operational state.
