Identifying the Purpose and Characteristics of Fault Tolerance

3.11. Identifying the Purpose and Characteristics of Fault Tolerance

Fault tolerance can be defined as the ability to lose a network component without losing data or functionality in the network. There are many different types of fault tolerance in a typical network, including fault tolerance for power, communication links, storage devices, and network services. In this section, we will discuss the various forms of fault tolerance for a network.

3.11.1. Critical Information

You should know the purpose and characteristics of fault tolerance in a computer network. More specifically, you should know the purpose and characteristics of fault tolerance in regard to power, communication links, storage devices, and network services.

3.11.1.1. Fault-Tolerant Power

Fault-tolerant power usually comes in the form of an uninterruptible power supply (UPS). A UPS is essentially a battery backup that powers the computer if the power from the power company fails. A UPS is typically used as a temporary solution that allows servers, and sometimes clients, to be shut down in an orderly manner to prevent data loss. It typically contains a battery and a power conditioner. The UPS is plugged into the power outlet, and the computers and other essential hardware are then plugged into the UPS. Less essential devices that draw a great amount of current, such as laser printers, are typically not plugged into the UPS. In very large organizations that must continue to operate, a bank of batteries is used, and a generator is also provided in the circuit to continually charge the batteries so that the computer systems can continue to operate.

3.11.1.2. Fault-Tolerant Communication Links

The main purpose of a network is to provide a way for computers to share resources, such as files, folders, and printers. Once a network is established, people begin to rely on the network being in place and operational. For this reason, many organizations provide multiple communication links between users and essential resources. Some organizations use multiple links of the same bandwidth to provide traffic load balancing as well as fault tolerance. Other organizations simply provide a less expensive backup link that can be used if the primary link fails. For example, an organization might use a T1 communications link as its primary link but have an Integrated Services Digital Network (ISDN) link as a backup. The main purpose of providing fault-tolerant communication links is to ensure that network users can remain productive even if one of the links fails. This is especially important in organizations that absolutely must be able to communicate, such as law enforcement agencies and hospitals.

3.11.1.3. Fault-Tolerant Storage

Fault-tolerant storage on a network is typically accomplished by having multiple physical drives. This is usually done on servers rather than clients. While there are many different forms of fault tolerance in regard to hard drive configurations, the general term used to describe all types of fault tolerance obtained by multiple physical drives is redundant array of inexpensive disks (RAID), also expanded as redundant array of independent disks. Many forms of RAID exist, but we will focus only on the most common forms that you might encounter and those that are likely to be on the test: RAID 0, RAID 1, and RAID 5. In the paragraphs that follow, we will discuss each of these RAID configurations.

3.11.1.3.1. RAID 0

Actually, it's not even appropriate for us to be discussing RAID 0 in the context of fault-tolerant solutions, since it provides no fault tolerance. In fact, the only reason that we're discussing RAID 0 at all is to provide a contrast between it and the other most common forms of RAID, which do provide fault tolerance. RAID 0 works by spreading (striping) data over multiple disks. The only real advantage of this configuration is that it provides increased read and write performance over that of a single disk. This is especially evident when multiple controllers are also used. The disadvantage of RAID 0 is that it not only provides no fault tolerance but actually increases the risk to the data contained on the disks, since a failure in any one of the disks will result in a loss of all of the data on all of the disks. Typically, RAID 0 is used in temporary situations when it is necessary to move large amounts of data as quickly as possible and where other data-recovery solutions, such as backups, can be performed in tandem with moving the data.
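
The striping behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not how a real controller is implemented: the function names `stripe` and `read_back`, the 4-byte chunk size, and the use of `None` to mark a failed disk are all assumptions made for the example.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list:
    """Distribute fixed-size chunks of data round-robin across the disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def read_back(disks: list):
    """Reassemble the data, or return None if any disk has failed.

    A failed disk is represented here by None (an assumption of this sketch).
    """
    if any(disk is None for disk in disks):
        return None  # RAID 0 keeps no redundancy, so everything is lost
    chunks = []
    for row in range(max(len(disk) for disk in disks)):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


disks = stripe(b"ABCDEFGHIJ", num_disks=2)
restored = read_back(disks)

disks[1] = None  # simulate the failure of a single disk
after_failure = read_back(disks)
```

In this model every other chunk lives on the failed disk, so no complete file can be reassembled from the disk that survives.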

3.11.1.3.2. RAID 1

RAID 1 is a fault-tolerant disk configuration that writes the same data to more than one hard disk, typically two. Because the data is written to both hard disks, the other disk will still contain the data even if one disk fails. If both disks use the same controller card, then the configuration is referred to as disk mirroring. If each disk has its own controller card, then the configuration can also be referred to as disk duplexing. The advantages of RAID 1 are that it provides fault tolerance and that it can be used on all partitions, including those that contain the system and boot files. The main disadvantage of RAID 1 is that it is only 50 percent efficient in regard to data storage. For example, if you want to store 40GB worth of data in a mirrored configuration, you will require a total of 80GB worth of free disk space. RAID 1 configurations are therefore mostly used to provide fault tolerance to the partitions that contain the system and/or boot files on a server, and other more efficient methods, such as RAID 5, are used to store large amounts of data.

3.11.1.3.3. RAID 5

RAID 5 is a fault-tolerant disk configuration that provides an efficient method of storing large amounts of data. In a RAID 5 configuration, data is striped across multiple physical disks arranged in an array, while at the same time additional information calculated from that data is distributed across all of the disks in the array. This additional information is called parity data. It is not a second copy of the data; rather, it can be combined with the data on the surviving disks to rebuild the contents of any one disk that fails. The parity data is distributed across the disks in such a way that if any one disk fails, the data and parity on all of the other disks can be used to reconstruct the data lost on the failed disk.
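
To make the idea of parity concrete, here is a minimal sketch in Python. RAID 5 parity is computed with the exclusive-OR (XOR) operation, which the text above does not spell out. The sketch models a single stripe only and ignores how parity rotates from disk to disk; the block contents and the helper name `xor_blocks` are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks, each stored on a different disk (sample values).
data_blocks = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]

# The parity block is stored on a fourth disk.
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held the second data block, then rebuild it
# from the surviving data blocks plus the parity block.
surviving = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(surviving)
```

Because XOR is its own inverse, combining the parity block with the surviving data blocks yields exactly the missing block. With two blocks missing there is not enough information left, which is why a second simultaneous disk failure is fatal to the array.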

The main advantage of RAID 5 is that it provides a much more efficient method of storing large amounts of data than RAID 1. In most systems, you can use as few as 3 physical disks and as many as 32 physical disks. Typically, the number of physical disks used in RAID 5 does not even begin to approach 32. This is because the more disks there are in the configuration, the greater the chance of losing not just one disk but two disks at the same time. If this should happen, then all of the data on all of the disks would be lost. For this reason, most RAID 5 configurations contain between 5 and 10 physical disks.
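
The way this risk grows with the number of disks can be estimated with a rough calculation. The sketch below assumes that each disk fails independently with the same probability during some window of time (for example, the time needed to replace a failed disk). Both that independence assumption and the 2 percent figure are hypothetical and are used only to show the trend.

```python
def chance_of_two_or_more_failures(num_disks: int, p_single: float) -> float:
    """Probability that at least two of num_disks fail in the same window.

    Assumes independent failures with identical probability p_single,
    which is a simplification made for this sketch.
    """
    none_fail = (1 - p_single) ** num_disks
    exactly_one_fails = num_disks * p_single * (1 - p_single) ** (num_disks - 1)
    return 1 - none_fail - exactly_one_fails


# Compare array sizes using a hypothetical 2 percent per-disk failure chance.
for n in (3, 5, 10, 32):
    print(n, "disks:", round(chance_of_two_or_more_failures(n, 0.02), 4))
```

Under these assumptions the chance of a double failure rises steadily with each disk added, which is consistent with the practice of keeping arrays well below the 32-disk limit.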

The efficiency of data storage, however, increases with the number of physical disks. For example, a RAID 5 array that contains four 100GB disks could store 300GB worth of data, because the equivalent of one disk (100GB) will be used for parity data. This is an efficiency ratio of 75 percent. On the other hand, a RAID 5 array with ten 100GB physical disks could store 900GB worth of data, because still only 100GB will be used for parity data. This is an efficiency ratio of 90 percent. Typically, an organization chooses a balance between an acceptable efficiency ratio and an acceptable risk in regard to the number of physical disks in the array.
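
The arithmetic in these examples, along with the RAID 1 example given earlier, can be checked with a short Python sketch. The function names are illustrative, all disks in an array are assumed to be the same size, and RAID 1 is modeled as a two-disk mirror.

```python
def usable_capacity_gb(level: int, num_disks: int, disk_size_gb: int) -> int:
    """Return usable capacity for RAID 0, 1, or 5 with equal-size disks."""
    if level == 0:
        if num_disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return num_disks * disk_size_gb  # no space is given up to redundancy
    if level == 1:
        if num_disks != 2:
            raise ValueError("this sketch models RAID 1 as a two-disk mirror")
        return disk_size_gb  # the second disk holds an identical copy
    if level == 5:
        if num_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (num_disks - 1) * disk_size_gb  # one disk's worth of parity
    raise ValueError("only RAID 0, 1, and 5 are modeled here")


def efficiency(level: int, num_disks: int, disk_size_gb: int) -> float:
    """Usable capacity divided by raw capacity."""
    raw_capacity = num_disks * disk_size_gb
    return usable_capacity_gb(level, num_disks, disk_size_gb) / raw_capacity


four_disk_raid5 = usable_capacity_gb(5, 4, 100)   # 300, as in the text
ten_disk_raid5 = usable_capacity_gb(5, 10, 100)   # 900, as in the text
mirror = usable_capacity_gb(1, 2, 40)             # 40GB usable from 80GB raw
```

Calling `efficiency(5, 4, 100)` and `efficiency(5, 10, 100)` gives the 75 percent and 90 percent ratios quoted above, and `efficiency(1, 2, 40)` gives the 50 percent figure for a mirror.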

3.11.1.4. Fault-Tolerant Network Services

Most networks rely on specific services in order for the network to function properly. Services such as IP address assignment and name resolution are absolutely essential in order for users to remain productive in a network. Because of this fact, most network designers build fault tolerance into their network designs in regard to these services. You can have multiple DHCP servers as long as they don't give out the same IP addresses. For name resolution, you can have multiple DNS servers and multiple WINS servers. You can configure each client with the IP addresses of multiple DNS and WINS servers. In fact, you can even combine the use of network services and allow the DHCP server to configure multiple WINS and DNS addresses on each client.
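
The requirement that multiple DHCP servers must not give out the same IP addresses amounts to checking that their address ranges do not overlap. The sketch below does this with Python's standard `ipaddress` module. The two ranges and the function name `scopes_overlap` are hypothetical examples and are not taken from any particular DHCP product.

```python
from ipaddress import ip_address


def scopes_overlap(scope_a, scope_b) -> bool:
    """Return True if two (first_address, last_address) ranges share any address."""
    a_start, a_end = (ip_address(addr) for addr in scope_a)
    b_start, b_end = (ip_address(addr) for addr in scope_b)
    return a_start <= b_end and b_start <= a_end


# Hypothetical split of one subnet between two DHCP servers.
server_1_scope = ("192.168.1.10", "192.168.1.127")
server_2_scope = ("192.168.1.128", "192.168.1.250")

safe_to_deploy = not scopes_overlap(server_1_scope, server_2_scope)
```

If either range were widened so that the two shared even one address, `scopes_overlap` would return True, and the two servers could end up leasing the same address to different clients.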

In addition, it is possible to provide fault tolerance for an entire server. This is accomplished by grouping it with another computer, or even several other computers, so that each computer has a connection to data that is shared among all of the computers. This type of configuration is referred to as a cluster. The main advantage of clustering is that it provides tremendous fault tolerance and therefore tremendous availability of the clustered resources. The main disadvantage of clustering is that it is very expensive to implement and maintain. Because of this, clustering is primarily used only by very large organizations or by organizations that absolutely must have access to their data, such as law enforcement agencies and hospitals.

3.11.2. Exam Essentials

Know the purpose and characteristics of fault-tolerant power. Fault-tolerant power usually comes in the form of an uninterruptible power supply (UPS). Most UPSs consist of a power conditioner and a battery. The UPS is generally plugged into a power outlet and charges a battery, which then powers a client or server for a small amount of time in the event of a power failure. The main purpose of the UPS is to provide temporary power and allow the computer to be shut down in an orderly manner to prevent the loss of data.

Describe the characteristics of fault-tolerant communication links. Fault-tolerant communication links are often used in organizations to ensure that clients can continue to have access to resources if one link fails. Some organizations also use these multiple links for load balancing of traffic, and other organizations simply provide a standby or backup link. Redundant communication links are most important when the organization absolutely must have a communication link even in the event that their primary link fails.

List the characteristics of fault-tolerant storage. Fault-tolerant storage is usually accomplished using multiple physical disks in RAID arrays. Three main levels of RAID are typically used in today's networks: RAID 0, RAID 1, and RAID 5. RAID 0 actually provides no fault tolerance.

Know the characteristics of RAID 1. RAID 1 refers to disk mirroring or disk duplexing, depending on whether you are using one controller card or two. The main advantages of RAID 1 are that it provides fault tolerance and that it can be used on all partitions, including the partitions that contain the system and boot files. The main disadvantage of RAID 1 is that it is only 50 percent efficient in regard to data storage.

List the characteristics of RAID 5. RAID 5 is a fault-tolerant solution that stripes data onto multiple disks along with parity data that is distributed throughout all of the disks. The parity data is distributed in such a way that the data on any one lost disk can be rebuilt from the data and parity on all of the other disks. A loss of two disks at the same time would result in a loss of all of the data on all of the disks. Most organizations prefer RAID 5 over RAID 1 for storing large amounts of data because it is much more efficient than RAID 1.

Know the characteristics of fault-tolerant network services. It is possible to provide fault tolerance on all of the most essential network services, such as DHCP servers, WINS servers, and DNS servers. Network clients can be configured for multiple network services so that they are provided with fault tolerance in the event that one machine fails. Most network servers can also be clustered so as to provide fault tolerance for the entire machine. The main advantage of clustering is a high degree of reliability and availability, but clustering is often cost prohibitive to an organization.
