Chapter 8

Establish Service Levels with Cluster, Fault Tolerance, and Resource Pools

VCP5 Exam Objectives Covered in This Chapter:

  • Creating and Configuring VMware Clusters
    • Determine appropriate failover methodology and required resources for an HA implementation
    • Describe DRS virtual machine entitlement
    • Create/Delete a DRS/HA Cluster
    • Add/Remove ESXi Hosts from a DRS/HA Cluster
    • Add/Remove virtual machines from a DRS/HA Cluster
    • Enable/Disable Host Monitoring
    • Configure admission control for HA and virtual machines
    • Enable/Configure/Disable virtual machine and application monitoring
    • Configure automation levels for DRS and virtual machines
    • Configure migration thresholds for DRS and virtual machines
    • Create VM-Host and VM-VM affinity rules
    • Configure Enhanced vMotion Compatibility
    • Monitor a DRS/HA Cluster
    • Configure Storage DRS
  • Planning and Implementing VMware Fault Tolerance
    • Determine use case for enabling VMware Fault Tolerance on a virtual machine
    • Identify VMware Fault Tolerance requirements
    • Configure VMware Fault Tolerance networking
    • Enable/Disable VMware Fault Tolerance on a virtual machine
    • Test an FT configuration
  • Creating and Administering Resource Pools
    • Describe the Resource Pool hierarchy
    • Define the Expandable Reservation parameter
    • Create/Remove a Resource Pool
    • Configure Resource Pool attributes
    • Add/Remove virtual machines from a Resource Pool
    • Determine Resource Pool requirements for a given vSphere implementation
    • Evaluate appropriate shares, reservations and limits for a Resource Pool based on virtual machine workloads
    • Clone a vApp

This chapter will cover the objectives of sections 5.1 and 5.2 of the VCP5 exam blueprint. This chapter will focus on clusters, VMware Fault Tolerance (FT), and resource pools.

The first section of this chapter will cover HA implementation resources and failover methodologies and will describe DRS virtual machine entitlement, along with some basic information about DRS and HA. I will cover the steps required to create and delete a DRS/HA cluster, to add and remove ESXi hosts and virtual machines, and to monitor a DRS/HA cluster. I will show how to enable and disable host monitoring, how to configure admission control for HA and VMs, and how to enable virtual machine and application monitoring. I will also cover configuring automation levels and migration thresholds for DRS and virtual machines, creating VM-Host and VM-VM affinity rules, and configuring Enhanced vMotion Compatibility (EVC). This section will conclude with configuring Storage DRS.

The second section of this chapter will cover the VMware Fault Tolerance feature. I will cover determining the use cases for VMware FT and identifying the requirements to implement FT in a vSphere environment. I will also cover how to create networking for the Fault Tolerance logging traffic. I will cover the steps to enable and disable FT and to test it.

The final section of this chapter will focus on resource pools. The resource pool hierarchy will be described, and the Expandable Reservation parameter will be discussed. I will show how to create and remove resource pools, configure their attributes, and add and remove virtual machines to and from a resource pool. I will also cover how to determine the resource pool requirements for a given vSphere implementation. I will evaluate appropriate shares, reservations, and limits for a resource pool, based on VM workloads. This chapter will conclude with reviewing the procedure for cloning a vApp.

Creating and Configuring VMware Clusters

In vSphere, a cluster is a collection of ESXi hosts and the virtual machines associated with them that have shared resources and are managed by vCenter Server. Clusters are used to enable some of the more powerful features in vSphere, such as DRS, HA, FT, and vMotion. The first topic I will cover in this chapter is determining the appropriate failover methodology and required resources for an HA implementation.

Determining the Appropriate Failover Methodology and Required Resources for an HA Implementation

As you will see later in this chapter, creating clusters and configuring clusters in vCenter Server are both relatively simple tasks. Like many aspects of vSphere, creating a cluster involves proper up-front planning. Part of this planning is determining how you want the cluster to function. Will the cluster have both DRS and HA enabled, or perhaps just one and not the other? The answers to these questions will impact the way the cluster is designed. Remember that DRS provides load balancing and HA provides high availability. While these two features complement each other well, they serve different functions and don't always have to be used in unison.

For clusters that will use HA, the required resources are determined in part by how failures in the cluster will be handled. For example, are all virtual machines required to have a certain amount of uptime? If you have a two-node cluster, with each ESXi host running at 80 percent of its memory and processing capacity, then a single host failure will likely not leave enough capacity to meet the virtual machine availability requirements. The failover behavior is handled by admission control policies in the HA cluster and will be discussed later in this chapter.
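To make this more concrete, you can do a quick back-of-the-envelope check of whether the surviving hosts can absorb the load of a failed host. The following Python sketch uses hypothetical numbers that match the two-node example above; substitute your own capacity and utilization figures.

```python
# Rough capacity check for an HA cluster: can the surviving hosts absorb the
# memory load of the failed hosts? Numbers are hypothetical; CPU can be
# checked the same way.
hosts = 2                 # ESXi hosts in the cluster
host_memory_gb = 128      # usable memory per host
utilization = 0.80        # each host is running at 80 percent of capacity
failed_hosts = 1          # failures you want to tolerate

total_demand_gb = hosts * host_memory_gb * utilization
surviving_capacity_gb = (hosts - failed_hosts) * host_memory_gb

print(f"Demand after failure: {total_demand_gb:.0f} GB")
print(f"Surviving capacity:   {surviving_capacity_gb:.0f} GB")
print("Failover possible" if total_demand_gb <= surviving_capacity_gb
      else "Not enough capacity to restart all VMs")
```

With the numbers shown, the demand after a failure (roughly 205 GB) exceeds the 128 GB left on the surviving host, which is exactly the situation the admission control policies discussed later in this chapter are designed to prevent.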

Knowing your environment's specific availability requirements will determine the appropriate failover methodology and help you determine the resources required for an HA implementation. In the next section, I will describe DRS virtual machine entitlement.

Describing DRS Virtual Machine Entitlement

While each ESXi host has its own local scheduler, enabling DRS on a cluster will create a second layer of scheduling architecture. Figure 8.1 shows this architecture.

Figure 8.1 Global and local schedulers


Both of these schedulers compute resource entitlement for virtual machines. This resource entitlement is based on both a static and dynamic entitlement. The static entitlement consists of a virtual machine's shares, reservations, and limits. The dynamic entitlement for the virtual machine consists of metrics such as estimated active memory and CPU demand.

If the DRS cluster is not overcommitted, then the virtual machine entitlement will be the same as the resource allocation for the virtual machine. In periods of contention, DRS will use the virtual machine entitlement to determine how to best distribute resources.
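As a simplified illustration of how static entitlement behaves during contention, the following Python sketch divides a host's CPU among virtual machines in proportion to their shares, after honoring reservations and capping at limits. This is not the actual DRS or ESXi scheduler algorithm, and the VM names and values are made up; it is only meant to show how shares, reservations, and limits interact.

```python
# Simplified illustration of share-based entitlement under contention.
# This is not the real scheduler; values and VM names are hypothetical.
vms = {
    "db01":  {"shares": 2000, "reservation": 1000, "limit": 4000},
    "app01": {"shares": 1000, "reservation": 0,    "limit": None},
    "web01": {"shares": 1000, "reservation": 0,    "limit": None},
}
capacity_mhz = 4000  # CPU available to these VMs

# Reservations are guaranteed first.
entitlement = {name: vm["reservation"] for name, vm in vms.items()}
remaining = capacity_mhz - sum(entitlement.values())

# The remainder is divided in proportion to shares, capped by any limit.
total_shares = sum(vm["shares"] for vm in vms.values())
for name, vm in vms.items():
    entitlement[name] += remaining * vm["shares"] / total_shares
    if vm["limit"] is not None:
        entitlement[name] = min(entitlement[name], vm["limit"])

for name, mhz in entitlement.items():
    print(f"{name}: ~{mhz:.0f} MHz entitled")
```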

Now that I have described the virtual machine entitlement, I will cover how to create and delete a cluster.

Creating and Deleting a DRS/HA Cluster

Once the planning and design work is done, creating a cluster is simple. Exercise 8.1 will cover the steps to use the vSphere Client to create a new cluster with HA and DRS enabled.


Exercise 8.1: Creating a New Cluster with HA and DRS Enabled
1. Connect to a vCenter Server with the vSphere Client.
2. Switch to the Hosts and Clusters view. Right-click a datacenter object and choose the New Cluster option from the context menu that appears. The New Cluster Wizard will launch.
3. Provide the cluster with a descriptive and unique name and select both the Turn On vSphere HA and Turn On vSphere DRS options.
4. Click Next.
5. Set Automation Level to Manual and click Next to continue.
6. Leave the Power Management option at its default setting of Off and click Next.
7. Deselect the Enable Host Monitoring option. Change the Admission Control option to Disable. Choosing this option will gray out the Admission Control Policy settings.
8. Click Next.
9. Accept the default settings for the Virtual Machine Options settings and click Next to continue.
10. Accept the default settings for the VM Monitoring settings and click Next to continue.
11. Accept the default settings for VMware EVC and click Next to continue.
12. Accept the default setting for the VM Swapfile Location settings and click Next to continue.
13. Review the information presented on the Ready To Complete screen.
14. Click Finish to create the cluster.
15. A Create Cluster task will begin. When this task completes, verify that the new cluster has been created in the left pane of the Hosts and Clusters view.
16. Right-click the cluster and choose the Edit Settings option that appears in the context menu. The cluster properties window will appear. Look through the available options and then click Cancel when done.
The cluster has been created, and DRS and HA have both been enabled.


This exercise focused on creating a cluster, and I will revisit all of the options presented in the New Cluster Wizard later in this chapter.

The newly created cluster likely has a warning, because of the lack of shared storage. Clusters can exist without shared storage, but nearly all of the functionality they provide will require shared storage. Before proceeding, please add shared storage to any ESXi hosts that will be added to the cluster you just created. If you need assistance, check Chapter 5 where I showed how to configure shared storage. The remainder of the exercises in this chapter will assume that the same shared storage is available for each ESXi host in the cluster created in Exercise 8.1.

Another configuration that should exist in each ESXi host in the cluster is VMkernel networking for vMotion traffic. Having vMotion configured enables DRS to migrate virtual machines to different hosts. Exercises 4.4 and 4.15 in Chapter 4 covered configuring vMotion networking, if you need assistance with this step. The remainder of the exercises in this chapter will also assume that vMotion has been configured for each ESXi host in the cluster created in Exercise 8.1.

Occasionally, you might need to delete a cluster. The steps to delete a cluster are simple and consist of right-clicking a cluster in the left pane of the vSphere Client and choosing the Remove option from the context menu that appears. A Remove Cluster confirmation dialog will appear, and clicking the Yes button will delete the cluster.
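Both operations are also exposed through the vSphere API, which can be handy for lab automation. The following pyVmomi sketch is illustrative only, not a production script; the vCenter hostname, credentials, datacenter lookup, and cluster name are placeholders, and it mirrors the non-default HA settings chosen in Exercise 8.1.

```python
# Minimal pyVmomi sketch: create a DRS/HA cluster, then (optionally) delete it.
# Hostname, credentials, and object names are placeholders for a lab environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host="vcenter.test.local", user="administrator",
                  pwd="password", sslContext=context)
content = si.RetrieveContent()
datacenter = content.rootFolder.childEntity[0]  # assumes a single datacenter

spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(enabled=True,            # vSphere HA on
                                        hostMonitoring='disabled',
                                        admissionControlEnabled=False),
    drsConfig=vim.cluster.DrsConfigInfo(enabled=True,            # vSphere DRS on
                                        defaultVmBehavior='manual'))

cluster = datacenter.hostFolder.CreateClusterEx(name="Lab-Cluster", spec=spec)
print("Created cluster:", cluster.name)

# Deleting a cluster is simply a Destroy_Task on the cluster object:
# cluster.Destroy_Task()

Disconnect(si)
```

Later sketches in this chapter assume that the `si`, `datacenter`, and `cluster` objects have been obtained in this way.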


The steps to create and delete a DRS- and HA-enabled cluster have been covered. However, a cluster with no ESXi hosts is not very functional, so in the next section I will cover how to add ESXi hosts to this cluster.

Adding and Removing ESXi Hosts from a DRS/HA Cluster

Like vCenter Server, a cluster isn't nearly as interesting until ESXi hosts have been added to it. In the previous exercise, I created a new cluster with DRS and HA enabled. In Exercise 8.2, two ESXi hosts will be added to this cluster. This exercise will assume that there is one host already present in the same datacenter as the cluster and that the second host will be added to the cluster as a new host.


Exercise 8.2: Adding and Removing ESXi Hosts to and from a Cluster
1. Connect to a vCenter Server with the vSphere Client.
2. Select Hosts And Clusters.
3. In the left pane, locate an ESXi host that will be added to the cluster. As shown here, esxi1.test.local is used.
4. Click this host and drag it into the cluster. The Add Host Wizard will launch.
5. Accept the default option for the virtual machine resources.

Choosing the default option here will put all of the ESXi host's virtual machines in the cluster's root resource pool and will delete any resource pools currently defined on the ESXi host.

6. Click Next and review the information presented on the Ready To Complete screen.
7. Click Finish to add this ESXi host to the cluster.
8. A Move Host Into Cluster task will begin, as will a Configuring vSphere HA task. When these tasks complete, verify that the ESXi host is now a member of the cluster by expanding the cluster in the left pane.

You have now added an existing ESXi host from your datacenter into a new cluster. The remainder of the exercise will cover the steps to add a host that was not already being managed by a vCenter Server.

9. Right-click the cluster and choose the Add Host option from the context menu that appears. The Add Host Wizard will launch.
10. Enter the FQDN of the ESXi host and provide administrative credentials to log in to this host.
11. Click Next to continue. A Security Alert window will appear. Click the Yes button if you trust the ESXi host in your lab environment.
12. Review the information on the Host Summary screen and click Next.
13. If you have licenses and want to use them, assign them on the Assign License screen. Otherwise, use the Evaluation Mode option. Click Next to continue.
14. Leave the Lockdown Mode option unchecked and click Next.
15. As in step 5, choose the default option for the virtual machine resources. Click Next to continue.
16. Review the information presented on the Ready To Complete screen and click Finish to add this ESXi host to the cluster.
17. An Add Host task will begin, as will a Configuring vSphere HA task. When these tasks complete, verify that the ESXi host is now a member of the cluster by expanding the cluster in the left pane.
The ESXi hosts have now been added to the cluster. The remainder of this exercise will cover the steps to delete an ESXi host from the cluster. Just as I added the first ESXi host by dragging it, an ESXi host can be removed from a cluster by dragging it into a new supported location.
18. Select one of the two ESXi hosts that were just added to the cluster and right-click it. Choose the Enter Maintenance Mode option from the context menu that appears.
19. A Confirm Maintenance Mode window will appear. Review this information and click Yes to proceed.
20. An Enter Maintenance Mode task will begin. When this task completes, the icon for the ESXi host will change to indicate that it is in maintenance mode.
21. Click this ESXi host and drag it into the datacenter object.
22. A Move Entities task will begin. When this task completes, verify that the ESXi host is now listed in the datacenter.

Moving forward, you will need to have at least two ESXi hosts in the cluster, so move the ESXi host back into the cluster now. Once it is back in the cluster, exit maintenance mode on the ESXi host.
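If you prefer to script this kind of host management, the same add and move operations are available through the vSphere API. The pyVmomi sketch below is a rough illustration, not part of the exercise; it assumes the `si`, `datacenter`, and `cluster` objects from the earlier sketch, and the host names and credentials are placeholders.

```python
# Minimal pyVmomi sketch: add a new ESXi host to the cluster and move an
# already-managed host into it. Assumes 'si', 'datacenter', and 'cluster'
# from the earlier sketch; host names and credentials are placeholders.
from pyVmomi import vim

# Add a host that vCenter is not yet managing (Exercise 8.2, steps 9-17).
connect_spec = vim.host.ConnectSpec(hostName="esxi2.test.local",
                                    userName="root",
                                    password="password",
                                    force=True)
cluster.AddHost_Task(spec=connect_spec, asConnected=True)

# Move a host that is already in the datacenter into the cluster -- the API
# equivalent of dragging it in the vSphere Client (steps 3 and 4).
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(datacenter, [vim.HostSystem], True)
existing_host = next(h for h in view.view if h.name == "esxi1.test.local")
view.DestroyView()
cluster.MoveInto_Task(host=[existing_host])
```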

You have now added two ESXi hosts to the cluster. Any virtual machines that were already present on either of these two ESXi hosts are now automatically part of the cluster. In the next section, I will cover adding and removing new virtual machines to the cluster.

Adding and Removing Virtual Machines to/from a DRS/HA Cluster

The process of adding a virtual machine to a cluster is very similar to the process of adding a virtual machine to a host or vApp. In Chapters 6 and 7, I covered adding virtual machines to ESXi hosts and vApps, deploying OVF templates, and moving machines with VMware Converter. Each of these deployment options allows you to choose a cluster, so adding a virtual machine to a cluster should certainly be familiar ground by now.

To create a new VM that will be a member of a cluster, simply right-click the cluster and choose the New Virtual Machine option from the context menu that appears. One difference when using the Create New Virtual Machine Wizard to add a virtual machine to a cluster is that the option to pick a host or cluster is no longer presented in the configuration options. This is shown in Figure 8.2.

Figure 8.2 No host/cluster option for VM


Virtual machines running on ESXi hosts in the same datacenter but not in the cluster can also be added to the cluster. This could be used for a VM that was in a testing or pilot program but is now ready to be moved into production and benefit from DRS and HA. In Exercise 8.3, the steps to move a powered-on VM from another host to a cluster will be covered. This exercise will require an additional ESXi host that is not part of the cluster and, as mentioned earlier in this chapter, assumes that a vMotion network and shared storage exist for all of the ESXi hosts.


Exercise 8.3: Adding a VM from an Existing ESXi Host to a Cluster
1. Connect to a vCenter Server with the vSphere Client.
2. Locate a powered-on virtual machine. Right-click it and choose the Migrate option from the context menu that appears.
3. The Migrate Virtual Machine Wizard will launch. Choose the Change Host option and click Next.
4. Choose the High Priority Option and click Next.
5. Review the information on the Ready To Complete screen and click Finish to move the VM to the cluster.
6. A Migrate Virtual Machine task will begin. When this task completes, verify that the virtual machine is now a member of the cluster.



If a vMotion network were not available between the two ESXi hosts used in the previous exercise, the virtual machine could still be powered off and cold migrated to the cluster.

Removing a virtual machine is accomplished in much the same way that a virtual machine is added to a cluster. vMotion can be used to migrate a powered-on VM from a cluster to another host with access to both the vMotion network and the same shared storage. If the vMotion network and shared storage requirements are not met, then the VM can be cold migrated from the cluster to an ESXi host.
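For reference, the same vMotion of a powered-on VM into (or out of) a cluster can be requested through the API. The pyVmomi sketch below is illustrative only; it assumes the `si` and `cluster` objects from the earlier sketches, and the virtual machine name is a placeholder.

```python
# Minimal pyVmomi sketch: vMotion a powered-on VM into the cluster's root
# resource pool. Assumes 'si' and 'cluster' from the earlier sketches; the
# VM name is a placeholder. Requires shared storage and a vMotion network.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "web01")
view.DestroyView()

# With only a resource pool specified, a DRS-enabled cluster chooses the
# destination host; a specific host could be passed with the 'host' argument.
vm.MigrateVM_Task(pool=cluster.resourcePool,
                  priority=vim.VirtualMachine.MovePriority.highPriority)
```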

You have now created a cluster, added ESXi host(s) to it, and added virtual machines to it. Next I will begin to explore some of the options available in the cluster settings.

Enabling and Disabling Host Monitoring

In Exercise 8.1, you created a DRS/HA cluster, and there were two HA settings that were modified from the default settings. These changes were to disable host monitoring and to disable admission control. Changing these settings initially allowed the cluster to be more flexible for a lab-type environment. Now that the cluster is built and hosts and VMs are running in it, I will configure the HA settings. First I will cover enabling host monitoring.

To begin, right-click the cluster object and choose the Edit Settings option from the context menu that appears. Select vSphere HA in the left pane. Figure 8.3 shows the vSphere HA settings.

The first field is used to enable and disable host monitoring. Enable host monitoring here by selecting the Enable Host Monitoring option. Click OK to save the changes.

Enabling host monitoring will allow ESXi hosts in the cluster to exchange network heartbeats via the HA agents over their management networks. To understand how this works, it is first necessary to discuss how HA works in vSphere 5. HA was completely redesigned in vSphere 5 and now utilizes a master-slave host design. In this design, a single member of the cluster is the master host, while all other hosts are slaves. For network heartbeating, the master host monitors the status of the slave hosts.

If the master host stops receiving network heartbeats from a slave host, it must determine whether the host is failed, isolated, or partitioned. To determine which type of event has occurred, the master host will try to exchange heartbeats with a datastore. This is known as datastore heartbeating and allows the master host to better determine the true state of the slave host(s).

Figure 8.3 vSphere HA options


If the master host can no longer receive heartbeats via the HA agent on the slave host, the slave host is not responding to pings, and the datastore heartbeats are not occurring from the slave host, then the slave host is considered failed. HA will restart the virtual machines that were running on the slave host.

If a slave host is still running but its network heartbeats are not being received via the HA agents, the slave host will attempt to ping the cluster isolation address. If this ping operation fails, the slave host is considered isolated, and the host isolation response determines what action is taken with its virtual machines.

If a slave host is no longer receiving network heartbeats from the master host but is able to communicate with other slave hosts, then the slave host is considered partitioned. An election process will take place among the slave hosts in the partition to determine a new master host. This is considered a degraded protection state but will allow the hosts in the cluster to resume the ability to detect failed hosts or isolated hosts so that the correct HA action can be taken.



vSphere HA will restart VMs and does not provide stateful application-level fault tolerance.

Host monitoring can be disabled during network or ESXi host maintenance, in lab settings, or in other situations where you do not want HA to respond as it normally would, such as when first building out the cluster. Now that I have covered host monitoring, I will cover admission control and configuring the admission control policy.

Configuring Admission Control for HA and Virtual Machines

Admission control is used to guarantee that capacity exists in the cluster to handle host failure situations. Admission control uses the current available resources of the cluster to calculate the required capacity, so this value is dynamic; placing a host in maintenance mode or experiencing a host failure will change the capacity calculations for the cluster. Admission control attempts to ensure that resources will always be available on the remaining hosts in the cluster to power on the virtual machines that were running on a failed or unavailable host. The recommended configuration is to enable admission control, which allows the cluster to reserve the required capacity and helps you avoid saturating the resources of the remaining hosts.

Admission control is further configured by selecting the admission control policy. These policies are used to further define how admission control will ensure capacity for the cluster. The three policies are as follows:

  • Host failures the cluster tolerates
  • Percentage of cluster resources reserved as failover spare capacity
  • Specify failover hosts

Each of these three options will now be discussed in more detail, but also keep in mind that HA is a complex subject and that multiple chapters could easily be devoted to it.

Host Failures the Cluster Tolerates  With this policy, a user-specified number of hosts may fail, and vSphere HA will reserve enough resources to fail over the virtual machines from that number of failed hosts. The calculation is based on a slot size, a logical block of CPU and memory sized to satisfy the requirements of any powered-on virtual machine in the cluster. The slot size is compared to the capacity of the hosts in the cluster to determine how many total slots are available, and vSphere HA then reserves enough resources to satisfy the required number of slots.
Percentage of Cluster Resources Reserved as Failover Spare Capacity  With this policy, a user-specified percentage of the cluster's aggregate CPU and memory resources are reserved for recovery from ESXi host failures. CPU and memory percentages can be configured separately, and the CPU and memory reservation values of the virtual machine are used in the calculation by vSphere HA.
Specify Failover Hosts  With this policy, a user-specified number of hosts are reserved strictly for failover. The failover host(s) cannot have powered-on virtual machines, because the failover host(s) will be used only for an HA event. In an HA event, vSphere HA will attempt to start the virtual machines on the failover host. If the specified failover host is not available or is at capacity, then vSphere HA will attempt to use other hosts in the cluster to start virtual machines.
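For reference, these admission control settings map to the cluster's dasConfig in the vSphere API. The following pyVmomi sketch is illustrative only; it assumes the `cluster` object from the earlier sketches and configures the same percentage-based policy used in Exercise 8.4.

```python
# Minimal pyVmomi sketch: enable admission control with the "percentage of
# cluster resources" policy at 35% CPU and 35% memory, as in Exercise 8.4.
# Assumes 'cluster' from the earlier sketches.
from pyVmomi import vim

policy = vim.cluster.FailoverResourcesAdmissionControlPolicy(
    cpuFailoverResourcesPercent=35,
    memoryFailoverResourcesPercent=35)

spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(admissionControlEnabled=True,
                                        admissionControlPolicy=policy))

# modify=True merges this change into the existing cluster configuration.
cluster.ReconfigureComputeResource_Task(spec, modify=True)

# The other two policies, for reference:
#   vim.cluster.FailoverLevelAdmissionControlPolicy(failoverLevel=1)
#   vim.cluster.FailoverHostAdmissionControlPolicy(failoverHosts=[host])
```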

Now that I have briefly discussed admission control and the admission control policies, I will show how to configure both in Exercise 8.4.


Exercise 8.4: Configuring Admission Control and Admission Control Policies
1. Connect to a vCenter Server with the vSphere Client.
2. Locate a cluster in the inventory. Right-click it and choose the Edit Settings option from the context menu that appears.
3. Select vSphere HA in the left pane in the cluster settings window.
4. Ensure that the Enable Host Monitoring option is selected in the Host Monitoring Status section.
5. In the Admission Control section, ensure that the Enable option is selected.
If admission control was previously disabled, selecting the Enable option will make the admission control policy section active.
6. Select the admission control policy of Percentage Of Cluster Resources Reserved As Failover Spare Capacity.
7. Change the values to 35% for each.
8. Select the Specify Failover Hosts option. Notice how the previous values stay the same but are now grayed out.
9. Click the blue 0 Hosts Specified link. A Specify Failover Hosts window will appear.
10. Move at least one of your two ESXi hosts from the Available Hosts pane on the left into the Failover Hosts pane on the right by using the arrow buttons located between the two panes.

The final configuration should look like this.

11. Click OK in the Specify Failover Hosts window to save these changes.
12. Note that the blue link has now been updated to indicate that one host is specified for failover.
13. Click OK to save the admission control settings. A Reconfigure Cluster task will begin.
14. When this task completes, locate the vSphere HA panel on the Summary tab for the cluster. Verify that a value of 1 is listed for Current Failover Hosts and click the blue information icon to view the hostname.
15. Power on a virtual machine in the cluster. Attempt to migrate it with vMotion to the failover host. You will receive an error message stating that the operation cannot be performed.

One other item that needs to be addressed is the virtual machine options for vSphere HA. These options are also listed in the cluster settings and are contained in the vSphere HA options. Figure 8.4 shows the Virtual Machine Options screen.

Figure 8.4 Virtual Machine Options screen for HA


The Virtual Machine Options section of the cluster settings is used to specify the restart priority and host isolation response for both the cluster and the individual virtual machines. The virtual machine restart priority is used to specify the start order for virtual machines, if an HA event occurs. VMs with the highest restart priority are restarted first. This setting can be used to ensure that important virtual machines get powered on first. It is also useful in cases where cluster resources become exhausted in an HA event, to ensure that the more important VMs are powered on.

If you recall the vApp start order options I discussed in Chapter 6, VM restart priority can be used in a somewhat similar way. In an application with a three-tiered architecture, the database server could have a High restart priority, the application server could have a Medium priority, and the web server frontend could have a Low priority. While there are no ordering guarantees as there are with a vApp, it is still a sound approach. There are four settings for virtual machine restart priority:

  • Disabled
  • Low
  • Medium
  • High

The Disabled option can be used to disable HA for virtual machines. This could be useful for clusters that include nonessential virtual machines.

Host isolation response is used to configure the behavior of the ESXi host when it has lost its management network connection but has not failed. When a host is no longer able to communicate with the HA agents running on other ESXi hosts in the cluster and is also unable to ping its isolation address, it is considered isolated. Once isolated, the host will execute the isolation response. The isolation responses are as follows:

  • Leave powered on
  • Power off
  • Shut down

These options are mostly self-explanatory, but note that the Shut Down isolation response requires that the guest operating system have VMware Tools installed. Now that I have discussed the vSphere HA options for virtual machines, I will show how to configure them in Exercise 8.5.


Exercise 8.5: Configuring VM Options for vSphere HA
1. Connect to a vCenter Server with the vSphere Client.
2. Locate a cluster in the inventory. Right-click it and choose the Edit Settings option from the context menu that appears.
3. Select Virtual Machine Options listed under vSphere HA in the left pane in the cluster settings window.
4. Using the VM Restart Priority drop-down menu, change the option to Low.
5. Using the Host Isolation Response drop-down menu, change the option to Shut Down.
These two options have now changed the default behavior for the cluster as a whole. In the remainder of the exercise, I will show how to modify individual virtual machine settings. The remainder of the exercise will also work off the three-tiered application I discussed earlier. If you don't have a three-tiered application, complete the exercise with existing virtual machines instead.
6. Using the drop-down menu beside the virtual machine that houses the database, change VM Restart Priority to High by clicking the current value. Clicking the current value will present a drop-down menu. Ensure Host Isolation Response is set to the Use Cluster Setting option.

These options will configure HA to shut down the database cleanly and then restart this virtual machine with a high priority. Note that individual virtual machine priority settings will override those of the cluster.

7. Using the drop-down menu beside the virtual machine that houses the middleware, change VM Restart Priority to Medium. Ensure Host Isolation Response is set to the Use Cluster Setting option.

These options will configure HA to shut down the middleware virtual machine and then restart it with a medium priority. Note again that individual virtual machine priority settings will override those of the cluster.

8. Using the drop-down menu beside the virtual machine that houses the web server, change VM Restart Priority to Low. Ensure Host Isolation Response is set to the Use Cluster Setting option.

These options will configure HA to shut down the web server frontend for the application and then restart it with a low priority.

The final configuration should appear similar to what is shown here.
9. Click OK to save these changes.
10. A Reconfigure Cluster task will begin. When this task completes, the virtual machine options for HA will be set.
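The per-VM overrides set in this exercise can also be expressed through the API. The pyVmomi sketch below is a rough illustration; it assumes the `cluster` object from the earlier sketches and a VirtualMachine object `db_vm` for the database server, and the enum strings shown are the API equivalents of the client's drop-down choices.

```python
# Minimal pyVmomi sketch: cluster-wide HA defaults plus a per-VM override,
# roughly mirroring Exercise 8.5. Assumes 'cluster' and a VirtualMachine
# object 'db_vm'; both are placeholders.
from pyVmomi import vim

cluster_defaults = vim.cluster.DasVmSettings(restartPriority='low',
                                             isolationResponse='shutdown')

db_override = vim.cluster.DasVmConfigSpec(
    operation='add',   # use 'edit' if an override already exists for this VM
    info=vim.cluster.DasVmConfigInfo(
        key=db_vm,
        dasSettings=vim.cluster.DasVmSettings(
            restartPriority='high',
            isolationResponse='clusterIsolationResponse')))  # Use Cluster Setting

spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(defaultVmSettings=cluster_defaults),
    dasVmConfigSpec=[db_override])

cluster.ReconfigureComputeResource_Task(spec, modify=True)
```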

I have now covered admission control, covered admission control policies, and configured admission control for a cluster with DRS and HA enabled. I also discussed and configured the virtual machine options within HA. Enabling, configuring, and disabling virtual machine and application monitoring will be covered next.

Enabling, Configuring, and Disabling Virtual Machine and Application Monitoring

VM monitoring is used to provide high availability for individual virtual machines. Where vSphere HA can restart virtual machines when a host fails or becomes isolated, VM monitoring can restart individual virtual machines when they have failed or become unresponsive. Application monitoring works in much the same way, except that a specific application is monitored rather than the virtual machine.

VM monitoring works by monitoring VMware Tools heartbeats and I/O activity from the VMware Tools process running in the guest OS. If VMware Tools heartbeats stop for the duration of the failure interval, the last 120 seconds of disk I/O activity will be checked. If there is no disk I/O in this period, the virtual machine will be reset.

Virtual machine monitoring sensitivity can also be configured for the cluster and for individual VMs. This allows you to fine-tune the monitoring sensitivity both to obtain rapid resolution and to avoid false positives. Table 8.1 shows the VM monitoring sensitivity values for the cluster setting.

Table 8.1 VM monitoring sensitivity settings

Setting Failure interval Reset period
High 30 seconds 1 hour
Medium 60 seconds 24 hours
Low 120 seconds 7 days


You can also use the Custom option for VM Monitoring Sensitivity, if the defaults do not provide the functionality required for your environment.

Virtual machines can be configured individually so that an individual VM can have settings that override those of the cluster. These options are configured in the VM Monitoring section of the cluster settings and will be discussed in more detail in Exercise 8.6.

Application monitoring performs similarly to VM monitoring. It differs in that it uses heartbeats from a specific application and thus requires the application to be customized to utilize VMware application monitoring.
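These monitoring settings also map to the cluster's dasConfig in the API. The pyVmomi sketch below turns on VM and application monitoring with values that correspond to the Low sensitivity row in Table 8.1; it is illustrative only and assumes the `cluster` object from the earlier sketches.

```python
# Minimal pyVmomi sketch: enable VM and application monitoring with Low
# sensitivity (120-second failure interval, 3 resets in a 7-day window).
# Assumes 'cluster' from the earlier sketches.
from pyVmomi import vim

monitoring_defaults = vim.cluster.DasVmSettings(
    vmToolsMonitoringSettings=vim.cluster.VmToolsMonitoringSettings(
        enabled=True,
        vmMonitoring='vmAndAppMonitoring',
        failureInterval=120,             # seconds without heartbeats/I/O
        maxFailures=3,                   # resets allowed per window
        maxFailureWindow=7 * 24 * 3600,  # 7-day reset period, in seconds
        minUpTime=120))

spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(vmMonitoring='vmAndAppMonitoring',
                                        defaultVmSettings=monitoring_defaults))

cluster.ReconfigureComputeResource_Task(spec, modify=True)
```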

In Exercise 8.6, I will show how to enable and configure VM and application monitoring.


Exercise 8.6: Enabling and Configuring VM Monitoring and Application Monitoring
1. Connect to a vCenter Server with the vSphere Client.
2. Locate a cluster in the inventory. Right-click it and choose the Edit Settings option from the context menu that appears.
3. Select VM Monitoring listed under vSphere HA in the left pane in the cluster settings window.
4. Using the drop-down menu beside VM Monitoring, change the value to VM And Application Monitoring.
Note that by default VM Monitoring is set to Disabled. To disable the VM Monitoring option after this exercise is complete, simply change the value here.
5. Using the slider menu in the center of the screen, change Monitoring Sensitivity to Low.

Note that this will enable the VM to be restarted if no heartbeat or I/O is detected within a two-minute interval. The virtual machine can be restarted up to three times within the reset period of seven days. If the VM fails a fourth time within the reset period, vSphere HA will take no further action. The remainder of this exercise will again make use of the three-tiered application example. Replace these virtual machines with those in your own lab as necessary.

6. Change the VM Monitoring value for the database server to Disabled. Note that the Application Monitoring column will gray out once this selection is made.

The database server has now been excluded from VM and application monitoring.

7. Change the Application Monitoring value for the Middleware server to Custom. A Custom VM Monitoring Settings window will appear, as shown here.
8. Change one value here and then click OK to accept the custom values. Leave the Application Monitoring value at Include.
9. Change the Application Monitoring value for the web server to Exclude and note how the value for VM Monitoring changes automatically to Low.
10. The final configuration should look like this.
11. Click OK to save these changes. A Reconfigure Cluster task will begin. When this task completes, the virtual machine and application monitoring options for HA will be set.

I have now covered enabling and configuring virtual machine and application monitoring for vSphere HA. Next I will move on to DRS and configuring automation levels for DRS and virtual machines.

Configuring Automation Levels for DRS and Virtual Machines

Since DRS is responsible for both the initial placement of virtual machines and migrations using vMotion, automation levels can be configured to help control how involved the distributed resource scheduler will actually be. Table 8.2 shows the available automation levels and a description of each.

Table 8.2 DRS automation levels

Automation level Description
Manual No action will be taken, and vCenter Server will inform of suggested virtual machine migrations.
Partially automated vCenter Server will inform of suggested virtual machine migrations and place the virtual machines on ESXi hosts at VM startup.
Fully automated vCenter Server will use vMotion to optimize resource usage in the cluster and place the virtual machines on ESXi hosts at VM startup.
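These automation levels correspond to the defaultVmBehavior values in the cluster's drsConfig. The pyVmomi sketch below, which is illustrative only, sets the cluster to fully automated and disables DRS for a single virtual machine (the "pinning" override used in Exercise 8.7); the `cluster` and `vc_vm` objects are assumed placeholders.

```python
# Minimal pyVmomi sketch: fully automated DRS for the cluster, with DRS
# disabled for one VM. Assumes 'cluster' and a VirtualMachine object 'vc_vm';
# both are placeholders.
from pyVmomi import vim

vm_override = vim.cluster.DrsVmConfigSpec(
    operation='add',   # use 'edit' if an override already exists for this VM
    info=vim.cluster.DrsVmConfigInfo(key=vc_vm, enabled=False))

spec = vim.cluster.ConfigSpecEx(
    drsConfig=vim.cluster.DrsConfigInfo(
        enabled=True,
        defaultVmBehavior='fullyAutomated',
        vmotionRate=3),   # migration threshold; verify how the 1-5 value maps
                          # to the client's slider before relying on it
    drsVmConfigSpec=[vm_override])

cluster.ReconfigureComputeResource_Task(spec, modify=True)
```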

The automation level can be set for the entire DRS cluster, but virtual machines may have their individual automation levels set to override the cluster settings. In Exercise 8.7, I will set the automation level for a cluster and set a virtual machine's individual automation level to differ from the cluster settings.


Exercise 8.7: Configuring Automation Level for Cluster and a VM
1. Connect to a vCenter Server with the vSphere Client.
2. Locate a cluster in the inventory. Right-click it and choose the Edit Settings option from the context menu that appears.
3. Select vSphere DRS in the left pane in the cluster settings window. The automation level will be shown in the right pane.
In Exercise 8.1 when I showed how to create the cluster, the automation level was set to Manual. In the following steps, you will change the automation level to Fully Automated.
4. Select the Fully Automated option and accept the default migration threshold option. Click OK to save these changes.

The migration threshold settings will be covered in detail in the next section of this chapter.

5. A Reconfigure Cluster task will begin. When this task completes, the DRS automation level has been changed for the cluster.

Changing the automation level for the cluster was the first part of this exercise, and the remainder of the exercise will focus on changing the automation level for an individual virtual machine in the cluster.

6. Obtain the cluster settings again and choose the Virtual Machine Options item listed under vSphere DRS in the left pane.
7. Ensure that Enable Individual Virtual Machine Automation Levels is selected. Removing this check mark from the check box will disable individual automation levels and gray out the virtual machine options below.
8. Select a virtual machine from the list and click the value listed beside it in the Automation Level column.
9. In the drop-down menu that appears, change Automation Level to Disabled for this virtual machine.

Disabling the automation level will prevent vCenter Server from generating or applying migration recommendations for this virtual machine. Disabling the automation level is also known as pinning a virtual machine to a host.

The final configuration should look like this.

10. Click OK to save these changes.

A Reconfigure Cluster task will begin. When this task completes, the DRS automation level has been changed for the virtual machine.




Individual automation levels of virtual machines in a DRS cluster can be overridden by features such as vApps and/or FT.

I have now covered configuring the automation level for a DRS cluster and individual virtual machines that are running in the cluster. In the next section, I will configure migration thresholds for DRS and virtual machines.

Configuring Migration Thresholds for DRS and Virtual Machines

In the previous exercise of configuring the cluster automation level, the default migration threshold was accepted. The migration threshold is used to specify which recommendations are generated or applied, depending on the selected cluster automation level. For example, the manual and partially automated automation levels will result only in vMotion recommendations being generated. The migration threshold can be adjusted using the slider provided in the DRS automation-level settings.

Moving the migration threshold slider to the left makes DRS more conservative and minimizes the number of recommendations or operations performed by DRS. Moving the slider to the right makes DRS more aggressive and will result in more recommendations or operations in the cluster. Like many of the options in vSphere, the key is to find the migration threshold setting that works best in your particular environment. In Exercise 8.8, I will cover the steps for configuring the migration threshold for DRS.


Exercise 8.8: Configuring the Migration Threshold for DRS
1. Connect to a vCenter Server with the vSphere Client.
2. Locate a cluster in the inventory. Right-click it and choose the Edit Settings option from the context menu that appears.
3. Select vSphere DRS in the left pane in the cluster settings window. The automation level will be shown in the right pane.
4. In Exercise 8.7 the automation level was set to Fully Automated, and the default setting was accepted for the migration threshold.
5. Slide the Migration Threshold slider to the far left. Review the information displayed below the slider. This information will change each time the slider is moved to explain the effect of the current position.
6. Move the slider one position to the right and review the information displayed below it. Repeat these steps until the slider has reached the far-right side.
7. Move the slider to its default position in the middle and click the Cancel button.

The migration threshold is applied to the cluster as a whole, and there is no option to change the migration threshold for an individual virtual machine. The closest setting that can be used to exclude virtual machines from the migration threshold setting is the individual virtual machine automation level. As I covered in the previous exercise, the individual virtual machine automation level can be either changed or disabled.



Disabling the individual automation level of the virtual machine that runs vCenter Server is a common way to ensure that vCenter Server always stays, or is "pinned," on a single ESXi host.

Now that I have covered setting the migration threshold for a DRS cluster, I will cover how to create VM-Host and VM-VM affinity rules.

Creating VM-Host and VM-VM Affinity Rules

Affinity rules are used in clusters to control the placement of virtual machines. Two types of relationships can be established with affinity rules:

Affinity  Used to keep VMs together
Anti-affinity  Used to keep VMs separated

For example, an affinity rule can be used to ensure that two virtual machines run on the same ESXi host. This is often done for performance reasons, because all of the traffic between the virtual machines is localized to that host. An anti-affinity rule might be used when there are redundant virtual machines established as part of a fault-tolerant design. Keeping these VMs separated could provide protection from unplanned application downtime in the event of an ESXi host failure.

In addition to the two types of relationships established with affinity rules, there are two different types of affinity rules:

VM-Host  Used with a group of VMs and a group of hosts
VM-VM  Used between individual virtual machines

The key thing to remember about the two types of affinity rules is that VM-Host rules apply to groups and utilize the DRS Groups Manager, while VM-VM rules apply to individual virtual machines and do not utilize the DRS Groups Manager.



Affinity rules in DRS are not the same thing as the CPU scheduling affinity that can be specified for a VM in the Virtual Machine Properties editor.

Now that I have discussed what affinity rules are and the relationships that can be established with them, Exercise 8.9 will cover the steps required to create a VM-Host affinity rule. As mentioned previously, VM-Host affinity rules are used to group virtual machines and hosts. Both of these groups must be created as DRS groups before any VM-Host affinity rules can be created.
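For reference, the DRS groups and the rule that ties them together can also be created in a single reconfigure call through the API. The pyVmomi sketch below is illustrative only; the group names are placeholders, and `group_vms` (a list of VirtualMachine objects) and `group_host` (a HostSystem object) are assumed to have been looked up beforehand.

```python
# Minimal pyVmomi sketch: a VM DRS group, a host DRS group, and a mandatory
# VM-Host affinity rule. Assumes 'cluster', a list of VMs 'group_vms', and a
# host 'group_host'; all names are placeholders.
from pyVmomi import vim

spec = vim.cluster.ConfigSpecEx(
    groupSpec=[
        vim.cluster.GroupSpec(operation='add',
                              info=vim.cluster.VmGroup(name='ProdVMs',
                                                       vm=group_vms)),
        vim.cluster.GroupSpec(operation='add',
                              info=vim.cluster.HostGroup(name='ProdHosts',
                                                         host=[group_host]))],
    rulesSpec=[
        vim.cluster.RuleSpec(operation='add',
                             info=vim.cluster.VmHostRuleInfo(
                                 name='ProdVMs-on-ProdHosts',
                                 enabled=True,
                                 mandatory=True,   # "Must Run On Hosts In Group"
                                 vmGroupName='ProdVMs',
                                 affineHostGroupName='ProdHosts'))])

cluster.ReconfigureComputeResource_Task(spec, modify=True)
```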


Exercise 8.9: Creating a VM-Host Affinity Rule
1. Connect to a vCenter Server with the vSphere Client.
2. Locate a cluster in the inventory. Right-click it and choose the Edit Settings option from the context menu that appears.
3. Select DRS Groups Manager from beneath the vSphere DRS option in the left pane in the cluster settings window.
4. In the Virtual Machines DRS Groups section, click the Add button. A DRS Group window will appear.
5. Give the DRS group a descriptive name in the Name field.

The left pane lists the virtual machines that are not in this DRS group. This should be all virtual machines in the cluster, since there are currently no DRS groups created. It is important to remember that a virtual machine may be in more than one DRS group.

6. Move one or more virtual machines from the left pane into the right pane using the arrow buttons located between the panes.

The right pane represents virtual machines that will be contained in the DRS group.

7. Once all of the applicable virtual machines have been moved into the right pane, the result should look like this.
8. Click OK and verify that the virtual machine DRS group is now listed.
9. In the Host DRS Groups section, click the Add button. A DRS Group window will appear. This window is identical to the one used in steps 5 to 8 but will contain ESXi hosts instead of virtual machines.
10. Give the DRS group a descriptive name and add a single ESXi host to the group. The configuration should look like this.
11. Click OK and verify that both a Virtual Machine DRS Group and a Host DRS Group are each listed in their respective sections in the cluster settings window.
12. Click OK to save these groups. A Reconfigure Cluster task will begin.

At this point, the DRS groups required to create a VM-Host affinity rule have been created. The remainder of the exercise will cover creating the actual affinity rule.

13. Return to the cluster properties window and select Rules from beneath the vSphere DRS option in the left pane in the cluster settings window.
14. Click the Add button in the lower-right pane to create a new rule. A Rule window will appear.

Note that the Rule tab is the default tab in the Rule window. Also note that the DRS Groups Manager can be accessed from the DRS Groups Manager tab. This is provided as a convenient option.

15. Provide the rule with a descriptive name.
16. Using the drop-down menu located under the Type field, choose the Virtual Machines To Hosts option.
17. Note that a DRS Groups section appears after making this change.

Three options are available here, and these three components are what actually make up an affinity rule.

18. Using the drop-down menu located under the Cluster VM Group field, verify that the VM group created earlier in this exercise is selected.
19. Using the drop-down menu located under the Cluster Host Group field, verify that the Host group created earlier in this exercise is selected.

Both of these drop-down menus should contain only a single entry, since only a single VM group and host group were created. If multiple groups had been created, the drop-down menus would contain them all.

20. In the unlabeled drop-down menu located between the Cluster VM Group and Host Group menus, choose the Must Run On Hosts In Group option.

The final configuration should look like this.

21. Click OK to add the rule.
22. The rule will now be listed in the cluster settings window. Expand the rule to view its components.
23. Click the Details button at the bottom of the screen and review the information presented in the Details window. Click the Close button to exit the Details window.
24. Click the Edit button at the bottom of the screen and notice that the Rule window will appear. This is the same window that was used to create the rule, but the fields are prepopulated with the name and components of the rule.
25. Click Cancel to close the Rule window.

Note that there is a check box listed to the left of the rule name. Removing the check mark from this check box will disable the rule. This can be useful for troubleshooting purposes and will prevent you from having to delete and re-create individual rules.

26. Click OK in the cluster settings window. A Reconfigure Cluster task will begin. When this task completes, the VM-Host affinity rule will become active.



Virtual machines that are removed from a cluster will lose their DRS group affiliations, and returning the virtual machine to the cluster will not restore them.

In step 20 of the previous exercise, four options are available when creating the VM-Host affinity rule. These options are as follows:

Must Run On Hosts In Group  VMs in the specified VM group are required to run on ESXi hosts in the specified host group.
Should Run On Hosts In Group  VMs in the specified VM group are preferred to run on ESXi hosts in the specified host group.
Must Not Run On Hosts In Group  VMs in the specified VM group are required to never run on ESXi hosts in the specified host group.
Should Not Run On Hosts In Group  VMs in the specified VM group are preferred to not run on ESXi hosts in the specified host group.

There are also a few caveats that need to be mentioned about VM-Host affinity rules:

  • If multiple VM-Host affinity rules exist, they are applied equally.
  • VM-Host affinity rules are not checked for compatibility with each other.
  • DRS and HA will not violate affinity rules, so affinity rules could actually affect cluster functionality.

The best practice is to use VM-Host affinity rules sparingly and to prefer the preferential ("should") options in rules, because they allow more flexibility.

Now that I have covered VM-Host affinity rules, I will move on to VM-VM affinity rules. Where VM-Host affinity rules are used to specify relationships between VM groups and host groups, a VM-VM affinity rule applies only to individual virtual machines. In Exercise 8.10, I will cover the steps for creating a VM-VM affinity rule.


Exercise 8.10: Creating a VM-VM Affinity Rule
1. Connect to a vCenter Server with the vSphere Client.
2. Locate a cluster in the inventory. Right-click it and choose the Edit Settings option from the context menu that appears.
3. Select Rules from beneath the vSphere DRS option in the left pane in the cluster settings window.
4. In the right pane, you will see the rule created in the previous exercise. Remove the check mark from the check box beside this rule to disable it before proceeding.
5. Click the Add button at the bottom of the screen to create a new VM-VM anti-affinity rule.
6. A Rule window will appear. Provide the rule with a descriptive name.
7. Using the drop-down menu located under the Type field, choose the Separate Virtual Machines option.
8. Click the Add button at the bottom of the Rule window to add the virtual machines to this rule.
9. Select two virtual machines that should not run on the same host. The final configuration should look like this.
The example image assumes that two web servers are being used to provide application redundancy, and the VM-VM anti-affinity rule is used to keep them on different hosts.
10. Click OK to add the rule.
11. The rule will now be listed in the Cluster Settings window. Expand the rule to view its components.
12. Select the rule and click the Details button at the bottom of the screen. Review the information presented in the Details window. Click the Close button to exit the Details window.
13. Click the Edit button at the bottom of the screen and notice that the Rule window will appear. This is the same window that was used to create the rule, but the fields are prepopulated with the name and components of the rule.
14. Click Cancel to close the Rule window.
15. Note that VM-VM rules can also be disabled by removing the check mark from the check box beside the rule name.
16. Click OK in the cluster settings window. A Reconfigure Cluster task will begin. When this task completes, the VM-VM anti-affinity rule will become active.
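The same anti-affinity rule can be created through the API. The pyVmomi sketch below is illustrative only; it assumes the `cluster` object from the earlier sketches and two VirtualMachine objects `web1` and `web2` for the redundant web servers.

```python
# Minimal pyVmomi sketch: keep two web server VMs on different hosts with a
# VM-VM anti-affinity rule. Assumes 'cluster' plus VirtualMachine objects
# 'web1' and 'web2'; all names are placeholders.
from pyVmomi import vim

rule = vim.cluster.AntiAffinityRuleSpec(name='Separate-Web-Servers',
                                        enabled=True,
                                        vm=[web1, web2])

spec = vim.cluster.ConfigSpecEx(
    rulesSpec=[vim.cluster.RuleSpec(operation='add', info=rule)])

cluster.ReconfigureComputeResource_Task(spec, modify=True)

# A rule that keeps VMs together is built the same way with
# vim.cluster.AffinityRuleSpec.
```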

Just like the VM-Host affinity rules, there is a caveat for VM-VM affinity rules. If VM-VM affinity rules conflict with each other, the newer of the conflicting rules will be disabled. For example, in the previous exercise, I covered how to create an anti-affinity rule. If an affinity rule were to be added with the same two virtual machines, the result would look similar to what is shown in Figure 8.5.

Figure 8.5 Conflicting VM-VM affinity rules


In Figure 8.5, a new rule was added with the name Conflicting Rule that attempted to keep the two web server virtual machines together. Also note that DRS places higher priority on preventing violations of anti-affinity rules than it does on preventing violations of affinity rules. Now that I have covered VM-VM affinity rules, I will cover how to configure Enhanced vMotion Compatibility for a cluster.

Configuring Enhanced vMotion Compatibility

Enhanced vMotion Compatibility (EVC) can be used in a cluster to allow greater vMotion compatibility among the different ESXi hosts in the cluster. Configuring EVC for a cluster makes the ESXi host processors present a baseline processor feature set known as the EVC mode. The EVC mode will match the feature set of the host in the cluster with the smallest feature set.



Only processor features that affect vMotion will be masked by EVC. It has no effect on processor speeds or vCPU counts.

Enabling EVC for a cluster is a simple operation, but it is important to know that the following requirements exist for enabling EVC on a cluster:

  • All hosts in the cluster must have only Intel or only AMD processors. Mixing Intel and AMD processors is not allowed.
  • ESX/ESXi 3.5 update 2 or newer is required for all hosts in the cluster.
  • All hosts in the cluster must be connected to the vCenter Server that is used to manage the cluster.
  • vMotion networking should be configured identically for all hosts in the cluster.
  • CPU features, like hardware virtualization support (AMD-V or Intel VT) and AMD No eXecute (NX) or Intel eXecute Disable (XD), should be enabled consistently across all hosts in the cluster.

In Exercise 8.11, the steps to configure EVC on a cluster will be covered. In the interest of covering all lab environments, I will create a new cluster with no ESXi hosts for this exercise. This way, if you are using nested ESXi in your lab, you can complete the exercise. It will also allow me to cover how to enable EVC when the cluster is created.


Exercise 8.11: Enabling EVC for a Cluster
1. Connect to a vCenter Server with the vSphere Client.
2. Switch to the Hosts and Cluster view.
3. Select a datacenter object in the inventory and right-click it. Choose the New Cluster option from the context menu that appears.
4. Give the cluster a descriptive and unique name. Do not enable DRS or HA for the cluster. Click Next to continue.
5. Select the Enable EVC For Intel Hosts option. Accept the default value in the drop-down menu of Intel “Merom” Gen. (Xeon Core 2) and click Next to continue.
6. Accept the default value for Swapfile Policy For Virtual Machines and click Next to continue.
7. Review the information presented on the Ready To Complete screen and click Finish to create the EVC-enabled cluster.

At this point, a cluster has been created with EVC enabled. The remainder of this exercise will focus on the steps required to change the EVC mode.

8. Locate the cluster you just created in the inventory. Right-click it and choose the Edit Settings option from the context menu that appears.
9. Select VMware EVC from the left pane. Review the current VMware EVC Mode status at the top of the information presented in the right pane.
10. In the lower right of the screen, click the Change EVC Mode button.
11. The Change EVC Mode window will appear.
12. Using the drop-down menu to the right of VMware EVC Mode, change the selection to Intel Sandy Bridge Generation. The Compatibility window should report Validation Succeeded, since there are no ESXi hosts in this cluster.
13. Click OK in the Change EVC Mode window and then verify that the VMware EVC Mode reports Intel Sandy Bridge Generation in the cluster settings window.
14. Click OK to save the changes. A Reconfigure Cluster task and a Configure Cluster EVC task will both start. When these tasks complete, the EVC mode has been changed successfully.

In the previous exercise, there were no ESXi hosts in the cluster. This allowed flexibility in creating and changing the EVC mode. In the real world, where clusters will have hosts, it is important to understand how EVC mode impacts these hosts.

Lowering the EVC mode for a cluster involves moving from a greater feature set to a lower feature set. This is often useful when introducing ESXi hosts on newer hardware into an existing cluster. It is important to remember that any virtual machines running on ESXi hosts with newer features than the lower EVC mode supports will need to be powered off before the EVC mode is lowered.

Raising the EVC mode for a cluster involves moving from a lower feature set to a greater feature set. This is often useful when hardware refreshes of ESXi hosts have raised the CPU baseline capability. It is important to remember that any running virtual machines may continue to run during this operation. The VMs simply will not have access to the newer CPU features of the EVC mode until they have been powered off. Also note that a reboot will not suffice, and a full power cycle of the virtual machine is required.

The VMware Knowledge Base also includes articles that are helpful for determining both EVC compatibility and processor support.

Now that I have covered configuring EVC mode for a cluster, I will cover how to monitor a DRS/HA cluster.

Monitoring a DRS/HA Cluster

There are many options for monitoring a DRS cluster, and having the vSphere Client open is a great start. If there are significant problems, the cluster item in the inventory will display an alert or warning icon. A warning condition is shown for a cluster in Figure 8.6.

Figure 8.6 Cluster with warning condition

8.6

In Figure 8.6 an ESXi host in the cluster was abruptly powered off to simulate a host failure. The cluster went into a warning status, and the ESXi host is listed with an alert status. This real-time information can be quite valuable.

A great deal of additional real-time information can be obtained about a cluster by simply viewing its Summary tab in the vSphere Client. Five panels are available on the Summary tab, and much of the information contained here is the same information found in the cluster settings properties window. Figure 8.7 shows the five panels available for viewing in the cluster Summary tab.

Figure 8.7 Cluster's Summary tab

8.7

The General panel shows the status of DRS and HA and the current EVC mode of the cluster. There is also information on the CPU, memory, and storage resources available to the cluster. Inventory information about hosts, processors, datastores, VMs, and vMotion operations is also available in the General panel.

Located under the General panel is the Commands panel, which provides convenient access to many of the same options available in the context menu for the cluster.

The vSphere HA panel is located on the top right and contains information on admission control and admission control policies. Host, VM, and application monitoring information is also available in this panel. At the bottom of the vSphere HA panel, two blue links are provided for the cluster status and configuration issues. Figure 8.8 shows the vSphere HA Cluster Status window.

Figure 8.8 vSphere HA Cluster Status window

8.8

The vSphere HA Cluster Status window defaults to the Hosts tab, where the master node is identified and the number of slave nodes connected to the master node is listed. The VMs tab provides information on the number of VMs protected and unprotected by vSphere HA. The Heartbeat Datastores tab provides information on the datastores used for heartbeating.

The other blue link provided in the vSphere HA panel is to identify configuration issues in the cluster. Clicking this link will open the Cluster Configuration Issues window, where any problems with the cluster will be listed.

The next panel is the vSphere DRS panel, which lists the automation level, DRS recommendations and faults, migration threshold setting, and load standard deviation information. There are also two blue links included in the vSphere DRS panel. The lower blue link, titled View DRS Troubleshooting Guide, will launch the help file. The other blue link, titled View Resource Distribution Chart, is one of the more useful tools for monitoring your cluster's resource consumption. Figure 8.9 shows the DRS Resource Distribution window.

Figure 8.9 DRS Resource Distribution window

8.9

The default view in the DRS Resource Distribution window is for CPU resources. Using the gray % and MHz buttons at the top of the window will allow the view to be toggled between utilization views. Each ESXi host in the cluster will be listed in the left column, and each of the colored boxes represents either a single virtual machine or a group of what are essentially idle virtual machines. Green boxes are good to see here! The legend at the bottom of the window shows that green means 100 percent of the entitled resources are being delivered for the VM. Any other color means the VM is not receiving all of its entitled resources. By hovering the cursor over any of these colored boxes, you can obtain the name of the virtual machine and information about its current resource usage.

Memory utilization for the cluster is also available in the DRS Resource Distribution window, and this view can be selected by clicking the gray Memory button at the top of the window. Figure 8.10 shows the memory view.

Figure 8.10 DRS memory utilization view

8.10

Using the gray % and MB buttons at the top of the window will allow the view to be toggled between utilization views. Each ESXi host in the cluster will be listed in the left column, and each of the gray boxes represents a single virtual machine. Just like with the CPU resources, hovering the cursor over any of these gray boxes will allow you to obtain the name of the virtual machine and information about its current resource usage.

As you can see, the information presented in the DRS Resource Distribution window provides a quick and easy way to see whether your hosts are load balanced and can often be revealing about which VMs are using the most resources. The information presented here also allows you to view more closely how DRS actually load balances.
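
For ad hoc checks, a rough text-only version of this balance information can also be pulled from the API. The following sketch assumes cluster is a vim.ClusterComputeResource obtained from an existing pyVmomi session; it prints per-host CPU and memory consumption from each host's quick statistics, approximating the percent view of the window rather than reproducing the DRS entitlement math.

    def print_cluster_balance(cluster):
        # Per-host CPU and memory consumption, roughly mirroring the
        # percent view of the DRS Resource Distribution window.
        for host in cluster.host:
            stats = host.summary.quickStats
            hw = host.summary.hardware
            cpu_capacity_mhz = hw.cpuMhz * hw.numCpuCores
            mem_capacity_mb = hw.memorySize // (1024 * 1024)
            cpu_pct = 100.0 * stats.overallCpuUsage / cpu_capacity_mhz
            mem_pct = 100.0 * stats.overallMemoryUsage / mem_capacity_mb
            print("%-25s CPU %5.1f%%  MEM %5.1f%%" % (host.name, cpu_pct, mem_pct))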

Another option that can be used to monitor the cluster is the DRS tab, which is visible when the cluster is selected in the left pane. The DRS tab's default Recommendations view lists the DRS properties and any pending recommendations when the cluster is configured with either the manual or partially automated automation level. The Faults view can be used to view faults that prevented the application of a DRS recommendation. The final view is the History view, which can be used to review historical information about DRS actions.
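
The Recommendations and Faults views have straightforward API counterparts as well. The sketch below, assuming cluster is a vim.ClusterComputeResource from an existing pyVmomi session, simply lists whatever recommendations and faults vCenter Server is currently holding for the cluster; in a fully automated cluster, the recommendation list will normally be empty.

    def print_drs_activity(cluster):
        # Pending recommendations exist only in manual or partially
        # automated clusters; cluster.ApplyRecommendation(key) applies one.
        for rec in cluster.recommendation:
            print("Recommendation %s: %s" % (rec.key, rec.reasonText))
        for fault in cluster.drsFault:
            print("DRS fault:", fault.reason)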

The Resource Allocation tab, Performance tab, Tasks & Events tab, and Alarms tab can each also be used to monitor a cluster. Alarms can also be configured in vCenter Server to help monitor your cluster. Figure 8.11 shows the default vSphere HA alarm definitions.

Figure 8.11 vCenter alarms for vSphere HA

8.11

vCenter Server alarms will be covered in detail in Chapter 11, and many of the monitoring topics will also be revisited in Chapter 10 when troubleshooting HA/DRS clusters will be covered.

In addition to the functionality included in vCenter Server, there are additional options like VMware vCenter Operations or any number of third-party solutions that can be used to monitor your clusters. These products can provide additional insight into your virtual infrastructure and are often already deployed in many environments. Operational staff members are also typically trained in using these solutions. Leveraging these existing monitoring solutions can extend the monitoring capabilities available for your DRS/HA clusters.

Now that I have covered monitoring a cluster, I will cover how to configure Storage DRS.

Configuring Storage DRS

Storage DRS is a new feature in vSphere 5. Storage DRS offers to datastores what a DRS-enabled cluster offers to ESXi hosts. When a virtual machine is deployed, it can be deployed into a cluster, and DRS will take care of the initial placement of the VM on an ESXi host. DRS can also move the virtual machine to a different host, as necessary, in order to provide the VM its entitled resources. Storage DRS provides both virtual machine placement and load balancing based on I/O and/or capacity. The goal of Storage DRS is to lessen the administrative effort involved with managing datastores by representing a pool of storage as a single resource.


 


Goodbye to Spreadsheets
A virtual infrastructure administrator has multiple datastores in her infrastructure. Her environment has datastores located on a SAN, but the datastores are not consistent in their configuration. There is one disk group with fifteen SATA drives, one disk group with six 15k FC drives, and many other groups that vary in number, capacity, and drive speed. Until now, the virtual infrastructure administrator has kept spreadsheets to keep up with these disk configurations and virtual machine placements.
She also spends a lot of time dealing with complaints of slowness, identifying the latencies, and manually correcting them with Storage vMotion. Of course, after this, she has to update the spreadsheets to help her make sense of it all.
After the virtual infrastructure administrator upgrades her environment to vSphere 5, she decides that she will implement datastore clusters and use Storage DRS. Storage DRS will monitor her environment for capacity and I/O performance issues and correct them automatically. The virtual infrastructure administrator was relieved to be able to say goodbye to both her manual processes and the spreadsheets.

Storage DRS is made possible by the new datastore cluster object, which is simply a collection of datastores with shared resources and management. There are several requirements to use datastore clusters:

  • Only ESXi 5 hosts can be attached to any of the datastores in a datastore cluster.
  • Mixing NFS and VMFS datastores is not allowed in the same datastore cluster.
  • A datastore cluster cannot contain datastores shared across multiple datacenters.
  • As a best practice, VMware recommends not mixing datastores that have hardware acceleration enabled with datastores that do not.

Configuring Storage DRS starts with creating a datastore cluster. Exercise 8.12 covers the steps to create a datastore cluster and configure Storage DRS.


Exercise 8.12: Configuring Storage DRS
1. Connect to a vCenter Server with the vSphere Client.
2. Switch to the Datastores and Datastore Clusters view.
3. Right-click a datacenter object in the left pane and choose the New Datastore Cluster option from the context menu that appears.
4. The New Datastore Cluster Wizard will launch.
5. Provide a unique and descriptive name in the Datastore Cluster Name field. Ensure that the Turn On Storage DRS option is selected.
6. Click Next to continue.
7. Choose the default Automation Level option of No Automation (Manual Mode).
The Storage DRS automation level is very similar to the automation level setting in DRS. The obvious exception is that only two settings are available with Storage DRS: No Automation (Manual Mode) and Fully Automated. In manual mode, virtual machine placement and load-balancing migrations are only suggested as recommendations by vCenter Server; in fully automated mode, they are carried out automatically.
8. Click Next to continue.

The following image can be used as a reference for steps 9 to 14 of this exercise.

c01uf024
9. In Storage DRS Runtime Rules, ensure that the Enable I/O Metric For SDRS Recommendations option is selected.

Enabling this option will allow vCenter Server to consider I/O metrics when making Storage DRS recommendations or automated migrations. In other words, this option enables I/O load balancing for the datastore cluster.

10. Leave the default values selected for Storage DRS Thresholds.

The Storage DRS thresholds are similar to the migration threshold setting used in DRS. A percentage of space utilization and a millisecond value of I/O latency can each be configured to trigger Storage DRS to make a recommendation or take an automated action.

11. Click the blue Show Advanced Options link.
12. Leave the slider at the default value of 5% for the No Recommendations Until Utilization Difference Between Source And Destination Is option.

This option is configured to ensure that a capacity-based recommendation is worthwhile. For example, if the source datastore is 94 percent full and the target is 90 percent full, the difference between the two is only 4 percent. Because this is below the default value of 5 percent, the move would not be recommended.

13. Using the drop-down menu, change the Check Imbalances Every value to a different value.

This setting is used to determine the frequency that Storage DRS will check capacity and load.

14. Leave the I/O Imbalance Threshold slider at its default value.

This setting is also similar to the migration threshold used in DRS. It is used to configure the amount of I/O imbalance that Storage DRS should tolerate.

15. Click Next.
16. On the Select Hosts and Clusters screen, select a cluster to add the datastore cluster to. Click Next to continue.
17. Select datastores to add to the datastore cluster, keeping in mind the requirements listed before the exercise.
c01uf025
18. Click Next to continue.
19. Review the information presented on the Ready To Complete Screen and click Finish to create the datastore cluster.
20. A Create A Datastore Cluster task will begin. Also, a Move Datastores Into A Datastore Cluster task will begin, and a Configure Storage DRS task will begin. When these tasks complete, verify that the datastore cluster is listed in the left pane. Expand it to view the datastores it contains.
21. Select the datastore cluster in the left pane and then select the Summary tab in the right pane.
22. The datastore cluster is now created, and existing virtual machines can be migrated to it with Storage vMotion.
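
The end state of this exercise can also be reached through the vSphere API. The following pyVmomi outline is a hedged sketch rather than a drop-in script: it assumes si is the service instance returned by SmartConnect, datacenter is a vim.Datacenter, and datastores is a list of vim.Datastore objects that meet the requirements listed before the exercise; the datastore cluster name and the manual automation level are illustrative choices.

    from pyVmomi import vim
    from pyVim.task import WaitForTask

    def create_datastore_cluster(si, datacenter, datastores, name="DatastoreCluster01"):
        # Create the StoragePod (datastore cluster) in the datacenter's
        # datastore folder and move the chosen datastores into it.
        pod = datacenter.datastoreFolder.CreateStoragePod(name=name)
        WaitForTask(pod.MoveIntoFolder_Task(list=datastores))

        # Turn on Storage DRS in manual mode with I/O metrics enabled;
        # space and latency thresholds are left at their defaults.
        pod_spec = vim.storageDrs.PodConfigSpec(
            enabled=True,
            defaultVmBehavior="manual",       # "automated" = Fully Automated
            ioLoadBalanceEnabled=True)
        sdrs_spec = vim.storageDrs.ConfigSpec(podConfigSpec=pod_spec)
        srm = si.content.storageResourceManager
        WaitForTask(srm.ConfigureStorageDrsForPod_Task(pod=pod, spec=sdrs_spec, modify=True))
        return pod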

You have now created a datastore cluster and enabled Storage DRS on it. This concludes the first section of this chapter on creating and configuring VMware clusters. In the next section, I will cover VMware Fault Tolerance.

Planning and Implementing VMware Fault Tolerance

As a VMware Certified Professional, you will be expected to know when and how to use VMware Fault Tolerance (FT). FT is used to provide higher levels of virtual machine availability than what is possible with vSphere HA. VMware FT uses VMware vLockstep technology to provide a replica virtual machine running on a different ESXi host. In the event of an ESXi host failure, the replica virtual machine will become active with the entire state of the virtual machine preserved. In this section, I will cover use cases and requirements for VMware FT, as well as how to configure it.

Determining Use Cases for Enabling VMware Fault Tolerance on a Virtual Machine

VMware FT can provide very high availability for virtual machines, and it is important to understand which applications are candidates for using VMware FT. There are several use cases for VMware FT:

  • Applications that require high availability, particularly applications that have long-lasting client connections that would be reset by a virtual machine restart.
  • Applications that have no native capability for clustering.
  • Applications that could be clustered, but where a clustering solution is avoided because of its administrative and operational complexity.
  • Applications that require protection for critical processes to complete. This is known as on-demand fault tolerance.

 


On-Demand Fault Tolerance
A manufacturing company has an application that was developed in-house that is used four times a year. This application is used to provide specific quarter-end reports to the finance department. There is one report in particular that is notorious for taking many hours to complete. Recently the physical server housing this application had a motherboard failure while the report was running. As a result of this failure and the time required to repair it, the report was significantly delayed. Finance was able to complete their work on time, but many of the staff had to work through the night to make it happen.
A meeting was called between various members of IT and the business to discuss a solution for this problem. Specifically, the finance department did not want a hardware failure to delay them like this again. The virtual infrastructure administrator was present and suggested that the server be converted to a virtual machine and protected with VMware FT on an on-demand basis. This would allow the virtual machine to run as a normal virtual machine and gain the benefit of being protected with HA for normal day-to-day operations. During the four times a year that the key reports are run, the virtual infrastructure administrator enables FT for this virtual machine. The virtual machine is now protected from a physical server failure and consumes the extra resources required to provide this protection only four times a year.

It is important to remember that VMware FT will not protect virtual machines from guest OS and/or application failures. If either the guest operating system or the applications running in the guest OS fail, then the secondary VM will fail identically. It is also important to note that VMware FT has both resource and licensing implications. If the primary VM uses 2GB of RAM, the secondary VM will also use 2GB of RAM, and both of these RAM allocations will count toward the vRAM total. Now that I have covered the use cases for enabling VMware FT on virtual machines, I will identify some of the requirements for using it.

Identifying VMware Fault Tolerance Requirements

The actual number of requirements to use VMware FT is rather large, and for the VCP exam it would be unreasonable to expect you to know all of them. For this section, only the requirements specifically listed in the vSphere Availability Guide have been included. There are many requirements for using VMware FT at the cluster, host, and virtual machine levels. For the cluster, these requirements include the following:

  • Host certificate checking must be enabled in the vCenter Server settings.
  • A minimum of two FT-certified ESXi hosts with the same FT version or host build number must be used.
  • The ESXi hosts in the cluster must have access to the same datastores and networks.
  • The ESXi hosts must have both Fault Tolerance logging and vMotion networking configured.
  • vSphere HA must be enabled on the cluster.

In addition to the cluster requirements, the ESXi hosts have their own set of requirements:

  • The ESXi hosts must have processors from an FT-compatible processor group.
  • Enterprise or Enterprise Plus licensing must be in place.
  • ESXi hosts must be certified for FT in the VMware HCL.
  • ESXi hosts must have hardware virtualization (HV) enabled in the BIOS.


For information on processors and guest operating systems that are supported with VMware FT, consult the VMware Knowledge Base.

There are also requirements for the virtual machines that will be used with VMware FT:

  • Eager zeroed thick provisioned virtual disks and RDMs in virtual compatibility mode must be used in the virtual machine.
  • Virtual machines must be on shared storage.
  • The guest OS installed on the virtual machine must be on the list of supported operating systems that can be used with VMware FT.

You should also note that only virtual machines with a single vCPU are compatible with Fault Tolerance. vSMP is not supported. Unsupported devices, such as USB devices, parallel ports, or serial ports, cannot be attached to the virtual machine; also, incompatible features such as snapshots, Storage vMotion, and linked clones must not be used on virtual machines that will be protected with VMware FT.
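
Several of these virtual machine and host requirements can be checked programmatically before you attempt to turn on FT. The following pyVmomi sketch is intentionally partial and covers only a few of the checks above; it assumes vm is a vim.VirtualMachine retrieved from an existing session.

    def ft_precheck(vm):
        # A partial FT pre-check: single vCPU, no snapshots, and at least
        # two hosts in the VM's cluster that advertise FT capability.
        issues = []
        if vm.config.hardware.numCPU != 1:
            issues.append("more than one vCPU configured")
        if vm.snapshot is not None:
            issues.append("snapshots exist on the VM")
        cluster = vm.resourcePool.owner   # the owning compute resource/cluster
        ft_hosts = [h.name for h in cluster.host if h.capability.ftSupported]
        if len(ft_hosts) < 2:
            issues.append("fewer than two FT-capable hosts in the cluster")
        return issues or ["no obvious blockers found"]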



The VMware FT requirements listed previously are not all-inclusive, and the requirements and limitations could easily consume an entire chapter. For a comprehensive and constantly updated list of the requirements and limitations of VMware FT, check my blog at http://communities.vmware.com/blogs/vmroyale/2009/05/18/vmware-fault-tolerance-requirements-and-limitations.

Now that I have covered the requirements to use VMware FT, I will move on to configuring networking for the fault tolerance logging traffic.

Configuring VMware Fault Tolerance Networking

To use VMware FT, there are two networking requirements that must be met. The first of these requirements is a vMotion network to be used by ESXi hosts in the cluster. vMotion is required, because the secondary VM is initially created by a vMotion of the primary VM to a different ESXi host in the cluster. Because of this design, it is also recommended to have separate 1GbE NICs for vMotion and fault tolerance logging traffic.

The fault tolerance logging traffic is the second network requirement for VMware FT. This is also a VMkernel connection type that is used to move all nondeterministic events from the primary VM to the secondary VM. Nondeterministic events include network and user input, asynchronous disk I/O, and CPU timer events. This is the connection that is used to keep the primary and secondary virtual machines in lockstep.
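
For hosts that already have the VMkernel interfaces created (as Exercise 8.13 will do through the vSphere Client), the vMotion and fault tolerance logging designations can also be applied through the host's virtual NIC manager. The following pyVmomi sketch assumes host is a vim.HostSystem from an existing session; the vmk1 and vmk2 device names are placeholders for your own VMkernel interfaces.

    def tag_ft_networking(host, vmotion_vmk="vmk1", ftlog_vmk="vmk2"):
        # Mark one VMkernel interface for vMotion and another for fault
        # tolerance logging, then report what is currently selected.
        nic_mgr = host.configManager.virtualNicManager
        nic_mgr.SelectVnicForNicType("vmotion", vmotion_vmk)
        nic_mgr.SelectVnicForNicType("faultToleranceLogging", ftlog_vmk)
        for nic_type in ("vmotion", "faultToleranceLogging"):
            cfg = nic_mgr.QueryNetConfig(nic_type)
            print(host.name, nic_type, "->", cfg.selectedVnic if cfg else None)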

In Exercise 8.13, I will create a baseline networking setup that will include both vMotion and fault tolerance logging networking. This exercise will use a standard vSwitch and will require two available NICs in each ESXi host in the cluster. If you already have a vSwitch created for vMotion, you can omit the sections of this exercise that pertain to creating the vMotion network.


Exercise 8.13: Configuring VMware FT Logging Traffic
1. Connect to a vCenter Server with the vSphere Client.
2. Choose an ESXi host that is a member of a cluster and select the Configuration tab for this ESXi host.
3. Click the blue Networking link in the Hardware panel.
4. Click the blue Add Networking link.
5. When the Add Network Wizard launches, select the VMkernel connection type and click Next to continue.
6. Select an available vmnic on the correct network segment and click Next to continue.
7. Provide a unique and consistent network label for the port group. Enter a VLAN ID if necessary, and select the Use This Port Group For vMotion option.
c01uf026
8. Click Next to continue.
9. Provide a static IP address and appropriate subnet mask to the VMkernel that will be used for vMotion traffic and click Next to continue.
10. Review the information on the Ready To Complete screen and click Finish to create the new vSwitch and the vMotion port group.
11. An Update Network Configuration task will begin, and a Select Virtual NIC task will also begin. When these tasks complete, verify that the new vSwitch is listed on the Configuration tab.

At this point, the new vSwitch contains a single port group that will be used for vMotion traffic. In the next part of this exercise, another NIC will be added to this vSwitch.

12. On the Configuration tab, click the blue Properties tab for the vSwitch that was just created.
13. The vSwitch Properties window will appear with the default view of the Ports tab. Select the Network Adapters tab.
14. Click the Add button located in the lower-left corner.
15. The Add Adapter Wizard will launch. Select a check box for an unclaimed adapter. Click Next to continue.
16. For Policy Failover Order, ensure that both adapters are placed in the Active Adapters list at the bottom of the screen. It should look like this.
c01uf027
17. Click Next to continue.
18. Review the information presented on the Summary screen, and then click Finish to add the vmnic.
19. An Update Virtual Switch task will begin. When this task completes, verify that there are now two vmnics listed on the Network Adapters tab.

You have now added a second NIC to your vSwitch. In the next part of this exercise, I will show you how to add the fault tolerance logging port group to the vSwitch.

20. Click the Ports tab in the vSwitch Properties window.
21. Click the Add button located in the lower-left corner.
22. When the Add Network Wizard launches, again select the VMkernel connection type and click Next to continue.
23. Provide a unique and consistent network label for the port group. Enter a VLAN ID, if necessary, and select the Use This Port Group For Fault Tolerance Logging option.
c01uf028
24. Click Next to continue.
25. Provide a static IP address and appropriate subnet mask to the VMkernel that will be used for fault tolerance logging traffic and click Next to continue.
26. Review the information on the Ready To Complete screen and click Finish to create the new vSwitch and the fault tolerance logging port group.
27. An Update Network Configuration task will begin, and a Select Virtual NIC task will also begin. When these tasks complete, verify that the new port group is listed on the Ports tab of the vSwitch Properties window.

You have created a vSwitch with two physical uplinks and two port groups. The final configuration of the vSwitch will separate the port groups onto different vmnics. This will ensure that vMotion uses one physical uplink and that fault tolerance logging uses the other physical uplink. If a single physical uplink associated with this vSwitch fails, this configuration provides redundancy, because both port groups will fail over to the surviving uplink.

28. On the Ports tab of the vSwitch Properties window, select the vSwitch item listed at the top of the list.
29. On the right side, in the Failover And Load Balancing section, locate the Active Adapters information. Verify that both network adapters that were added to the vSwitch are listed and that Standby Adapters has a value of None.
30. Select the vMotion port group in the left pane. On the right side, review the information for Active Adapters and Standby Adapters. This information matches the vSwitch properties and lists two active adapters.
31. Select the FTlog port group in the left pane and verify that it too has the same active and standby adapters listed as the vSwitch and the vMotion port group.
32. Select the vMotion port group in the left pane and click the Edit button at the bottom of the window.
33. The Port Group Properties window will appear. Review the information presented on the General tab, and then select the NIC Teaming tab.
34. In the Failover Order option, select the Override Switch Failover Order option. This will activate the lower portion of the window.
35. Select the lower numbered vmnic and then click the Move Down button until this vmnic is listed under the Standby Adapters section. The final configuration should look like this.
c01uf029
36. Click OK to save these changes. A Reconfigure Port group task will start. When this task completes, make sure that the vMotion port group is selected in the left pane.
37. On the right side, in the Failover And Load Balancing section, locate the Active Adapters information. Verify that the higher-numbered vmnic is listed as an active adapter and that the lower-numbered vmnic is listed as a standby adapter.
38. Select the FTlog port group in the left pane, and click the Edit button at the bottom of the window.
39. The Port Group Properties window will appear. Review the information presented on the General tab and then select the NIC Teaming tab.
40. For the Failover Order option, select the Override Switch Failover Order option. This will activate the lower portion of the window.
41. Select the higher-numbered vmnic and then click the Move Down button until this vmnic is listed under the Standby Adapters section. The final configuration should look like this.
c01uf030
42. Click OK to save these changes. A Reconfigure Port group task will start. When this task completes, make sure that the FTlog port group is selected in the left pane.
43. On the right side, in the Failover And Load Balancing section, locate the Active Adapters information. Verify that the lower-numbered vmnic is listed as an active adapter and that the higher-numbered vmnic is listed as a standby adapter.
44. Click Close in the vSwitch Properties window and review the vSwitch settings shown on the Configuration tab.
45. Select the Summary tab. Review the information presented in the General pane. vMotion Enabled and Host Configured For FT should both show a value of Yes.
c01uf031
46. Repeat this exercise for each ESXi host in the cluster.




Fault tolerance logging traffic is unencrypted and contains guest network data, guest storage I/O data, and the guest's memory contents. Because of this, fault tolerance logging traffic should always be isolated.

I have now covered configuring the network for fault tolerance logging to be used with VMware FT. Next I will show how to enable and disable VMware FT on a virtual machine.

Enabling and Disabling VMware Fault Tolerance on a Virtual Machine

Once all of the VMware FT prerequisites are met, the actual process of enabling FT for a virtual machine is incredibly simple. Exercise 8.14 covers the steps to enable FT for a virtual machine.


Exercise 8.14: Enabling FT for a Powered-Off Virtual Machine
1. Connect to a vCenter Server with the vSphere Client. The vSphere Web Client cannot be used for this operation.
2. Locate a powered-off virtual machine that belongs to a cluster with ESXi hosts that have been configured for FT.
3. Right-click the virtual machine and choose the Fault Tolerance ⇒ Turn On Fault Tolerance option from the context menu that appears.
Depending on the disk configuration of the virtual machine, you will next be presented with one of two dialogs. Each of these dialogs contains the same information, but one of them presents additional information about the virtual machine's disks. This additional information would be presented if the virtual disks are not in the thick provisioned eager zeroed format. The dialog with the additional disk information is shown here.
c01uf032
4. Review the information presented in the Turn On Fault Tolerance dialog and click the Yes button to continue.

If the virtual disks were not in the thick provisioned eager zeroed format, the disks will need to be converted to the proper format before FT can be enabled. This disk conversion operation cannot be performed if the virtual machine is powered on. If the virtual disks in the virtual machine were already in the thick provisioned eager zeroed format, then the disk information would not have been present in the Turn On Fault Tolerance dialog.

5. A Turn On Fault Tolerance task will begin. When this task completes, take note of the icon for the FT-protected VM in the left pane. It has now changed to a darker shade of blue.
6. Power on the FT-protected virtual machine.
7. Select the Summary tab for the FT-protected virtual machine. Locate the Fault Tolerance pane and verify the information presented there. This information should look like this.
c01uf033
8. Select the cluster that the FT-protected virtual machine is running in and then select the Virtual Machines tab.
9. Notice that the FT-protected virtual machine is listed and that there is also a secondary virtual machine listed.
c01uf034

Now that the steps to protect a VM with FT have been covered, I will cover the steps to disable FT for a VM that has been protected with it. Virtual machines protected with FT can have FT either disabled or turned off. Turning off FT for a VM will delete the secondary VM and all historical performance data, and the virtual machine's DRS automation level will be reset to the cluster default. This option is used when FT will no longer be used for a virtual machine, for example when the virtual machine has had its SLA modified or is due for scheduled maintenance and a snapshot is desired as part of the process.

Disabling FT for a VM will preserve the secondary VM, the configuration, and all historical performance data. Disabling FT would be used if VMware FT might be used again in the future for this virtual machine. An example of this would be when using on-demand fault tolerance for a virtual machine. Exercise 8.15 covers the steps to disable FT for a virtual machine that is currently protected with it.
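
Before walking through the exercise, note that the Turn On and Turn Off menu actions map to task methods on the virtual machine object. The sketch below is a minimal pyVmomi outline, assuming vm is a vim.VirtualMachine in a properly prepared cluster; the Disable and Enable menu items correspond to DisableSecondaryVM_Task and EnableSecondaryVM_Task, which additionally take the secondary VM as an argument and are not shown here.

    from pyVim.task import WaitForTask

    def turn_on_ft(vm):
        # Turn On Fault Tolerance: creates the secondary VM. Omitting the
        # host argument lets vCenter Server choose where it runs.
        WaitForTask(vm.CreateSecondaryVM_Task())

    def turn_off_ft(vm):
        # Turn Off Fault Tolerance: removes the secondary VM and discards
        # the FT configuration and history for this virtual machine.
        WaitForTask(vm.TurnOffFaultToleranceForVM_Task())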


Exercise 8.15: Disabling FT for a Powered-Off Virtual Machine
1. Connect to a vCenter Server with the vSphere Client. The vSphere Web Client cannot be used for this operation.
2. Locate a powered-off virtual machine that is currently protected with VMware FT.
3. Right-click the virtual machine and choose the Fault Tolerance ⇒ Disable Fault Tolerance option from the context menu that appears.
4. A Disable Fault Tolerance dialog box will appear. Review the information presented here, and click the Yes button to continue.
5. A Disable Fault Tolerance task will begin. When this task completes, verify the fault tolerance status in the Fault Tolerance pane of the virtual machine's Summary tab.
6. Also notice that a Warning icon has been placed over the virtual machine. Click the Alarms tab to view the warning information.
c01uf035
7. Right-click the triggered alarm and choose the Acknowledge Alarm option from the context menu that appears. Right-click the triggered alarm again and choose the Clear option from the context menu that appears.
8. In the left pane, note that the warning icon has now been removed from the virtual machine but that the VM icon still maintains the darker blue color.

At this point, the VM is no longer protected by FT. If FT protection is required again at some point, the following steps can be used to enable FT on the virtual machine.

9. Right-click this same virtual machine and choose the Fault Tolerance ⇒ Enable Fault Tolerance option from the context menu that appears.
10. An Enable Fault Tolerance task will begin. When this task completes, verify the Fault Tolerance Status in the Fault Tolerance pane of the virtual machine's Summary tab.

Now that I have covered enabling and disabling VMware FT, I will cover the steps required to test an FT configuration.

Testing an FT Configuration

Now that FT has been configured and a virtual machine is being protected by it, the only remaining item is verifying that FT works as expected. The only way to know whether FT will work as expected is to test failover using the built-in functions in vCenter Server or to manually fail a host.

Testing via manually failing a host is easily accomplished. If your ESXi hosts are physical servers, then you could simply pull the power cable on the host that the FT primary VM is running on. If your ESXi hosts are virtual machines, then simply power off the ESXi host that the FT primary VM is running on. Either one of these approaches will guarantee an ESXi host failure. If you have many running virtual machines or simply aren't comfortable powering off your ESXi host this way, then you can also use the FT Test Failover functionality from the Fault Tolerance menu in the vSphere Client. This testing approach is preferred, since it is both fully supported and noninvasive. Exercise 8.16 will cover the steps to test your FT configuration.
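
Whichever method you choose, it is useful to capture the virtual machine's fault tolerance state and current host before and after the test. A small pyVmomi sketch, assuming vm is the FT-protected vim.VirtualMachine:

    def show_ft_state(vm):
        # Print the FT state (for example, 'running') and the host the
        # primary currently occupies; run before and after the test.
        print("VM:", vm.name)
        print("  FT state:    ", vm.runtime.faultToleranceState)
        print("  Current host:", vm.runtime.host.name)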


Exercise 8.16: Testing Failover of FT
1. Connect to a vCenter Server with the vSphere Client. The vSphere Web Client cannot be used for this operation.
2. Locate a powered-on virtual machine that is currently protected with VMware FT.
3. Select the Summary tab for the virtual machine and review the information presented in the Fault Tolerance pane. Take note of the fault tolerance status and the Secondary Location values. An example is shown here.
c01uf036
4. Right-click the virtual machine and choose the Fault Tolerance ⇒ Test Failover option from the context menu that appears.
5. A Test Failover task will begin, and an alert icon will appear on the VM in the left pane. The Fault Tolerance pane will display results similar to what is shown here.
c01uf037
6. When the secondary VM has again been restarted, the alert icon will be removed from the virtual machine in the left pane. The Fault Tolerance pane will display results similar to what are shown in the following image.
c01uf038
Note that the VM has changed ESXi hosts in this testing.

Now that I have covered testing an FT configuration, the VMware FT coverage is complete. In the next section of this chapter, I will cover creating and administering resource pools.

Creating and Administering Resource Pools

As a VMware Certified Professional, resource pools are a topic you should be very familiar with. Resource pools are used to partition the CPU and memory resources of ESXi hosts. They offer a convenient way to separate resources along requirements or political boundaries and also offer a way to control the resource usage of multiple virtual machines at once. This offers significant advantages over setting individual virtual machine limits, reservations, and shares. In this section, I will cover how resource pools work and how to configure and use them.

Describing the Resource Pool Hierarchy

Each ESXi host or DRS-enabled cluster has a hidden root resource pool. This root resource pool is the basis for any hierarchy of shared resources that exist on stand-alone hosts or in DRS-enabled clusters. In Figure 8.12, the CLUSTER object represents the root resource pool.

Figure 8.12 Resource pool hierarchy

8.12

The root resource pool is not displayed, because its resources are always the same as those of the ESXi host or cluster itself. Resource pools can contain child resource pools, vApps, virtual machines, or a combination of these objects. This allows for the creation of a hierarchy of shared resources. Objects created at the same level are called siblings. In Figure 8.12, RP-Finance and RP-Legal are siblings. Fin-VM1 and Fin-VM2 are also siblings.

When creating child resource pools below an existing resource pool, the resource pool at a higher level is called a parent resource pool. In Figure 8.12, RP-Legal is a parent resource pool for the child resource pool of RP-Legal-TEST.
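
The hierarchy is easy to visualize programmatically as well. The sketch below assumes cluster is a vim.ClusterComputeResource (or stand-alone vim.ComputeResource) from an existing pyVmomi session; cluster.resourcePool is the hidden root pool, which vCenter Server names Resources, and the function walks it recursively.

    def print_pool_tree(pool, indent=0):
        # Recursively print resource pools and the VMs they contain,
        # starting from the root pool (cluster.resourcePool).
        pad = "  " * indent
        print(pad + pool.name)
        for vm in pool.vm:
            print(pad + "  [VM] " + vm.name)
        for child in pool.resourcePool:
            print_pool_tree(child, indent + 1)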

Each resource pool can have shares, limits, and reservations specified, in addition to specifying whether the reservation is expandable. The Expandable Reservation parameter will now be defined.



You will typically not want to use resource pools simply to organize your virtual machines. Also use caution with resource pools, because each additional child resource pool makes the environment increasingly difficult to understand and manage properly.

Defining the Expandable Reservation Parameter

The Expandable Reservation parameter can be used to allow a child resource pool to request resources from its parent or ancestors if the child resource pool does not have the required resources. This allows greater flexibility when creating child resource pools. The expandable reservation is best shown in action, so Exercise 8.17 will demonstrate how the expandable reservation works.


Exercise 8.17: Configuring and Testing Expandable Reservations
1. Connect to a vCenter Server with the vSphere Client.
2. Select a cluster from the inventory and right-click it. Choose the New Resource Pool option from the context menu that appears.
3. When the Create Resource Pool window appears, enter the name Parent Resource Pool and use the slider to set CPU Reservation to 1000MHz.
4. Deselect the Expandable Reservation setting in the CPU Resources section. The final configuration of the CPU Resources should look like this.
c01uf039
5. Accept the defaults for all other settings and click OK to create the resource pool.
6. A Create Resource Pool task will begin. When this task completes, verify that the new resource pool is listed under the cluster.

The parent resource pool has been created with a 1000MHz reservation, and the Expandable Reservation parameter was not selected. This setting creates a parent resource pool that has a static 1000MHz of CPU resources. In the following steps, a child resource pool will be created.

7. Right-click the Parent Resource Pool resource pool created in steps 2 to 5. Choose the New Resource Pool option from the context menu that appears.
8. When the Create Resource Pool window appears, enter the name Child Resource Pool and use the slider to set CPU Reservation to 500MHz.
9. Deselect the Expandable Reservation setting in the CPU Resources section. The final configuration of the CPU Resources should look like this.
c01uf040
10. Accept the defaults for all other settings, and click OK to create the resource pool.
11. A Create Resource Pool task will begin. When this task completes, verify the new resource pool is listed under the first resource pool.

The child resource pool has been created with a 500MHz reservation, and the Expandable Reservation parameter was not selected. This setting creates a resource pool that has a static 500MHz of CPU resources. In the following steps, a VM will be created and powered on.

12. Right-click the Child Resource Pool resource pool and choose the New Virtual Machine option from the context menu that appears.
13. Create any virtual machine configuration you like in the Create New Virtual Machine Wizard. When the VM has been created, right-click it and choose the Edit Settings option from the context menu that appears.
14. Click the Resources tab and select the CPU item in the left pane. Using the slider, configure the VM to have a CPU Reservation setting of 750MHz.
c01uf041
15. Click OK and power up the virtual machine once the Reconfigure Virtual Machine task completes.

You should be presented with the error shown here.

c01uf042
Since this virtual machine has a reservation of 750MHz and the resource pool it belongs to has a reservation of only 500MHz, resource pool admission control prevents the virtual machine from being powered on. In the next set of steps, I will cover how to configure the Child Resource Pool to use the Expandable Reservation option.
16. Right-click the Child Resource Pool object and choose the Edit Settings option from the context menu that appears.
17. Select the Expandable Reservation option in the CPU Resources section and click OK.
18. An Update Resource Pool Configuration task will begin. When this task completes, power on the same virtual machine that generated an error in step 15.

You should now be presented with a running virtual machine, as shown here.

c01uf043

The expandable reservation allowed Child Resource Pool to pull the required resources from Parent Resource Pool to satisfy the CPU reservation of the virtual machine.
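
The same parent and child pools can be created through the API, which also makes the Expandable Reservation parameter explicit. The following pyVmomi sketch assumes cluster is a vim.ClusterComputeResource from an existing session; the names and reservation values mirror the exercise, and the share values passed to vim.SharesInfo are ignored unless the level is set to custom.

    from pyVmomi import vim

    def make_pool(parent_pool, name, cpu_mhz, expandable):
        # Create a resource pool with a CPU reservation that is either
        # static (expandable=False) or expandable; memory is left open.
        cpu = vim.ResourceAllocationInfo(
            reservation=cpu_mhz, expandableReservation=expandable,
            limit=-1, shares=vim.SharesInfo(level="normal", shares=4000))
        mem = vim.ResourceAllocationInfo(
            reservation=0, expandableReservation=True,
            limit=-1, shares=vim.SharesInfo(level="normal", shares=163840))
        spec = vim.ResourceConfigSpec(cpuAllocation=cpu, memoryAllocation=mem)
        return parent_pool.CreateResourcePool(name=name, spec=spec)

    # Mirrors Exercise 8.17: a static 1000MHz parent and a 500MHz child.
    # parent = make_pool(cluster.resourcePool, "Parent Resource Pool", 1000, False)
    # child = make_pool(parent, "Child Resource Pool", 500, False)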

Now that I have defined and discussed the Expandable Reservation parameter, I will cover the steps required to create a resource pool using the vSphere Web Client.

Creating and Removing a Resource Pool

As shown in Exercise 8.17, the actual task of creating resource pools is very simple. Options like the expandable reservation illustrate that the more difficult task is in understanding how resource pools work and how they will be used in your environment. Exercise 8.18 details the steps to create a resource pool using the vSphere Web Client.


Exercise 8.18: Creating a Resource Pool
1. Open a web browser and connect to the FQDN of the vCenter Server that will be used for this exercise.
2. Click the blue Log In To vSphere Web Client link.
3. When the vSphere Web Client loads, enter your credentials.
4. Right-click a cluster in the left pane and choose the Inventory ⇒ New Resource Pool option from the context menu that appears.
5. Give the resource pool the name RP-Legal and use the drop-down menu to change the Shares value to High for both the CPU and memory.
6. Ensure that the Reservation Type field's Expandable box is selected. The final configuration should look exactly like this.
c01uf044
7. Click OK to create the resource pool. A Create Resource Pool task will begin. When this task completes, the resource pool is ready to be used.
8. Right-click the same cluster in the left pane and choose the Inventory ⇒ New Resource Pool option from the context menu that appears.
9. Give the resource pool the name RP-Finance and accept the defaults for all of the settings.
10. Click OK to create the resource pool. A Create Resource Pool task will begin. When this task completes, the resource pool is ready to be used.

In this exercise, two resource pools were created. Assume that the cluster used in this exercise has a combined 6GHz of CPU and 30GB of RAM. In periods of resource contention, the RP-Legal resource pool will receive 4GHz of CPU and 20GB of memory, while the RP-Finance resource pool will receive 2GHz of CPU and 10GB of memory, because High shares are worth twice as much as the default Normal shares. In periods of no resource contention, either resource pool can consume more than this, since shares come into play only when resources are contended.
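
The arithmetic behind those numbers is a simple proportional split, since High shares are worth twice as much as Normal shares. A quick Python illustration using the assumed 6GHz/30GB cluster:

    def split_by_shares(capacity, share_weights):
        # Divide a contended resource among siblings in proportion to
        # their share values.
        total = sum(share_weights.values())
        return {name: capacity * w / total for name, w in share_weights.items()}

    print(split_by_shares(6000, {"RP-Legal": 2, "RP-Finance": 1}))  # MHz of CPU
    print(split_by_shares(30, {"RP-Legal": 2, "RP-Finance": 1}))    # GB of memory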

The task to delete a resource pool is also very simple. Keep in mind the implications of deleting a resource pool when working with child resource pools. Any virtual machines that are in the resource pool will be moved automatically to the parent resource pool or the root resource pool. To delete a resource pool, right-click it in the vSphere Web Client and choose the Inventory ⇒ Remove option from the context menu. You will be prompted to remove the resource pool, as shown in Figure 8.13.

Figure 8.13 Remove Resource Pool prompt

8.13

Click the Yes button to remove the resource pool. A Delete Resource Pool task will begin. When this task completes, verify that any virtual machines in the resource pool were moved as expected.

Now that I have covered creating and removing resource pools, I will next cover how to configure resource pool attributes.

Configuring Resource Pool Attributes

Resource pools can be modified after their creation by configuring their attributes. This is useful in situations where the resource pool requirements have changed. The shares, reservation, and limit can each be modified for the resource pool. Shares, reservations, and limits were discussed in Chapter 7, but they will be reviewed again.

Resource pool shares can be specified with respect to the total resources of the parent resource pool. Sibling resource pools share the parent's resources based on their specified share values. Virtual machines in resource pools with the highest share values will be able to consume more resources in periods of resource contention on the ESXi host.

In addition to shares, reservations can be used to guarantee a minimum allocation of CPU and memory resources for the resource pool. A reservation claims a specific amount of these resources so that they will always be available to the pool. Memory reservations can also be used to avoid overcommitment of physical memory resources.

Limits are used to set an upper bound on the memory and CPU resources available to the resource pool, preventing the virtual machines in the pool from collectively using more resources than specified. This setting defaults to Unlimited for both CPU and memory, which allows the virtual machines to use up to the full vCPU and memory allocations they have been granted.
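
Programmatically, these attribute changes map to the resource pool's UpdateConfig method. A minimal pyVmomi sketch, assuming pool is a vim.ResourcePool from an existing session and that only the memory limit needs to change:

    from pyVmomi import vim

    def set_pool_memory_limit(pool, limit_mb):
        # Reuse the pool's current allocations so that shares and
        # reservations are preserved, then adjust only the memory limit.
        cfg = pool.config
        cfg.memoryAllocation.limit = limit_mb   # -1 would mean Unlimited
        spec = vim.ResourceConfigSpec(
            cpuAllocation=cfg.cpuAllocation,
            memoryAllocation=cfg.memoryAllocation)
        pool.UpdateConfig(name=None, config=spec)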

To edit a resource pool in the vSphere Client, right-click it and choose the Edit Settings option that appears in the context menu. The resource pool attributes will be shown and should look similar to what is shown in Figure 8.14.

Figure 8.14 Resource pool attributes

8.14

I have now covered how to configure resource pool attributes and will now cover how to add and remove virtual machines to/from a resource pool.

Adding and Removing Virtual Machines to/from a Resource Pool

Objects that can be added to a resource pool include other resource pools, vApps, and virtual machines. There are multiple options that can be used to add virtual machines to a resource pool. In Exercise 8.17, a virtual machine was added to a resource pool by right-clicking the resource pool and selecting the New Virtual Machine option from the context menu.

Virtual machines can also be added to resource pools, while powered on or off, by dragging and dropping them in the vSphere Client. Many virtual machine operations and tasks will also allow you to choose the resource pool as part of the operation. These operations include the following:

  • Creating a new virtual machine at the host, cluster, or datacenter level
  • Migrating a virtual machine using vMotion or cold migration
  • Deploying OVF templates
  • Cloning a virtual machine
  • Deploying a VM from a template
  • P2V conversions with VMware Converter

There are several caveats that must be covered for adding virtual machines to a resource pool:

  • If the virtual machine is powered on and the resource pool does not have adequate resources to guarantee its reservations, admission control will not allow the move to complete.
  • Virtual machine–configured reservations and limits will not change.
  • Virtual machine shares of high, normal, or low will be adjusted to reflect the total number of shares in the new resource pool.
  • Virtual machine shares configured with a custom value will retain the custom value. A warning will appear if a significant change in total share percentage would occur.

The operations that can be used to remove a virtual machine from a resource pool are similar to the operations for adding a virtual machine to a resource pool. Virtual machines can be dragged and dropped out of resource pools, while powered on or off, using the vSphere Client. vMotion, cold migrations, removing a virtual machine from inventory, and deleting a virtual machine are additional ways to remove it from a resource pool.
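
Drag-and-drop in the client maps to the MoveIntoResourcePool method in the API. A small pyVmomi sketch, assuming pool is the destination vim.ResourcePool, cluster is the owning cluster, and vm is a vim.VirtualMachine:

    def move_vm_into_pool(pool, vm):
        # Move a VM (powered on or off) into a resource pool; admission
        # control can still reject the move if reservations cannot be met.
        pool.MoveIntoResourcePool(list=[vm])

    # Moving the VM back into the root pool removes it from the resource pool:
    # cluster.resourcePool.MoveIntoResourcePool(list=[vm])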



Virtual machine operations will often provide the ability to select a resource pool, but this option will appear only if and when the resource pools are already configured.

Now that the operations for adding and removing virtual machines to resource pools have been covered, I will cover how to determine the resource pool requirements for a given vSphere implementation.

Determining Resource Pool Requirements for a Given vSphere Implementation

Determining the resource pool requirements for a given vSphere implementation involves knowing or predicting what your environment will require for resources. The requirements will depend on a variety of factors:

  • Knowing the characteristics and requirements of workloads or planned workloads
  • Knowing the specific terms of SLAs or other agreements that dictate performance
  • Knowing whether the resources will be divided along workload, business, or even political boundaries
  • Knowing whether the applications would benefit from expandable reservations in resource pools
  • Knowing who will own and administer the resource pools
  • Knowing whether child resource pools will be created and used
  • Knowing whether resource pools or the workloads running in them would benefit from using reservations or limits
  • Knowing whether VMware DRS will be used and the vSphere editions (Enterprise and Enterprise Plus) required
  • Knowing whether VMware HA will be used
  • Knowing whether VMware FT will be used

This is not an all-inclusive list, and each implementation will be different. The key is to know the workloads, the infrastructure layout, licensing, and how the business or organization works. Political factors or budget control may determine the design in many cases, regardless of the technical factors.

Now that I have covered determining the resource pool requirements for a given vSphere implementation, I will move on to evaluating appropriate shares, reservations, and limits for a resource pool based on virtual machine workloads.

Evaluating Appropriate Shares, Reservations, and Limits for a Resource Pool Based on Virtual Machine Workloads

Much like determining the resource pool requirements, evaluating the appropriate settings really means knowing your workloads. In some cases, the workloads themselves may change or have new requirements. In Exercise 8.19, I will evaluate memory reservation settings for a virtual machine that has a new requirement of being protected with VMware FT.


Exercise 8.19: Evaluating Memory Reservations for a FT VM
1. Connect to a vCenter Server with the vSphere Client.
2. Select a cluster from the inventory and right-click it. Choose the New Resource Pool option from the context menu that appears.
3. When the Create Resource Pool window appears, give it a unique name. Accept the defaults for CPU Resources.
4. Using the slider, set Memory Reservation to 1024MB.
5. Deselect the Expandable Reservation setting in the Memory Resources section.
c01uf045
6. Click OK to create the resource pool. A Create Resource Pool task will begin.

You have now created the new resource pool that will be used for this exercise. The next step is to add a virtual machine to it.

7. Move an existing powered-off virtual machine that is configured with a memory size of 1024MB into the resource pool or create a new virtual machine with a memory size of 1024MB. Ensure that a guest OS that is supported for use with VMware FT is used.
8. In the Virtual Machine Properties editor, select the Resources tab. Configure the virtual machine to have a memory reservation of 1024MB.
c01uf046
You have now added a virtual machine to the resource pool, and the virtual machine's memory reservation equals that of the resource pool.
9. Power on the virtual machine. You should receive an Insufficient Resources error message like the one shown here.
c01uf047
This error has occurred because the resource pool has a 1024MB static memory reservation. The resource pool is not allowed to borrow resources from the root resource pool, and this virtual machine is configured to use 1024MB of RAM. To be able to power on this virtual machine, the resource pool must have the required memory resources. This task has failed, because the virtual machine memory overhead has not been allocated in the resource pool. Each virtual machine has a memory overhead that is based on the number of vCPUs and memory in the VM. In the previous image, you can see the Memory Overhead value of 67.18MB listed. In the next steps, you will configure the resource pool to have the required resources to power on the virtual machine.
10. Select the virtual machine and obtain the Memory Overhead value from the General pane in the Summary tab. Round this number to the nearest whole number.
11. Right-click the resource pool and choose the Edit Settings option from the context menu that appears. Add the rounded virtual machine memory overhead value from the previous step to the current Reservation value in the Memory Resources section.
12. Click OK to save the changes. An Update Resource Pool Configuration task will begin. When this task completes, power on the virtual machine.
13. When the guest OS has finished loading, select the resource pool in the left pane. Click the Resource Allocation tab and review the information presented in the Memory section in the upper portion of the screen.
c01uf048
14. Shut down the virtual machine. Review the information on the resource pool Resource Allocation tab again.

You have now created a resource pool, added a virtual machine to it, and configured the reservation of the resource pool to allow the virtual machine to be powered on. Next FT will be enabled for this virtual machine.

15. Right-click the powered-off virtual machine and choose the Fault Tolerance ⇒ Turn On Fault Tolerance option.
16. Review the information presented on the Turn On Fault Tolerance screen and click Yes.
17. A Turn On Fault Tolerance task will begin. When this task completes, power on the virtual machine. The same Insufficient Resources error message received in step 9 will appear.

Enabling FT failed because creating the secondary VM requires the same amount of resources from the resource pool as the primary VM requires. The primary virtual machine was powered on, but the secondary could not be powered on because of the lack of available resources. In the next steps of this exercise, I will adjust the resource pool memory reservation to account for this requirement.

18. Right-click the resource pool and choose the Edit Settings option from the context menu that appears. Take the current Reservation value in the Memory Resources section and double it. Enter this new value in the Reservation field.
c01uf049
19. Click OK to save these changes. An Update Resource Pool Configuration task will begin. When this task completes, enable FT for the virtual machine again.
20. The same Insufficient Resources error message received in steps 9 and 17 will appear one more time.

The insufficient resources error appeared again, because VMware FT has an additional overhead that ranges from 5 to 20 percent depending on the workload. You will now adjust the resource pool reservation to account for this 5 to 20 percent overhead. Note that adding 20 percent should guarantee success in the final steps of this exercise.

21. Right-click the resource pool and choose the Edit Settings option from the context menu that appears. Take the current Reservation value in the Memory Resources section and add 20 percent to it. Round this value to the nearest whole number, if necessary.
c01uf050
22. Click OK to save these changes. An Update Resource Pool Configuration task will begin. When this task completes, enable FT for the virtual machine again.
23. When the Start Fault Tolerance Secondary VM task completes, select the resource pool in the left pane. Click the Resource Allocation tab and review the information presented in the Memory section in the upper portion of the screen.
c01uf051
24. The 20 percent value was likely too high, and you can review the value shown under Available Reservation to determine this.
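
The reservation the exercise converged on can be estimated up front with some simple arithmetic: the pool needs memory plus overhead for both the primary and the secondary VM, plus the FT overhead of roughly 5 to 20 percent. A small Python illustration using the exercise's 1024MB VM and its approximately 68MB overhead (both values are examples from this lab):

    def ft_pool_reservation(vm_memory_mb, overhead_mb, ft_overhead_pct=20):
        # Memory a static-reservation pool needs to power on an FT pair:
        # (VM memory + per-VM overhead) x 2, plus the FT overhead margin.
        per_vm = vm_memory_mb + overhead_mb
        return round(per_vm * 2 * (1 + ft_overhead_pct / 100.0))

    print(ft_pool_reservation(1024, 68))   # about 2621MB with a 20 percent margin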

This exercise could be considered a review of the entire chapter, but its main purpose was to cover the steps to evaluate appropriate reservations for a resource pool based on a virtual machine workload. Next I will briefly review cloning a vApp.

Cloning a vApp

Cloning a vApp was discussed in Chapter 6, and Exercise 6.19 covered the steps necessary to clone a vApp using the Clone vApp Wizard. As a reminder, Figure 8.15 shows the Ready To Complete screen of this wizard.

Figure 8.15 Clone vApp Wizard


This concludes this chapter on clusters, FT, and resource pools.

Summary

The first part of this chapter focused on creating and configuring VMware clusters. This chapter began with determining the appropriate failover methodology and required resources for an HA implementation. DRS virtual machine entitlement was described, and I showed how to create and delete a cluster. I covered adding and removing ESXi hosts to/from a cluster, along with adding and removing virtual machines to/from the cluster. How to enable and disable host monitoring and how to configure admission control were covered. I covered virtual machine and application monitoring. Configuring automation levels for DRS and virtual machines was covered, along with configuring migration thresholds for DRS and individual virtual machines. I created VM-Host and VM-VM affinity rules. I discussed EVC and the steps to configure it. Monitoring a DRS/HA cluster was covered, and the first section concluded with configuring Storage DRS.

The second part of this chapter focused on planning and implementing VMware Fault Tolerance. Determining the use case for enabling VMware Fault Tolerance on a virtual machine was discussed first. I then moved on to identifying VMware Fault Tolerance requirements. Configuring VMware Fault Tolerance logging networking was covered. I enabled and disabled VMware Fault Tolerance on a virtual machine and concluded this section with testing FT configurations.

The final part of this chapter focused on creating and administering resource pools. This section began with describing the resource pool hierarchy. The Expandable Reservation parameter was discussed. Creating and removing resource pools were both covered, along with configuring resource pool attributes. I added and removed virtual machines to and from a resource pool and discussed determining the resource pool requirements for a given vSphere implementation. I covered evaluating appropriate shares, reservations, and limits for a resource pool based on virtual machine workloads, and I concluded this chapter with a review of cloning a vApp.

Exam Essentials

Know how to create and configure VMware clusters.  Be able to determine the appropriate failover methodology and required resources for an HA implementation. Know how to describe DRS virtual machine entitlement. Understand how to create and delete a DRS/HA cluster. Be able to add and remove ESXi hosts and virtual machines from a DRS/HA cluster. Know how to enable and disable host monitoring. Be able to configure admission control for HA and virtual machines. Know how to enable, configure, and disable virtual machine and application monitoring. Understand how to configure automation levels and migration thresholds for DRS and virtual machines. Know how to create VM-Host and VM-VM affinity rules. Be able to configure EVC. Understand the different ways that a DRS/HA cluster can be monitored. Be able to configure Storage DRS.
Know how to plan and implement VMware FT.  Be able to determine the use case for enabling VMware Fault Tolerance on a virtual machine. Know the requirements and limitations of VMware FT. Be able to configure fault tolerance logging networking. Understand how to enable and disable FT for a virtual machine. Know the different ways to test an FT configuration.
Know how to create and administer resource pools.  Be able to describe the resource pool hierarchy. Understand the Expandable Reservation parameter. Be able to create and remove resource pools. Know how to configure resource pool attributes. Understand the different ways to add and remove virtual machines from a resource pool. Be able to determine resource pool requirements for a given vSphere implementation. Know how to evaluate appropriate shares, reservations, and limits for a resource pool based on virtual machine workloads. Be able to clone a vApp.

Review Questions

1. Which of the following are ESXi host requirements for VMware FT? (Choose all that apply.)

A. Enterprise or Enterprise Plus licensing must be in place.

B. ESXi hosts must be certified for FT in the VMware HCL.

C. ESXi hosts must have Hardware Virtualization (HV) enabled in the BIOS.

D. ESXi hosts must have EVC mode enabled.

2. Which of the following are true statements about Storage DRS? (Choose two.)

A. ESXi 4.1 and newer hosts are required.

B. ESXi 5 and newer hosts are required.

C. Mixing NFS and VMFS datastores is not allowed.

D. Mixing NFS and VMFS datastores is allowed.

3. What condition must be first met to remove an ESXi host from a cluster?

A. The host must have host monitoring disabled.

B. The host must be in maintenance mode.

C. The host must be disconnected from vCenter Server.

D. None of these.

4. Which of the following are considered best practices for setting up the fault tolerance logging network? (Choose two.)

A. Single shared 1GbE NIC for vMotion and fault tolerance logging traffic

B. Single dedicated 1GbE NIC for fault tolerance logging traffic only

C. Isolating the fault tolerance logging traffic

D. Routing the fault tolerance logging traffic

5. A virtual machine has its host isolation response set to Shut Down, but this virtual machine does not have the VMware Tools installed. What will happen to this virtual machine if the ESXi host it is running on becomes isolated?

A. It will shut down.

B. Nothing.

C. It will be powered off.

D. It will be suspended.

6. You need to create an affinity rule to require a set of virtual machines to run on a specific ESXi host. Which of the following do you need to create?

A. VM-Host affinity rule

B. VM-Host anti-affinity rule

C. VM-VM affinity rule

D. VM-VM anti-affinity rule

7. When implementing VMware FT, what is the overhead percentage that is required?

A. 5 to 10 percent

B. 10 percent

C. 5 to 20 percent

D. 20 percent

8. Which of the following schedulers exist in a DRS-enabled cluster? (Choose two.)

A. Priority scheduler

B. Global scheduler

C. Entitlement scheduler

D. Local scheduler

9. Which of the following statements best describes the Expandable Reservation parameter?

A. The Expandable Reservation parameter can be used to allow a child resource pool to request resources from its parent.

B. The Expandable Reservation parameter can be used to allow a child resource pool to request resources from its parent or ancestors.

C. The Expandable Reservation parameter can be used to allow a parent resource pool to request resources from its child.

D. The Expandable Reservation parameter can be used to allow a parent resource pool to request resources from a sibling.

10. When raising the EVC mode for the cluster, which of the following statements is true? (Choose two.)

A. Raising the EVC mode for the cluster involves moving from a greater feature set to a lower feature set.

B. Raising the EVC mode for the cluster involves moving from a lower feature set to a greater feature set.

C. Running virtual machines will need to be powered off during this operation.

D. Running virtual machines may continue to run during this operation.

11. When using vMotion to migrate a virtual machine, the option to select a resource pool was not available for the destination. What could be a reason for this?

A. The VM has an individual memory reservation set.

B. vMotion does not allow this operation.

C. Changing resource pools is not allowed.

D. No resource pools exist in the destination.

12. In which of the following automation levels will vCenter Server inform you of suggested virtual machine migrations and place the virtual machines on ESXi hosts at VM startup?

A. Manual

B. Partially automated

C. Fully automated

D. None of these

13. Which of the following admission control policies will result in an ESXi host in the cluster being unable to run virtual machines until a failover situation occurs?

A. Host failures the cluster tolerates

B. Percentage of cluster resources reserved as failover spare capacity

C. Specify failover hosts

D. None of these

14. Which of the following are configurable resource pool attributes? (Choose all that apply.)

A. Shares

B. Reservation

C. Priority

D. Name

15. A master host has stopped receiving heartbeats from a slave host. What are the possible conditions that the slave host could be in? (Choose all that apply.)

A. Failed

B. Unprotected

C. Isolated

D. Partitioned

16. Which of the following can be used to enable and disable VMware FT for a virtual machine that contains a single eager zeroed thick provisioned disk? (Choose all that apply.)

A. The vSphere Client for the powered-on virtual machine

B. The vSphere Client for the powered-off virtual machine

C. The vSphere Web Client for the powered-on virtual machine

D. The vSphere Web Client for the powered-off virtual machine

17. You need to test the FT configuration in your environment. Which of the following approaches is both supported and noninvasive?

A. Pull the power cables from an ESXi host that is running VMs with FT enabled.

B. Use the vSphere Client and right-click the secondary virtual machine. Choose the Delete From Disk option.

C. Put an ESXi host with FT VMs running on it in maintenance mode.

D. Use the vSphere Client and right-click a virtual machine that has FT enabled on it. Choose the Fault Tolerance Test Failover option from the context menu that appears.

18. You want DRS to use the most aggressive setting possible for the migration threshold. How do you accomplish this?

A. Move the slider for the automation level to the far left in the DRS settings.

B. Move the slider for the migration threshold to the far left in the DRS settings.

C. Move the slider for the automation level to the far right in the DRS settings.

D. Move the slider for the migration threshold to the far right in the DRS settings.

19. Which of the following is a use case for VMware FT? (Choose all that apply.)

A. Application that requires high availability

B. Application that has no native capability for clustering

C. Application that requires protection for critical processes to complete

D. Application that has persistent and long-standing connections

20. Which of the following options can be used to restart individual virtual machines when they have failed or become unresponsive?

A. VMware FT

B. VM monitoring

C. Application monitoring

D. None of these
