Chapter 22. Performing Basic Troubleshooting of Virtual Machine Operations

This chapter covers the following subjects:

• Troubleshooting Virtual Machine Resource Contention Issues

• Troubleshooting Storage Overcommitment Issues

• Troubleshooting iSCSI Software Initiator Configuration Issues

• Fault Tolerant VM Network Latency Issues

• Troubleshooting VMware Tools Installation Issues

• Troubleshooting VM States

• Virtual Machine Constraints

• Identifying the Root Cause of a Storage Issue

• Identifying Common Virtual Machine Boot Disk Errors

• Setting and Managing Settings in the vSphere Web Client

• Correcting Common Warnings and Alerts within the vSphere Web Client

This chapter discusses performing basic troubleshooting on your VMs. In addition, you learn how to troubleshoot common issues with FT latency, VM states, and VM constraints. The chapter also covers identifying the root cause of a storage issue and identifying and correcting common boot issues. Finally, you learn about managing in the vSphere Web Client, including “real world” information in regard to settings, properties, and warnings and alerts. This understanding will help you troubleshoot your own systems and is essential to successfully navigate the troubleshooting questions on the exam.

“Do I Know This Already?” Quiz

The “Do I Know This Already?” quiz allows you to assess whether you should read this entire chapter or simply jump to the “Exam Preparation Tasks” section for review. If you are in doubt, read the entire chapter. Table 22-1 outlines the major headings in this chapter and the corresponding “Do I Know This Already?” quiz questions. You can find the answers in Appendix A, “Answers to the ‘Do I Know This Already?’ Quizzes and Chapter Review Questions.”


Table 22-1 “Do I Know This Already?” Section-to-Question Mapping

1. Which of the following can have a negative effect on virtual machine performance? (Choose two.)

a. Excessive SCSI reservations

b. Multipathing

c. Path thrashing

d. Adequate LUN queue depth

2. Which of the following could cause network latency issues on FT-enabled VMs? (Choose two.)

a. Hosts with properly configured 1-Gbps vmnics

b. Hosts with improperly configured 10-Gbps vmnics

c. VMs with vmxnet3 vnics

d. Hosts with properly configured 10-Gbps vmnics

3. You are attempting to install VMware Tools without logging on to the VM first, but the installation is failing. Which of the following could be causing the issue?

a. Antivirus software on the Guest OS is preventing the installation.

b. It is not possible to install VMware Tools without first logging on to the VM.

c. You are performing an interactive installation.

d. The VMware Tools software is waiting for you to complete the wizard.

4. Which of the following could cause a VM to become orphaned?

a. HA failover was unsuccessful for the VM.

b. The VM was vMotioned to a host that was powered off.

c. The host of the VM failed, and it was restarted on another host by HA.

d. The parent resource pool of the VM was accidentally deleted.

5. Which of the following might result from setting Admission Control Policy in HA too conservatively? (Choose two.)

a. You might not be able to start as many VMs as you had hoped.

b. VMs might be vMotioned more frequently than you had planned.

c. You might not be able to start a VM with high reservations.

d. All VMs will run with restricted settings.

6. Which of the following are tools that you could use to identify the root cause of a storage issue? (Choose two.)

a. Storage Maps in the vSphere Client

b. Task Manager on a VM’s OS

c. The Storage Views tab in the vSphere Web Client

d. The DCUI of a host

7. You have created a VM using the wizard and identified the OS as Windows Server 2008. You attempt to install an ISO on the VM that contains Windows Server 2003. Which of the following will result? (Choose two.)

a. The ISO will install properly.

b. The installation will fail.

c. The boot disk on the VM will not be recognized.

d. The boot disk will be recognized but will not contain the proper ISO to run Windows Server 2008, so the installation will fail.

8. You want to improve your user experience on your vSphere 6 Web Client that has a default installation. Which of the following should you do? (Choose two.)

a. Move Recent Tasks to the bottom of the screen.

b. Unpin the right side of the screen.

c. Add a Summary tab to each of the objects that you use the most.

d. Hide the Getting Started tabs.

9. What are two practical techniques you can use to make sure that you are using the screen space on your vSphere Web Client to your best advantage? (Choose two.)

a. Leverage your browser tabs.

b. Hide the Getting Started tabs.

c. Log on as [email protected].

d. Unpin the sides.

10. Which of the following occur when you choose to acknowledge an alarm in your vCenter? (Choose two.)

a. Your credentials are recorded and you have taken the responsibility to fix the issue.

b. The alarm is reset to green and therefore disabled.

c. The inventory will no longer show the alert or warning on the object’s icon.

d. All actions associated with the alarm will be disabled.

Foundation Topics

Troubleshooting Virtual Machine Resource Contention Issues

This section focuses on troubleshooting storage issues on a VM. Storage is one of the “core four” resources that you need to know how to manage. The reason that I am focusing on storage in this section is that I have already discussed proper management and troubleshooting of the other three resources of the core four—namely, networking, CPU, and memory.

To troubleshoot storage contention issues, you should focus on the storage adapters that connect your hosts to their datastores. The settings for multipathing of your storage are in the Storage view. Click Manage, Settings, and then Connectivity and Multipathing; finally, click your host to show the Multipathing Details, as shown in Figure 22-1. You can change path selection policy after clicking Edit Multipathing, as shown in Figure 22-2.


Figure 22-1 Settings for Multipathing of Storage


Figure 22-2 Configuring Multipathing in the Storage View

Troubleshooting Storage Overcommitment Issues

As your vSphere environment grows and your hosts and VMs compete for the same resources, many factors can begin to affect storage performance. They include excessive SCSI reservations, path thrashing, and inadequate LUN queue depth. This section briefly discusses each of these issues.

Excessive Reservations Cause Slow Host Performance

Some operations require the system to get a file lock or a metadata lock in VMFS. They might include creating or expanding a datastore, powering on a VM, creating or deleting a file, creating a template, deploying a VM from a template, creating a new VM, migrating a VM with vMotion, changing a vmdk file from thin to thick, and so on. These types of operations create a short-lived SCSI reservation, which temporarily locks the entire LUN or at least the metadata database. As you can imagine, excessive SCSI reservations caused by activity on one host can cause performance degradation on other servers that are accessing the same VMFS. Actually, ESXi 6.0 does a much better job of handling this issue than legacy systems did because only the metadata is locked and not the entire LUN.

If you have older hosts and you need to address this issue, you should ensure that you have the latest BIOS updates installed on your hosts and that you have the latest host bus adapter (HBA) firmware installed across all hosts. You should also consider using more small logical unit numbers (LUNs) rather than fewer large LUNs for your datastores. In addition, you should reduce the number of VM snapshots because they can cause numerous SCSI reservations. Finally, follow the Configuration Maximums document and reduce the number of VMs per LUN to the recommended maximum, even if you have seen that you can actually add more than that figure.

Path Thrashing Causes Slow Performance

Path thrashing is most likely to occur on active-passive arrays. It’s caused by two hosts attempting to access the same LUN through different storage processors. The result is that the LUN is often seen as not available to both hosts. The default setting for the Path Selection Policy (PSP) of Most Recently Used will generally keep this from occurring. In addition, ensure that all hosts that share the same set of LUNs on the active-passive arrays use the same storage processor. Properly configured active-active arrays do not cause path thrashing.
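As a rough illustration of the check described above, the following Python sketch flags LUNs that different hosts reach through different storage processors. The data layout (host name mapped to a dictionary of LUN and storage processor) and names such as find_thrashing_candidates, esxi01, and SP-A are hypothetical; in practice you would collect this information from the multipathing details of each host.

```python
from collections import defaultdict


def find_thrashing_candidates(host_paths):
    """Return LUNs that hosts reach through more than one storage processor.

    host_paths maps a host name to a dict of {lun_id: storage_processor}.
    This layout is an assumption made for illustration only.
    """
    processors_by_lun = defaultdict(set)
    for lun_map in host_paths.values():
        for lun_id, storage_processor in lun_map.items():
            processors_by_lun[lun_id].add(storage_processor)
    # A LUN reached through two or more processors is a thrashing candidate.
    return sorted(lun for lun, sps in processors_by_lun.items() if len(sps) > 1)


# Hypothetical inventory: both hosts agree on lun1 but disagree on lun2.
example_paths = {
    "esxi01": {"lun1": "SP-A", "lun2": "SP-A"},
    "esxi02": {"lun1": "SP-A", "lun2": "SP-B"},
}
print(find_thrashing_candidates(example_paths))
```

Any LUN that the sketch reports is one to review on an active-passive array, so that all hosts sharing it use the same storage processor.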

Troubleshooting iSCSI Software Initiator Configuration Issues

If your ESXi host generates more commands to a LUN than the LUN can handle, the excess commands are queued by the VMkernel. This situation causes increased latency, which can affect the performance of your VMs. It is generally caused by an improper setting of LUN queue depth, the correct value of which varies by the type of storage. You should determine the proper LUN queue depth for your storage from your vendor documentation and then adjust your Disk.SchedNumReqOutstanding parameter accordingly.
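The effect of queuing on latency can be sketched with simple arithmetic. The Python below is a toy model, not a description of how the VMkernel schedules I/O: excess_commands counts the commands that exceed an assumed queue depth, and average_latency_ms applies Little's Law (average time in system equals items in system divided by throughput). The function names and all figures are hypothetical; take real queue depth values from your storage vendor.

```python
def excess_commands(outstanding, queue_depth):
    """Commands beyond the queue depth, which must wait in the host."""
    if outstanding < 0 or queue_depth <= 0:
        raise ValueError("outstanding must be >= 0 and queue_depth must be > 0")
    return max(0, outstanding - queue_depth)


def average_latency_ms(outstanding, iops):
    """Little's Law: average time in system = items in system / throughput."""
    if iops <= 0:
        raise ValueError("iops must be > 0")
    return outstanding * 1000.0 / iops


# Hypothetical workload: 48 outstanding commands against a queue depth of 32,
# on storage that sustains 2000 I/O operations per second.
print(excess_commands(48, 32))
print(average_latency_ms(48, 2000))
```

At a fixed throughput, every additional outstanding command raises the average latency, which is why commands that pile up in the queue show up as slower VMs.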

Fault Tolerant VM Network Latency Issues

For specific VMs on which you are using vSphere Fault Tolerance (FT), you might encounter issues related to FT that you need to address. If you encounter network latency issues with regard to FT-enabled VMs, you should ensure that 10-Gbps links are being used between the hosts and that the available bandwidth for FT is kept very high. There will be an associated overhead for the FT traffic, but it should still function well with sufficient bandwidth.

Troubleshooting VMware Tools Installation Issues

VMware Tools are highly recommended for every VM installation because they provide the drivers needed for advanced networking, efficient snapshotting, guest OS heartbeat, efficient memory handling, and much more. In most cases, you will just include VMware Tools as part of your template. In cases where you are installing the OS from the original software, you should install VMware Tools immediately after installing the OS on the VM.

Your Web Client will clearly inform you that VMware Tools are not installed on your VM, as shown in Figure 22-3. In general, VMware Tools are simple and straightforward to install. The one issue that you might encounter if you choose to install VMware Tools without logging on to the VM first is that you won’t know how the installation is proceeding, or even whether it is proceeding. To avoid this issue, you should install VMware Tools interactively. To do this, log on to the desktop of the VM first and then, from that logged-on session, go to VM, Guest, Install VMware Tools. In this way, you will be able to monitor the installation of VMware Tools. Installing VMware Tools interactively will also enable you to ensure that no antivirus software on the VM’s OS is preventing the installation of VMware Tools.


Figure 22-3 Summary Tab Showing VMware Tools Not Installed on VM

Troubleshooting VM States

In rare cases, VMs that reside on an ESXi host connected to a vCenter server might become “lost” to the point that the VMs exist in the database of the vCenter but are not recognized by the host. This can happen if an HA failover is unsuccessful or if a VM is unregistered from the host instead of from the vCenter. In this case, you should right-click the VM, select to remove it from inventory, and then reregister the same VM by right-clicking its .vmx file and selecting Register VM, as shown in Figure 22-4.


Figure 22-4 Reregistering an Orphaned VM
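The orphaned condition described in this section amounts to a difference between two lists: the VMs that vCenter has in its database and the VMs that the host actually has registered. The Python sketch below shows only that comparison. The function name and the VM names are hypothetical, and the sketch assumes you have already exported both lists by whatever means you prefer.

```python
def find_orphaned_vms(vcenter_inventory, host_registered):
    """Return VM names known to vCenter but not registered on the host."""
    return sorted(set(vcenter_inventory) - set(host_registered))


# Hypothetical exports from vCenter and from the host.
vcenter_vms = ["app01", "db01", "web01"]
host_vms = ["app01", "web01"]
print(find_orphaned_vms(vcenter_vms, host_vms))
```

Each name the sketch returns is a candidate for the remove-from-inventory and reregister procedure described above.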

Virtual Machine Constraints

As you know, Admission Control Policy in HA causes each host to reserve enough resources to recover VMs in the case of another host’s failure on the same HA cluster. This means that if you set your Admission Control Policy too conservatively, you might not be able to start as many VMs as you may have thought possible. For example, changing from a policy that allows for only one host failure to one that allows two host failures can have a dramatic effect on the VM capacity of your cluster, especially in a small cluster. Therefore, the best thing you can do is verify that the settings that you expect to see are still there and that all the hosts you are counting on are still running.
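To see why the change from one tolerated host failure to two matters most in a small cluster, consider the following back-of-the-envelope Python sketch. It assumes a cluster of identical hosts and treats the reserved capacity as simply the capacity of the hosts that are allowed to fail; this is a deliberate simplification and not the calculation that HA actually performs. The function name is hypothetical.

```python
def usable_capacity_fraction(hosts, failures_to_tolerate):
    """Fraction of cluster capacity left for VMs, assuming identical hosts."""
    if hosts <= 0:
        raise ValueError("hosts must be > 0")
    if not 0 <= failures_to_tolerate < hosts:
        raise ValueError("failures_to_tolerate must be between 0 and hosts - 1")
    return (hosts - failures_to_tolerate) / hosts


# Compare a small cluster with a larger one under both policies.
for hosts in (4, 16):
    for failures in (1, 2):
        fraction = usable_capacity_fraction(hosts, failures)
        print(f"{hosts} hosts, tolerate {failures}: {fraction:.1%} usable")
```

Under these assumptions, a 4-host cluster drops from three-quarters to one-half of its capacity when you move from one tolerated failure to two, while a 16-host cluster loses only one-sixteenth more.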

Identifying the Root Cause of a Storage Issue

After you have obtained information from the reports and maps provided by your vCenter, you can use your knowledge of your systems to compare what you are viewing to what should be occurring. One “catch-22” is that the time that you are most likely to need the information is also the time at which it is most likely to be unavailable. For this reason, consider printing a copy of your storage maps when everything is running smoothly to be kept on hand for a time when you need to troubleshoot. Then if you have access to the current maps, you can compare what you are seeing with what you have in print. However, if you can no longer use the tools, you have the printed map to use as an initial guide until you can access the current configuration.
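If you also keep the known-good map in a machine-readable form, the comparison described above can be partly automated. The Python sketch below assumes that each map has been reduced to a dictionary of datastore names and the hosts connected to them; that representation, the function name, and the sample names are all hypothetical.

```python
def compare_storage_maps(baseline, current):
    """Return {datastore: (lost_hosts, gained_hosts)} for datastores that changed."""
    changes = {}
    for datastore in sorted(set(baseline) | set(current)):
        before = set(baseline.get(datastore, ()))
        after = set(current.get(datastore, ()))
        if before != after:
            changes[datastore] = (sorted(before - after), sorted(after - before))
    return changes


# Hypothetical baseline saved while everything was healthy, and the current state.
baseline_map = {"ds1": ["esxi01", "esxi02"], "ds2": ["esxi01"]}
current_map = {"ds1": ["esxi01"], "ds2": ["esxi01"]}
print(compare_storage_maps(baseline_map, current_map))
```

A datastore that has lost a host since the baseline was taken is a natural place to start looking for the root cause.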

Identifying Common Virtual Machine Boot Disk Errors

This is kind of a funny topic because virtual machine boot disk errors are really no different from physical machine boot disk errors; the OS on the VM does not “know” that it’s on a VM. However, if you attempt to boot a VM and it cannot recognize the disk at all, the chances are very good that the wrong type of controller is configured on the virtual machine.

This usually results from going outside of what the wizard selects for an installation of a controller based on the OS that you select. For example, Windows Server 2003 will not recognize LSI Logic SAS because it didn’t exist when Windows Server 2003 was created. Therefore, for Windows Server 2003, you should use an “older” controller such as LSI Logic Parallel. If you follow the wizard during VM installation and make sure that the controller is appropriate for the OS, you should not have this issue; however, you can reconfigure the controller on a powered-off VM by editing the setting on the Virtual Hardware tab, as shown in Figure 22-5.


Figure 22-5 Changing Disk Controller on Virtual Hardware
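The controller mismatch described above can be expressed as a simple lookup. The Python sketch below lists only the one pairing stated in the text (Windows Server 2003 with LSI Logic SAS); the table and function names are hypothetical, and any further entries should come from vendor documentation rather than guesswork.

```python
# Controllers that a guest OS does not recognize out of the box.
# Only the pairing stated in the text is listed here.
UNSUPPORTED_CONTROLLERS = {
    "Windows Server 2003": {"LSI Logic SAS"},
}


def controller_problem(guest_os, controller):
    """Return a warning string for a known-bad pairing, or None otherwise."""
    if controller in UNSUPPORTED_CONTROLLERS.get(guest_os, set()):
        return (
            f"{guest_os} does not recognize {controller}; "
            "the boot disk may not be found."
        )
    return None


print(controller_problem("Windows Server 2003", "LSI Logic SAS"))
print(controller_problem("Windows Server 2003", "LSI Logic Parallel"))
```

A result of None means only that the pairing is not in the table, not that it has been verified as supported.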

Another much less common issue is when a VM has more than one virtual disk and the second disk contains only data and no OS. Chances are good that the data disk was created second and therefore will not be first in the boot order, so it should not cause an issue. However, if you are receiving an error that indicates that the disk can be read but does not contain an OS, then you should check the boot order in the BIOS of the VM, just as you would in a physical machine. You can make the BIOS settings available for a one-time boot by selecting the check box in the Option settings of the VM, as shown in Figure 22-6.


Figure 22-6 Forcing a VM to Boot to BIOS Settings

Setting and Managing Settings in the vSphere Web Client

There is much to be said about the vSphere Web Client. It’s a topic of heated discussion in many VMware circles. Some people are starting to like the Web Client, especially the latest version that comes with vSphere 6. Others may never like the vSphere Web Client and wish that VMware would just let everything run through the original desktop-based C# client. So, what settings can you change to get the most out of the vSphere Web Client? You can change many settings to improve your user experience, but following is a list of a few of my favorites.


• Take off the training wheels: Because you probably want to see the Summary tab when you select an object in the vSphere Web Client, you should hide the Getting Started pages as soon as possible. To do this, click Help in the upper-right corner of the client and click Hide All Getting Started Pages, as shown in Figure 22-7. You can bring them back by returning to the same area, but I’ll bet you never will!


Figure 22-7 Hiding the Getting Started Pages

• Unpin the right side: Since VMware finally put the Recent Tasks back where they should have been all along, at the bottom of the client, you can now unpin the right side of the client and open it only when you want to grab a saved Work in Progress. This gives you significantly more desktop real estate, which is important if you are managing your vSphere from a laptop or a desktop that doesn’t include a 36-inch monitor!

• Leverage your browser tabs: Because VMware seems to be dead set on making you manage your environment through your browser, you may as well take advantage of everything that type of configuration offers. In other words, don’t forget about the tabs that you have open for your vCenter, VMs, and so on. Sometimes it’s much easier to find a tab that is already open than it is to traverse the inventory again to find the same object. If you are not careful, you can end up with many objects open on your taskbar (Windows) that you have completely forgotten about. Leveraging the tabs on your browser, which are much more evident if you’re looking for them, will keep you more organized. Try it, and you’ll see what I mean.

Correcting Common Warnings and Alerts Within the vSphere Web Client

As you are working in your vSphere environment, many monitors are “keeping an eye” on things to make sure that you can get the most from it. Many default alarms are built in to the installation of your hosts, vCenter, storage, and so on. As you know, you can also add custom alarms for your own specific needs.

In Chapter 24, “Creating and Administering vCenter Server Alarms,” you learn how you can configure your own alarms, but for now this section focuses on what you can do when an alarm is triggered. This includes recognizing the warning or alert and then making the best decision to address it.

First, how will you know that an alarm has been triggered? In addition to the yellow or red icon that the object will be sporting, you will also see information about the triggered alarm in your logs and in your Events view. In addition, depending on how the alarm is configured, you or another administrator might receive an email or an SNMP trap. However, the best place to focus on triggered alarms is on the Monitor, Issues, Triggered Alarms link of the object itself, as shown in Figure 22-8. In this case, your vCenter has a Critical Issue in regard to Health Status Monitoring.


Figure 22-8 The Triggered Alarms Link

Now that you have found the alarm, what are your choices in regard to handling it? Well, you could ignore it and let someone else handle it, but you probably wouldn’t have searched for it if you were going to ignore it. You could disable it, but that doesn’t fix anything, and then you might forget about it entirely until the issue is even worse. Because neither of those is a viable option, you are left with the following two options, as shown in Figure 22-9:

• Reset to Green: If you choose Reset to Green without actually fixing the issue that is triggering the alarm, there is a good chance that the alarm will be triggered again. Therefore, you should choose this option only after you have identified and corrected the issue that caused the alarm to trigger.

• Acknowledge: When you choose Acknowledge, you are taking the responsibility to work on the issue that is causing the alarm as soon as you find the opportunity. This will leave the “pretty colors” on the object in the inventory, but it will stop the other actions, such as emails and SNMP traps. It will also create a log entry that indicates that an administrator logged in with your credentials has acknowledged responsibility for addressing the issue.


Figure 22-9 Resetting or Acknowledging Triggered Alarms
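The difference between the two options can be summarized as a small state model. The Python class below is only a teaching sketch of the behavior described in this section and is not how vCenter stores alarms; the class, its fields, and the sample user name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriggeredAlarm:
    status: str = "red"                    # "green", "yellow", or "red"
    actions_enabled: bool = True           # emails and SNMP traps still fire
    acknowledged_by: Optional[str] = None  # who took responsibility, if anyone

    def acknowledge(self, user):
        """Stop the alarm actions and record the user; the color stays as it is."""
        self.actions_enabled = False
        self.acknowledged_by = user

    def reset_to_green(self):
        """Clear the color only; the underlying issue may trigger the alarm again."""
        self.status = "green"


alarm = TriggeredAlarm()
alarm.acknowledge("admin01")  # hypothetical administrator account
print(alarm)
```

Note that acknowledge leaves status untouched, which mirrors the point that the object keeps its warning or alert color in the inventory.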

Summary

This chapter covers the following main topics:

• Troubleshooting techniques for VM contention issues

• FT latency issues and VMware Tools installation issues

• Troubleshooting VM states, such as with orphaned VMs

• How to identify the root cause of a storage issue and how to correct the most common boot disk errors

• Managing the vSphere Web Client, including settings, properties, and alerts and warnings

Exam Preparation Tasks

Review All the Key Topics

Review the most important topics from the chapter, noted with the Key Topic icon in the outer margin of the page. Table 22-2 lists these key topics and the page numbers where each is found. Know how to perform basic troubleshooting on ESXi hosts, vSphere networks, vSphere storage, and HA/DRS clusters.


Table 22-2 Key Topics for Chapter 22

Review Questions

The answers to these review questions are in Appendix A.

1. Which of the following can have a negative effect on virtual machine performance? (Choose two.)

a. Excessive SCSI reservations

b. Multipathing

c. Lack of path thrashing

d. Inadequate LUN queue depth

2. Which of the following would not cause network latency issues on FT-enabled VMs? (Choose two.)

a. Hosts with properly configured 1 Gbps vmnics

b. Hosts with improperly configured 10 Gbps vmnics

c. VMs with vmxnet2 vnics

d. VMs with e1000e vnics

3. Which of the following are true regarding an interactive installation of VMware Tools? (Choose two.)

a. It will enable you to detect whether an antivirus software package on the Guest OS is preventing the installation.

b. It can be accomplished only by logging in to the VM first.

c. You should perform a noninteractive installation and then an interactive installation.

d. An interactive installation of VMware Tools will always install more drivers than the noninteractive installation will.

4. Which of the following is true about a VM that is orphaned? (Choose two.)

a. The VM files have been deleted from the datastore.

b. The VM does not appear on the database of the vCenter.

c. The VM is not recognized by its host.

d. The VM files are in the datastore.

5. Which of the following might result from setting Admission Control Policy in HA too liberally?

a. You might not be able to restart all the VMs on a host when it fails.

b. VMs might be vMotioned more frequently than you had planned.

c. You might not be able to start a VM with high reservations.

d. All VMs will run with restricted settings.

6. Which of the following are tools that will not assist you in identifying the root cause of a storage issue? (Choose two.)

a. Storage Maps in the vSphere Client

b. Task Manager on a VM’s OS

c. The Storage Views tab in the vSphere Web Client

d. The DCUI of a host

7. You have created a VM using the wizard and identified the OS as Windows Server 2003. You attempt to install an ISO on the VM that contains Windows Server 2008. Which of the following will result?

a. The ISO will install properly, but your Summary tab will be wrong and your disk controller will not be optimal.

b. The installation will fail.

c. The boot disk on the VM will not be recognized.

d. The boot disk will be recognized but will not contain the proper ISO to run Windows Server 2008, so the installation will fail.

8. You want to improve your user experience on your vSphere 6 Web Client that has a default installation. Which of the following should you do? (Choose two.)

a. Add a Related Objects tab to the Manage Screen.

b. Unpin the right side of the screen.

c. Remove the Work In Progress feature from the software.

d. Hide the Getting Started tabs.

9. What are two practical techniques you can use to make sure that you are getting the most efficient use of your vSphere Web Client? (Choose two.)

a. Unpin both sides.

b. Log on to your vCenter with the desktop client to compare.

c. Log on to your vCenter as “root.”

d. Leverage the browser tabs.

10. Which of the following occur when you choose to acknowledge an alarm in your vCenter? (Choose two.)

a. Your credentials are recorded, and you have taken the responsibility to fix the issue.

b. The alarm is disabled.

c. The inventory object will show a green “Acknowledged” icon.

d. All actions associated with the alarm will be disabled.
