Network counters at the VM level

The following screenshot shows the counters vCenter provides for network at the VM layer. The counters are available at each individual vNIC level and at the VM level. Most VMs will only have one vNIC, so the data at the VM and vNIC levels will be identical. The vNICs are named using the 400x convention.

This means that the first vNIC is 4000, the second vNIC is 4001, and so on:

VM network counters

As usual, let's approach the counters, starting with contention. There is no latency counter, so you cannot track how long it takes for a packet to reach its destination. There are, however, counters that track dropped packets. Dropped packets need to be retransmitted and therefore increase network latency from the application point of view. vCenter does not provide a counter to track packet retransmits.

vRealize Operations provides a latency counter, which uses packet drops as an indicator. Using a percentage is certainly easier than dealing with the raw counters in vCenter. The packet drop percentage is based on the packets transmitted and received in that collection period. These two counters are not collected by default.
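To make the idea of a drop percentage concrete, here is a minimal sketch. The exact formula vRealize Operations uses is not given here, so the ratio below (dropped packets over packets transmitted and received in the period) is an assumption, and the function and parameter names are hypothetical:

```python
def packet_drop_percent(tx_packets, rx_packets, tx_dropped, rx_dropped):
    """Return dropped packets as a percentage of packets seen in one period.

    Assumption: the denominator is transmitted + received packets for the
    same collection period. This is an illustration, not the product's
    documented formula.
    """
    total_packets = tx_packets + rx_packets
    if total_packets == 0:
        # No traffic in the period, so there is nothing to drop.
        return 0.0
    return 100.0 * (tx_dropped + rx_dropped) / total_packets


# 10 drops out of 2,000 packets in the collection period.
print(packet_drop_percent(900, 1100, 3, 7))  # 0.5
```

A single percentage like this is easier to alert on than two separate raw drop counts.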

You certainly want to avoid having packet drops in your network. To monitor whether any VM is experiencing packet drop, you can build a super metric and develop a dashboard with a line chart and Top-N widgets. The super metric tracks the maximum packet drop of all VMs. You apply it at the appropriate level (for example, cluster, datacenter, vCenter) and plot a line chart. You should expect a flat line at 0 when the network is performing well.

The line chart, however, does not tell you which VM is experiencing packet drops, if any. This is where the Top-N chart comes in. You can set it to show, say, the top 25 VMs and make the time range long (for example, 1 month). This is available in an existing dashboard.
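The logic behind the super metric and the Top-N view can be sketched outside the product. This is only an illustration of the two calculations, not super metric syntax; the VM names and sample values are made up:

```python
import heapq

# Hypothetical packet drop percentages per VM, one value per collection period.
samples = {
    "vm-a": [0.0, 0.0, 0.0],
    "vm-b": [0.0, 1.5, 0.2],
    "vm-c": [0.0, 0.0, 0.4],
}


def max_drop_per_interval(samples):
    """Line chart data: the worst packet drop among all VMs in each period."""
    return [max(values) for values in zip(*samples.values())]


def top_n_vms(samples, n):
    """Top-N data: the n VMs with the highest peak drop over the time range."""
    peaks = {vm: max(values) for vm, values in samples.items()}
    return heapq.nlargest(n, peaks.items(), key=lambda item: item[1])


print(max_drop_per_interval(samples))  # [0.0, 1.5, 0.4]
print(top_n_vms(samples, 2))           # [('vm-b', 1.5), ('vm-c', 0.4)]
```

The first result tells you that something dropped packets in the second period; the second tells you which VM to look at.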

For Utilization, vCenter provides the data both in terms of the number of TCP/IP packets and network throughput. There is also a Usage counter, which is the sum of the Data Transmit Rate (TX) and Data Receive Rate (RX). The Usage counter cannot exceed the physical wire speed, even with full duplex. So, if the VM is sending 800 Mbps to another VM on another ESXi host, it can only receive 200 Mbps, since the total (TX + RX) cannot exceed 1000 Mbps.

The limit can certainly be exceeded if the communication is between two VMs on the same ESXi host, as the packets move at memory speed.

The following chart shows that the Usage counter is the sum of RX and TX. In vRealize Operations, use the Usage Rate counter. The counter 4000|Usage Rate will only give the data for the first vNIC; hence, it will be incomplete if you have a VM with two vNICs (for example, those with LAN-based backup or having access to multiple networks):
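The difference between the per-vNIC figure and the VM-level figure is easy to see with a small sketch. The data layout and the numbers below are hypothetical; only the vNIC keys (4000, 4001) and the TX + RX relationship come from the text:

```python
# Hypothetical TX and RX rates in KBPS, keyed by vNIC.
vnic_rates = {
    "4000": {"tx": 300.0, "rx": 200.0},  # first vNIC (production network)
    "4001": {"tx": 50.0, "rx": 25.0},    # second vNIC (for example, backup)
}


def vnic_usage(rates):
    """Usage for one vNIC is the sum of its transmit and receive rates."""
    return rates["tx"] + rates["rx"]


def vm_usage(vnic_rates):
    """Usage for the VM is the sum of the usage of all its vNICs."""
    return sum(vnic_usage(rates) for rates in vnic_rates.values())


print(vnic_usage(vnic_rates["4000"]))  # 500.0
print(vm_usage(vnic_rates))            # 575.0
```

Reading only the first vNIC would miss the 75 KBPS carried by the second one.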

VM network metrics in vRealize Operations

You will notice that the numbers provided by vCenter and vRealize Operations are given in KBPS (kilobytes per second), while your vmnic is rated in Gbps. Using binary multiples, 1 Gbps equals 131,072 KBPS (1,024 x 1,024 kilobits per second divided by 8 bits per byte), so this is the theoretical maximum for a 1 GE physical card. Because vCenter takes a 20-second average, you will not see this number most of the time, as it would mean that the throughput is sustained for the full 20 seconds. vRealize Operations will provide an even lower figure, as the number is averaged over 5 minutes. You can reduce this to 1 minute if you have allowed for increased resource utilization (vRealize Operations VM and network infrastructure).
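The conversion is worth writing down once, because the result depends on whether you treat a kilo as 1,024 or 1,000. The following sketch shows both; the function name is hypothetical, and the decimal variant is included only for comparison:

```python
def gbps_to_kbytes_per_sec(gbps, binary=True):
    """Convert a link speed in Gbps to kilobytes per second (KBPS).

    binary=True uses 1,024 as the multiplier (the convention behind the
    131,072 figure); binary=False uses 1,000.
    """
    kilo = 1024 if binary else 1000
    kilobits_per_sec = gbps * kilo * kilo
    return kilobits_per_sec / 8  # 8 bits per byte


print(gbps_to_kbytes_per_sec(1))                # 131072.0
print(gbps_to_kbytes_per_sec(1, binary=False))  # 125000.0
```

Either way, a sustained value anywhere near these figures means the VM is close to saturating a 1 GE link.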

There are duplicate counters, as shown in the next screenshot. There are two data transmit rates and two data receive rates. The following data is from a vCenter 5.5 Update 1 appliance. As you can tell, there is a regular spike every few minutes or so. The load is primarily due to two vRealize Operations instances (5.8.1 and 6.0) accessing vCenter:

VM network metrics in vCenter

You may want to know whether any given VM hits the network limit. Assuming you are on a 1 GE network, you can do this by creating a super metric that tracks the maximum Usage (KBPS) of all VMs, multiplying that value by 8, and then dividing it by 1000 * 1000 to convert to Gbps. If you see a number nearing 1, it means you have a VM hitting 1 GE, which is the limit that the VM sees. The actual limit is likely to be lower, since many VMs will be sharing the 1 GE vmnic.
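The arithmetic of that super metric can be sketched as follows. This is plain Python rather than super metric syntax, and the VM names and usage values are invented for illustration:

```python
def max_usage_gbps(vm_usage_kbps):
    """Take the highest Usage (KBPS) among all VMs and express it in Gbps.

    Follows the steps in the text: multiply by 8 (bytes to bits),
    then divide by 1000 * 1000 (kilobits to gigabits).
    """
    peak_kbytes = max(vm_usage_kbps.values(), default=0.0)
    return peak_kbytes * 8 / (1000 * 1000)


# Hypothetical Usage values in KBPS for one collection period.
usage = {"vm-a": 12_500.0, "vm-b": 118_750.0}
print(max_usage_gbps(usage))  # 0.95
```

A result of 0.95 means at least one VM is pushing close to what a 1 GE link can carry.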

Besides unicast traffic, which should form the bulk of your network, vSphere also provides information about broadcast and multicast traffic. If you are not expecting any of this traffic from certain VMs (or clusters) and want to be alerted if it does occur, you can create a group for the objects and then apply a super metric. The super metric would add the four counters that capture broadcast and multicast. You should expect a flat line as the total should be near 0.
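As a rough illustration of that super metric, the sketch below adds the four broadcast and multicast counters for each VM in a group and lists the VMs where the total is not flat at 0. The counter names are hypothetical placeholders, not the actual vCenter counter names:

```python
# Hypothetical names for the four broadcast and multicast packet counters.
NON_UNICAST_COUNTERS = (
    "broadcast_rx",
    "broadcast_tx",
    "multicast_rx",
    "multicast_tx",
)


def non_unicast_total(sample):
    """Add the four broadcast and multicast counters for one VM."""
    return sum(sample.get(name, 0) for name in NON_UNICAST_COUNTERS)


def vms_with_unexpected_traffic(group, threshold=0):
    """Return the VMs in the group whose total exceeds the threshold."""
    return sorted(
        vm for vm, sample in group.items()
        if non_unicast_total(sample) > threshold
    )


group = {
    "vm-a": {"broadcast_rx": 0, "broadcast_tx": 0, "multicast_rx": 0, "multicast_tx": 0},
    "vm-b": {"broadcast_rx": 12, "broadcast_tx": 0, "multicast_rx": 3, "multicast_tx": 0},
}
print(vms_with_unexpected_traffic(group))  # ['vm-b']
```

Since the text expects the total to be near 0 rather than exactly 0, you may prefer a small non-zero threshold to avoid noisy alerts.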

We have talked about Contention and Utilization being the main areas you should check. The following table summarizes what you should monitor for the network. Notice that vCenter does not provide the total packet drops. Also, the unit is in the number of packet drops, not in percent. For utilization, vCenter does not have the equivalent of Workload. I've kept the table cell blank for ease of comparison:

| Purpose | vCenter | vRealize Operations |
|---|---|---|
| Contention | Transmit packets dropped (number); Received packets dropped (number) | Packet Dropped (%) |
| Utilization | Usage (KBPS) | Usage Rate (KBPS) |
| Utilization | | Workload (%) |
