Creating alerts

As shown at the beginning of this chapter, alerts are a combination of symptom definitions. An alert can activate when any single symptom is true, or only when all of the symptoms are true; this depends entirely on how we configure it. For the purpose of this example, we will create an alert that includes symptoms concerning cluster compute resource health.

Perform the following steps to create an alert:

  1. Go to the vRealize Operations user interface on the master replica node by navigating to the following URL: https://<FQDN or IP of the Master Replica Node>/ui. Navigate to the Alerts section and select Alert Settings, and then Alert Definitions. The existing alert definitions are listed on the right-hand side; this is where we can tweak existing ones or create our own. Let's create our own so we can see what goes into an alert. Click on the green plus (+) icon.
  2. This opens the Alert Definition Workspace, which is very similar to the policy creation window. We first need to enter the Name and Description.
  3. Now we click on the Base Object Type bar. There is only one thing to select here: the base object type with which this alert will be associated. In this example we selected Cluster Compute Resources, but it could be Host, VM, or Datastore; any object from any solution can be chosen.
  4. Next, we click on the Alert Impact bar. In this section, we choose the following:
    • Impact: Which major badge this will affect: health, risk, or efficiency.
    • Criticality: This is the severity of the alert: info, warning, immediate, critical, or symptom-based. The last option derives the alert's criticality from the criticality of the symptoms that triggered it.
    • Alert Type and Subtype: There are quite a number of values in here to select. In this example, we've selected Virtualization/Hypervisor. This will give the alert a category it fits into.
    • Wait Cycle: The number of cycles for which the symptoms must remain triggered before the alert is triggered as well.
    • Cancel Cycle: This is the opposite of the Wait Cycle: the number of cycles the alert remains active after the conditions are no longer met.
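The interaction between Wait Cycle and Cancel Cycle can be sketched in a few lines of Python. This is only a simplified model of the behavior described above, not the product's actual implementation; the function name `alert_states` and the idea of counting consecutive cycles are assumptions made for illustration.

```python
def alert_states(symptom_history, wait_cycle=1, cancel_cycle=1):
    """Simplified model (an assumption, not vRealize Operations code).

    symptom_history: one boolean per collection cycle, True when the
    alert's symptom conditions are met in that cycle.
    Returns the alert state (active or not) after each cycle.
    """
    active = False
    true_streak = 0   # consecutive cycles with conditions met
    false_streak = 0  # consecutive cycles with conditions not met
    states = []
    for met in symptom_history:
        if met:
            true_streak += 1
            false_streak = 0
        else:
            false_streak += 1
            true_streak = 0
        if not active and true_streak >= wait_cycle:
            active = True    # conditions held long enough: trigger
        elif active and false_streak >= cancel_cycle:
            active = False   # conditions cleared long enough: cancel
        states.append(active)
    return states


history = [True, False, True, True, False, True, False, False]
print(alert_states(history, wait_cycle=2, cancel_cycle=2))
```

With both values set to 2 in this model, a single cycle of bad readings does not raise the alert, and a single cycle of good readings does not cancel it.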

The following screenshot illustrates how it could be configured:

  5. Click on the Add Symptom Definitions bar; this is where the magic happens. In this section, we can choose the level at which we want the symptoms to apply; we have selected Self. We could also select a child, which in this case is a host, or a parent, which would be a Datacenter.

We then need to choose one or more symptoms. From the drop-down list, select the Symptom Definition Type and then, below that, select a symptom and drag it over to the right. We can drag and drop symptoms on top of each other to group them into a symptom set, in which case we will see multiple numbered symptoms in one box, or we can drag and drop a symptom into the dotted outlined area to create a new symptom set.

From the following screenshot, we can see all the previous data we entered at the top of the right-hand side and the metric symptoms we dragged over below. Each symptom set has the drop-down option to say whether All or Any of the symptoms need to be true to make the symptom set true. At the top of the symptom sets, we have a Match value, which can be set so that All or Any of the symptom sets must be true to activate the alert:

  6. Recommendations are the last step. This is optional when creating an alert, but I would highly recommend it, as it can save time, especially when you're not there to fix the issue yourself. Click on the Add Recommendations bar, search for the recommendation we require, and drag and drop it from the left to the right.

We can also drag multiple recommendations over and give them a priority. If we know that a particular symptom could have multiple courses of action, we can list them all here so that whoever handles the alert has them at hand. This is shown in the following screenshot:

Click on the Save button and we have just created an alert. From this very simple example, we can see just how powerful the new alerting system is, as it's made up of multiple items. It allows us to become very specific with the alerts we want while providing guidance on how to fix the issue.

Now let's take a look at a few examples of using multiple conditions, symptoms, symptom sets, and matching options.

Consider the following example:

  • We have defined three conditions.
  • We have mapped three symptoms to those conditions.
  • We have two symptom sets. The first symptom set has only one symptom. The second symptom set has two symptoms.
  • We have a matching condition (Any/All) for each set.
  • We have a matching condition (Any/All) for the whole alert.

In general, those settings can be illustrated as follows:

The alert definition example we gave earlier can be illustrated as follows:

As you can see, we have selected the following matching criteria:

  • Symptom Set 1: Symptom set is true when Any of the symptoms is true. In this example, we have only one symptom: Cluster contention at Critical level.
  • Symptom Set 2: Symptom set is true when All of the symptoms are true. In this example, we have two symptoms: Cluster CPU contention at Critical level and Cluster Compute Resource Anomaly is critically high.
  • Alert: The alert matching is set to All. This means all symptom sets must return true in order for an alert to be generated.
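The matching logic above can be expressed as a small Python sketch. The class names `SymptomSet` and `AlertDefinition` and their methods are hypothetical and exist only to illustrate how Any/All matching combines at the symptom set level and at the alert level; they are not part of any vRealize Operations API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SymptomSet:
    symptoms: List[str]
    match: str  # "any" or "all"

    def is_true(self, active: Dict[str, bool]) -> bool:
        # Symptoms missing from the mapping are treated as not triggered.
        states = [active.get(name, False) for name in self.symptoms]
        return any(states) if self.match == "any" else all(states)


@dataclass
class AlertDefinition:
    symptom_sets: List[SymptomSet]
    match: str  # "any" or "all", applied across the symptom sets

    def is_triggered(self, active: Dict[str, bool]) -> bool:
        results = [s.is_true(active) for s in self.symptom_sets]
        return any(results) if self.match == "any" else all(results)


# The example configuration described above.
alert = AlertDefinition(
    symptom_sets=[
        SymptomSet(["Cluster contention at Critical level"], "any"),
        SymptomSet(
            [
                "Cluster CPU contention at Critical level",
                "Cluster Compute Resource Anomaly is critically high",
            ],
            "all",
        ),
    ],
    match="all",
)

# Only one of the two symptoms in Symptom Set 2 is triggered here.
current = {
    "Cluster contention at Critical level": True,
    "Cluster CPU contention at Critical level": True,
    "Cluster Compute Resource Anomaly is critically high": False,
}
print(alert.is_triggered(current))
```

In this sketch, changing `match` on the alert from `"all"` to `"any"` would let Symptom Set 1 alone trigger the alert.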

Now let's play out a few examples.
