SLIs and SLOs – setting goals

Service-level indicators (SLIs) and service-level objectives (SLOs) are two concepts that were popularized by Google. They are defined in the SRE book
(https://landing.google.com/sre/sre-book/chapters/service-level-objectives/) and are an excellent way to measure actionable items within your computing system. These measurements normally follow Google's four golden signals:

  • Latency: The amount of time a request takes to complete (usually measured in milliseconds)
  • Traffic: The volume of traffic that your service is receiving (usually measured in requests per second)
  • Errors: The ratio of failed requests to total requests (usually expressed as a percentage)
  • Saturation: The measure of hardware saturation (usually measured by queued request counts)

These measurements can then be used to create one or more service-level agreements (SLAs). These are frequently delivered to customers who expect a specific level of service from your application.
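To make the relationship between an indicator and an objective concrete, here is a minimal sketch that computes an availability SLI from request counts and compares it with a target. The function names, the 99% target, and the treatment of zero traffic are assumptions made for illustration; they do not come from the SRE book.

```go
package main

import "fmt"

// availability returns the fraction of requests that succeeded.
// With no traffic it reports 1.0, an assumption made here so that an
// idle service is not counted as failing.
func availability(failed, total int) float64 {
	if total == 0 {
		return 1.0
	}
	return float64(total-failed) / float64(total)
}

// meetsSLO reports whether the measured SLI reaches the target objective.
func meetsSLO(failed, total int, target float64) bool {
	return availability(failed, total) >= target
}

func main() {
	const target = 0.99 // hypothetical SLO: 99% of requests succeed
	for _, failed := range []int{5, 20} {
		sli := availability(failed, 1000)
		fmt.Printf("failed=%d sli=%.3f meets SLO=%t\n",
			failed, sli, meetsSLO(failed, 1000, target))
	}
}
```

With 1,000 requests, 5 failures keep the service inside this objective, while 20 failures put it outside.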

We can use Prometheus to measure these metrics. Prometheus offers several metric types for counting things, including gauges, counters, and histograms. We will use all three of these types to measure the golden signals within our system.

To test our system, we'll use the hey load generator. This is a tool that is similar to ab, which we used in previous chapters, but it'll show our distribution a little better for this particular scenario. We can grab it by running the following command (with Go 1.17 or later, use go install github.com/rakyll/hey@latest instead, because go get no longer installs binaries):

go get -u github.com/rakyll/hey

We need to stand up our Prometheus service in order to read some of these values. If yours isn't still running from our previous example, we can run the following commands:

docker build -t slislo -f Dockerfile.promservice .
docker run -it --rm --name slislo -d -p 9090:9090 --net host slislo

This will get our Prometheus instance to stand up and measure requests. With it running, we can write the program that exposes the measurements:

  1. Our code starts by instantiating the main package and importing the necessary Prometheus packages:
package main

import (
    "math/rand"
    "net/http"
    "time"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)
  2. We then gather our saturation, requests, and latency numbers in our main function. We use a gauge for saturation, a counter for requests, and a histogram for latency:
func main() {
    // Saturation: a gauge, because the value can go up and down.
    saturation := prometheus.NewGauge(prometheus.GaugeOpts{
        Name: "saturation",
        Help: "A gauge of the saturation golden signal",
    })

    // Traffic and errors: a counter partitioned by status code and method.
    requests := prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "requests",
            Help: "A counter for the requests golden signal",
        },
        []string{"code", "method"},
    )

    // Latency: a histogram whose bucket upper bounds are in seconds.
    latency := prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Name:    "latency",
            Help:    "A histogram of latencies for the latency golden signal",
            Buckets: []float64{.025, .05, 0.1, 0.25, 0.5, 0.75},
        },
        []string{"handler", "method"},
    )
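Prometheus histogram buckets are cumulative: each bucket counts every observation less than or equal to its upper bound, and an implicit +Inf bucket counts all of them. The following standalone sketch illustrates that bookkeeping with the same bucket bounds. It does not use the Prometheus client, and the function name and sample latencies are hypothetical.

```go
package main

import "fmt"

// bucketCounts returns cumulative counts: counts[i] is the number of
// observations less than or equal to bounds[i]. The final extra element
// plays the role of the implicit +Inf bucket and equals the total count.
func bucketCounts(bounds, observations []float64) []int {
	counts := make([]int, len(bounds)+1)
	for _, v := range observations {
		for i, upper := range bounds {
			if v <= upper {
				counts[i]++
			}
		}
		counts[len(bounds)]++ // +Inf bucket
	}
	return counts
}

func main() {
	bounds := []float64{.025, .05, 0.1, 0.25, 0.5, 0.75}
	observed := []float64{0.01, 0.2, 0.6, 0.9} // hypothetical latencies in seconds
	fmt.Println(bucketCounts(bounds, observed)) // prints [1 1 1 2 2 3 4]
}
```

Note that the 0.9-second observation only lands in the +Inf bucket. Since our handler can take up to one second, any request slower than 0.75 seconds is visible only there.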
  3. We then create our goldenSignalHandler, which randomly generates a latency between 0 and 1 second. For added visibility of our signals, if the random number is divisible by 4, we return a 404 error status, and if it's divisible by 5, we return a 500 error. We then return a response and log that the request has been completed.
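The handler itself is not listed here, so the following is only a sketch of what the description above could look like. The statusFor helper, the use of whole milliseconds, the response text, and the decision to test divisibility by 4 before 5 are all assumptions. The main function below only prints the status chosen for a few sample values, so the sketch runs without starting a server.

```go
package main

import (
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"time"
)

// statusFor maps the random number to a status code. Checking 4 before 5
// is an assumption; the text does not say which wins when both divide n.
func statusFor(n int) int {
	switch {
	case n%4 == 0:
		return http.StatusNotFound
	case n%5 == 0:
		return http.StatusInternalServerError
	default:
		return http.StatusOK
	}
}

// goldenSignalHandler sleeps for a random 0-999 ms, then answers with the
// status chosen by statusFor and logs that the request completed.
var goldenSignalHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	n := rand.Intn(1000) // hypothetical choice: latency in whole milliseconds
	time.Sleep(time.Duration(n) * time.Millisecond)
	status := statusFor(n)
	w.WriteHeader(status)
	fmt.Fprintf(w, "request took %d ms\n", n)
	log.Printf("request completed: status=%d latency=%dms", status, n)
})

func main() {
	// Show the status the handler would choose for a few sample values.
	for _, n := range []int{8, 15, 20, 7} {
		fmt.Println(n, statusFor(n))
	}
}
```

Because goldenSignalHandler is an http.HandlerFunc, it satisfies http.Handler and can be passed directly to the promhttp instrumentation functions.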

Our goldenSignalChain ties these metrics together:

    goldenSignalChain := promhttp.InstrumentHandlerInFlight(saturation,
        promhttp.InstrumentHandlerDuration(
            latency.MustCurryWith(prometheus.Labels{"handler": "signals"}),
            promhttp.InstrumentHandlerCounter(requests, goldenSignalHandler),
        ),
    )
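Each instrumentation function in the chain wraps the handler passed to it, so a request passes through the in-flight gauge first, then the duration histogram, then the request counter, and finally the handler. To show the wrapping idea, here is a small standard-library sketch of an in-flight tracker. It is only conceptually similar to InstrumentHandlerInFlight and is not the library's implementation; the inFlight variable and the trackInFlight and sample functions are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// inFlight is a hypothetical stand-in for the saturation gauge.
var inFlight int64

// trackInFlight wraps next so the counter is raised while a request is
// being served and lowered again when the handler returns.
func trackInFlight(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(&inFlight, 1)
		defer atomic.AddInt64(&inFlight, -1)
		next.ServeHTTP(w, r)
	})
}

// sample serves one request through the wrapper and returns the counter
// value seen inside the handler and the value after it returned.
func sample() (during, after int64) {
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		during = atomic.LoadInt64(&inFlight)
	})
	req := httptest.NewRequest(http.MethodGet, "/signals", nil)
	trackInFlight(inner).ServeHTTP(httptest.NewRecorder(), req)
	return during, atomic.LoadInt64(&inFlight)
}

func main() {
	during, after := sample()
	fmt.Println(during, after) // prints 1 0
}
```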
  4. We then register all of our measurements (saturation, requests, and latency) with Prometheus, handle our HTTP requests, and start our HTTP server:
    prometheus.MustRegister(saturation, requests, latency)
    http.Handle("/metrics", promhttp.Handler())
    http.Handle("/signals", goldenSignalChain)
    http.ListenAndServe(":1234", nil)
}
  5. After we start our HTTP server by executing go run SLISLO.go, we can use hey to send requests to it. The output from our hey call is visible in the following screenshot. Remember that these are all random values and will be different if you execute this same test:

We can then take a look at our individual golden signals.
