As mentioned earlier, consistency is invaluable in ensuring that your SLOs are understandable to everyone. This is where templatized SLO definition documents come into play.
SLO Dashboard: Link
Primary Author(s): Who owns this document and should be contacted about it?
Team: Who owns this service and should be contacted about it?
Collaborators: Anyone who contributed but isn’t a primary author?
Original Proposal Date: Date
Last Updated Date: Date
Approval Date: Date
Next Revisit Date: Date
Approver(s):
Approver | Status | Date |
---|---|---|
Name | Yes/No/Pending | YYYY-MM-DD |
Insert as many rows as you need. |
Briefly describe the service in question here. Keep things to about a paragraph. You can provide links to additional documentation about the service if you would like. Focus on the service from the viewpoint of its users (whether those are humans or other services).
Dashboard: Link to where people can get a visual representation of your performance.
Category | SLI | SLO |
---|---|---|
Part [a] | | |
Category [b] | Description [c] | SLO1 [d] |
| | Query [e] | SLO2 [f] |
Part | | |
Category | Description | SLO1 |
| | Query | SLO2 |
Insert or remove as many rows as you need. | | |

[a] The part or component of your service that is being addressed by a certain SLI. For example, this could be an API, a public-facing HTTP server, a data processing pipeline, or something else. Your service might have only one or many components that warrant an SLI.

[b] The type of SLI being measured. For example, this could be availability, latency, data correctness, data freshness, and so on.

[c] A human-readable description of what is being measured. For example, “The proportion of successful HTTP requests from external sources.”

[d] The SLO that is being informed by the SLI in question. For example, “95% of requests < 200 ms.”

[e] The actual query from your systems that delivers the SLI.

[f] A single SLI might drive multiple SLOs. For example, while “95% of requests < 200 ms” might be your first SLO, you might also want to ensure that “98% of requests < 400 ms.”
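To make the relationship between one SLI and several SLOs concrete, here is a minimal Python sketch that checks a single latency SLI against the two example targets from the notes above (“95% of requests < 200 ms” and “98% of requests < 400 ms”). The `LatencySLO` class, the function names, and the sample latencies are all hypothetical; in practice the measurements would come from the query recorded in the table, not from a hardcoded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencySLO:
    """One SLO target driven by a latency SLI (illustrative, not a standard type)."""
    target_ratio: float   # e.g. 0.95 means "95% of requests"
    threshold_ms: float   # e.g. 200 means "< 200 ms"


def latency_sli(latencies_ms, threshold_ms):
    """Proportion of requests that completed faster than threshold_ms."""
    if not latencies_ms:
        raise ValueError("no requests observed")
    good = sum(1 for ms in latencies_ms if ms < threshold_ms)
    return good / len(latencies_ms)


def evaluate(latencies_ms, slos):
    """Return a (slo, measured_ratio, met) tuple for each SLO."""
    results = []
    for slo in slos:
        measured = latency_sli(latencies_ms, slo.threshold_ms)
        results.append((slo, measured, measured >= slo.target_ratio))
    return results


# Made-up sample data: 100 requests with three distinct latencies.
sample = [120] * 96 + [350] * 3 + [900] * 1
slos = [LatencySLO(0.95, 200), LatencySLO(0.98, 400)]

for slo, measured, met in evaluate(sample, slos):
    print(f"{slo.target_ratio:.0%} of requests < {slo.threshold_ms} ms: "
          f"measured {measured:.1%}, met={met}")
```

Both targets are evaluated from the same set of measurements, which is why the template lists the query once and the SLOs in separate cells.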
Provide a short rationale for why these SLIs and SLOs were chosen. Try to keep this to a paragraph or so. You can link to additional documentation here if you would like.
Revision | Date | Details |
---|---|---|
Revision # | YYYY-MM-DD | Summary of the changes |
Insert as many more rows as you need. |
Describe here how often you plan to revisit the defined values in this document and send it back out for approval. When first establishing your SLIs and SLOs, this should be frequent (once a month is a good starting point), but as your values become more in line with reality, you can scale this back to every quarter or even every year.
Error budget | Threshold | Actions |
---|---|---|
SLO | X | Action to be taken |
Insert as many more rows as you need. |
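As a rough illustration of how the rows of an error budget policy table could be applied, the Python sketch below computes how much of an error budget has been consumed and looks up the actions whose thresholds have been reached. The thresholds, the action text, and the function names are hypothetical placeholders, not recommendations; substitute the values your approvers have agreed on.

```python
def error_budget_consumed(slo_target, good_events, total_events):
    """Fraction of the error budget spent in the window (can exceed 1.0)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_bad = (1 - slo_target) * total_events
    actual_bad = total_events - good_events
    return actual_bad / allowed_bad


# Hypothetical policy rows: (fraction of budget consumed, action to be taken).
POLICY = [
    (0.50, "Review recent changes at the next team meeting"),
    (0.75, "Prioritize reliability work over new features"),
    (1.00, "Pause non-essential releases until the budget recovers"),
]


def actions_for(consumed, policy=POLICY):
    """Return every action whose threshold has been reached."""
    return [action for threshold, action in policy if consumed >= threshold]


# Example: a 99% SLO with 10,000 requests, 80 of which failed.
consumed = error_budget_consumed(0.99, good_events=9920, total_events=10000)
for action in actions_for(consumed):
    print(action)
```

Keeping the policy as plain data mirrors the table: adding a row to the document corresponds to adding one tuple to the list.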