Google's rationale for monitoring is simple. It states that the four most important metrics to track, often called the four golden signals, are the following:
- Latency: The time required to serve a request
- Traffic: The number of requests being made
- Errors: The rate of failing requests
- Saturation: The amount of work that cannot be processed yet, which is usually queued
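
To make the four metrics concrete, here is a minimal sketch that derives them from a batch of request records collected over one observation window. The `Request` type, the `golden_signals` function, and the choice to count only 5xx responses as failures are assumptions made for illustration. They do not come from Google or from any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one served request."""
    duration_s: float  # time required to serve the request, in seconds
    status: int        # HTTP status code returned to the client


def golden_signals(
    requests: Sequence[Request], window_s: float, queued: int
) -> Dict[str, float]:
    """Summarize one observation window into the four metrics.

    Assumption: a request counts as failed when its status is 500 or above.
    """
    count = len(requests)
    failures = sum(1 for r in requests if r.status >= 500)
    total_duration = sum(r.duration_s for r in requests)
    return {
        # Latency: mean time required to serve a request
        "latency_s": total_duration / count if count else 0.0,
        # Traffic: requests per second over the window
        "traffic_rps": count / window_s,
        # Errors: fraction of requests that failed
        "error_rate": failures / count if count else 0.0,
        # Saturation: work waiting in the queue at the end of the window
        "saturation": float(queued),
    }


window = [
    Request(0.25, 200),
    Request(0.5, 200),
    Request(0.25, 500),
    Request(1.0, 200),
]
signals = golden_signals(window, window_s=2.0, queued=3)
print(signals)
```

The mean is used for latency only to keep the sketch short. In practice, percentiles are often preferred because a mean can hide a slow tail of requests.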