This is an example runbook (mentioned in Chapter 3) for you to use in your own environment. This is a great starting point, and I encourage you to build on this and iterate over time. A runbook is only as good as the information in it, so if you find you need different sections, by all means, create them!
The Rails Demo App is a simple Rails blog app, showing off how a basic Rails app might look. The main components are a database-backed user management system and a post/comment system.
The codebase is located in the internal source code system under the name demo-app.
The service owner is John Doe.
In the event assistance is needed to resolve an issue with this service, the service owner has requested to be the next escalation point. See the company contact sheet for contact instructions.
No external dependencies
PostgreSQL database, running on an RDS instance located at rds-123.foo.com.
Rails 4.x
PostgreSQL (AWS RDS)
The app emits the following metrics:
User login (count)
User logout (count)
Post create (count)
Post delete (count)
Comment create (count)
Comment delete (count)
Post create time (timer)
Post delete time (timer)
User signup time (timer)
User login time (timer)
User logout time (timer)
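To make the difference between count and timer metrics concrete, here is a minimal sketch of how the app might record them. The runbook does not say which metrics library the app uses, so the DemoMetrics class and the metric names (post.create, post.create.time) are hypothetical stand-ins, and the recorder only keeps values in memory.

```ruby
# Hypothetical in-memory recorder (an assumption for illustration); a real app
# would forward these values to whatever metrics backend it actually uses.
class DemoMetrics
  attr_reader :counts, :timings

  def initialize
    @counts = Hash.new(0)
    @timings = Hash.new { |hash, key| hash[key] = [] }
  end

  # Count metric: record one occurrence of the named event.
  def increment(name)
    @counts[name] += 1
  end

  # Timer metric: run the block, record the elapsed milliseconds,
  # and return whatever the block returned.
  def time(name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    finished = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @timings[name] << (finished - started) * 1000.0
    result
  end
end

metrics = DemoMetrics.new

# Wrap the work being measured; the count is bumped once per post created.
post_id = metrics.time("post.create.time") do
  metrics.increment("post.create")
  42 # stand-in for the ID of the newly created post
end
```

A count answers "how often did this happen?", while a timer answers "how long did it take?", which is why post creation appears in both lists.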
The app emits the following logs:
User signin with user ID, status (success/fail), and IP address
Post create with user ID, status (success/fail), and IP address
Comment create with user ID, status (success/fail), and IP address
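A log entry carrying those three fields might be produced along these lines. This is a sketch only: the runbook does not specify the log format, so the JSON layout, the field names, and the auth_log_entry helper are assumptions, and the output goes to an in-memory StringIO rather than a real log file.

```ruby
require "json"
require "logger"
require "stringio"

# Hypothetical helper: build one log entry with the three fields the
# runbook lists (user ID, status, and IP address).
def auth_log_entry(event, user_id:, success:, ip:)
  { event: event, user_id: user_id, status: success ? "success" : "fail", ip: ip }
end

output = StringIO.new
logger = Logger.new(output)
# Emit only the message itself so that each line is a single JSON object.
logger.formatter = proc { |_severity, _time, _progname, message| "#{message}\n" }

# 203.0.113.5 is a documentation-only example address.
entry = auth_log_entry("user_signin", user_id: 17, success: false, ip: "203.0.113.5")
logger.info(JSON.generate(entry))
```

Keeping the status and IP address as separate fields makes it much easier to filter for failures or group them by source when investigating an alert.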
This alert fires when the rate of user signin failures goes above 5% in a 5-minute period. Potential causes are a bad deploy (check for recent deploys) or a brute-force attack (check the user signin log for signs of an attack).
This alert fires when the time it takes for a user to log in exceeds one second. Check for a recent bad deploy or an issue with Postgres performance.
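The signin failure condition can be sketched as a small calculation over recent signin events. This is an illustration, not the alerting system's actual rule: the event structure (a hash with :at and :status), the function names, and the choice to treat an empty window as a 0% failure rate are all assumptions.

```ruby
WINDOW_SECONDS = 5 * 60     # the 5-minute period from the alert description
FAILURE_THRESHOLD = 0.05    # 5%

# events: array of hashes with :at (a Time) and :status ("success" or "fail").
# Returns the fraction of signins in the window ending at `now` that failed.
def signin_failure_rate(events, now)
  recent = events.select { |e| e[:at] > now - WINDOW_SECONDS && e[:at] <= now }
  return 0.0 if recent.empty? # assumption: no traffic means nothing to alert on

  recent.count { |e| e[:status] == "fail" }.to_f / recent.size
end

# The alert fires only when the rate goes above the threshold, not at it.
def signin_alert?(events, now)
  signin_failure_rate(events, now) > FAILURE_THRESHOLD
end

now = Time.at(1_700_000_000)
events = Array.new(94) { |i| { at: now - i, status: "success" } } +
         Array.new(6)  { |i| { at: now - i, status: "fail" } } +
         [{ at: now - 600, status: "fail" }] # older than 5 minutes, so ignored

rate = signin_failure_rate(events, now)
firing = signin_alert?(events, now)
```

With 6 failures out of 100 signins in the window, the rate is 6% and the alert would fire; at exactly 5 failures out of 100 it would not.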
This alert fires when the time it takes for a user to create a post exceeds one second. Check for a recent bad deploy or an issue with Postgres performance.
This alert fires when the time it takes for a user to create a comment exceeds one second. Check for a recent bad deploy or an issue with Postgres performance.
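The three latency alerts share the same one-second threshold, so they can be expressed as one check applied to each timer. The sketch below assumes the alert looks at the slowest sample in the evaluation window; the runbook does not say whether the real rule uses a maximum, an average, or a percentile. The metric names are hypothetical as well.

```ruby
LATENCY_THRESHOLD_MS = 1000.0 # one second, per the alert descriptions

# samples_by_metric: metric name => array of durations in milliseconds.
# Returns the names of timers whose slowest sample exceeded the threshold.
# Timers with no samples are skipped rather than treated as slow.
def slow_metrics(samples_by_metric)
  samples_by_metric
    .select { |_name, samples| !samples.empty? && samples.max > LATENCY_THRESHOLD_MS }
    .keys
end

samples = {
  "user.login.time"     => [120.0, 1350.0], # one login took 1.35 seconds
  "post.create.time"    => [200.0, 640.0],
  "comment.create.time" => []               # no traffic in this window
}

slow = slow_metrics(samples)
```

Because the suggested checks are the same for all three alerts (a recent bad deploy or Postgres performance), seeing several timers go slow at once points more strongly at the shared database than at a single code path.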