
Explore site reliability engineering practices and learn key Google Cloud Platform (GCP) services such as Cloud Source Repositories (CSR), Cloud Build, Container Registry, GKE, and Cloud Operations to implement DevOps

Key Features

  • Learn GCP services for version control, building code, creating artifacts, and deploying secured containerized applications
  • Explore Cloud Operations features such as Metrics Explorer, Logs Explorer, and debug logpoints
  • Prepare for the certification exam using practice questions and mock tests

Book Description

DevOps is a set of practices that help remove barriers between developers and system administrators, and is implemented by Google through site reliability engineering (SRE).

With the help of this book, you'll explore the evolution of DevOps and SRE, before delving into SRE technical practices such as SLAs, SLOs, SLIs, and error budgets, which are critical to building reliable software faster and balancing new feature deployment with system reliability. You'll then explore SRE cultural practices such as incident management and being on-call, and learn the building blocks to form SRE teams. The second part of the book focuses on Google Cloud services to implement DevOps via continuous integration and continuous delivery (CI/CD). You'll learn how to add source code via Cloud Source Repositories, build code to create deployment artifacts via Cloud Build, and push those artifacts to Container Registry. Moving on, you'll understand the need for container orchestration via Kubernetes, comprehend Kubernetes essentials, apply them via Google Kubernetes Engine (GKE), and secure the GKE cluster. Finally, you'll explore Cloud Operations to monitor, alert, debug, trace, and profile deployed applications.

By the end of this SRE book, and with the help of mock tests, you'll be well versed in the key concepts needed to earn the Professional Cloud DevOps Engineer certification.

What you will learn

  • Categorize user journeys and explore different ways to measure SLIs
  • Explore the four golden signals for monitoring a user-facing system
  • Understand psychological safety along with other SRE cultural practices
  • Create containers with build triggers and manual invocations
  • Delve into Kubernetes workloads and potential deployment strategies
  • Secure GKE clusters via private clusters, Binary Authorization, and shielded GKE nodes
  • Get to grips with monitoring, Metrics Explorer, uptime checks, and alerting
  • Discover how logs are ingested via the Cloud Logging API

Who this book is for

This book is for cloud system administrators and network engineers interested in resolving cloud-based operational issues. IT professionals looking to enhance their careers in administering Google Cloud services and users who want to learn about applying SRE principles and implementing DevOps in GCP will also benefit from this book. Basic knowledge of cloud computing, GCP services, and CI/CD, along with hands-on experience with Unix/Linux infrastructure, is recommended. You'll also find this book useful if you're interested in achieving the Professional Cloud DevOps Engineer certification.

Downloading the example code for this book

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Table of Contents

  1. Google Cloud for DevOps Engineers
  2. Contributors
  3. About the author
  4. About the reviewers
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Reviews
  6. Section 1: Site Reliability Engineering – A Prescriptive Way to Implement DevOps
  7. Chapter 1: DevOps, SRE, and Google Cloud Services for CI/CD
    1. Understanding DevOps, its evolution, and life cycle
    2. Revisiting DevOps evolution
    3. DevOps life cycle
    4. Key pillars of DevOps
    5. SRE's evolution; technical and cultural practices
    6. The evolution of SRE
    7. Understanding SRE
    8. SRE's approach toward DevOps' key pillars
    9. Introducing SRE's key concepts
    10. SRE's technical practices
    11. SRE's cultural practices
    12. Cloud-native approach to implementing DevOps using Google Cloud
    13. Focus on microservices
    14. Cloud-native development
    15. Continuous integration in GCP
    16. Continuous delivery/deployment in GCP
    17. Continuous monitoring/operations on GCP
    18. Bringing it all together – building blocks for a CI/CD pipeline in GCP
    19. Summary
    20. Points to remember
    21. Further reading
    22. Practice test
    23. Answers
  8. Chapter 2: SRE Technical Practices – Deep Dive
    1. Defining SLAs
    2. Key jargon
    3. Blueprint for a well-defined SLA
    4. SLIs drive SLOs, which inform SLAs
    5. Defining reliability expectations via SLOs
    6. SLOs drive business decisions
    7. Setting SLOs – the guidelines
    8. Exploring SLIs
    9. Categorizing user journeys
    10. SLI equation
    11. Sources to measure SLIs
    12. SLI best practices (Google-recommended)
    13. Understanding error budgets
    14. Error budget policy and the need for executive buy-in
    15. Making a service reliable
    16. Summarizing error budgets
    17. Eliminating toil through automation
    18. Illustrating the impact of SLAs, SLOs, and error budgets relative to SLI
    19. Scenario 1 – New service features introduced; features are reliable; SLO is met
    20. Scenario 2 – New features introduced; features are not reliable; SLO is not met
    21. Summary
    22. Points to remember
    23. Further reading
    24. Practice test
    25. Answers
  9. Chapter 3: Understanding Monitoring and Alerting to Target Reliability
    1. Understanding monitoring
    2. Monitoring as a feedback loop
    3. Monitoring misconceptions to avoid
    4. Monitoring sources
    5. Monitoring strategies
    6. Monitoring types
    7. The golden signals
    8. Alerting
    9. Alerting strategy – key attributes
    10. Alerting strategy – potential approaches
    11. Handling service with low traffic
    12. Steps to establish an SLO alerting policy
    13. Alerting system – desirable characteristics
    14. Time series
    15. Time series structure
    16. Time series cardinality
    17. Time series data – metric types
    18. Summary
    19. Points to remember
    20. Further reading
    21. Practice test
    22. Answers
  10. Chapter 4: Building SRE Teams and Applying Cultural Practices
    1. Building SRE teams
    2. Staffing SRE engineers (SREs)
    3. SRE team implementations – procedure and strategy
    4. SRE engagement model
    5. Incident management
    6. Incident life cycle
    7. Elements of effective incident management
    8. Being on call
    9. Paging versus non-paging events
    10. Single-site versus multi-site production teams
    11. Recommended practices while being on call
    12. Psychological safety
    13. Factors to overcome in order to foster psychological safety
    14. Sharing vision and knowledge and fostering collaboration
    15. Unified vision
    16. Communication and collaboration
    17. Summary
    18. Points to remember
    19. Further reading
    20. Practice test
    21. Answers
  11. Section 2: Google Cloud Services to Implement DevOps via CI/CD
  12. Chapter 5: Managing Source Code Using Cloud Source Repositories
    1. Technical requirements
    2. Introducing the key features
    3. Creating a repository via Google Cloud Console
    4. Creating a repository via the CLI
    5. Adding files to a repository in CSR
    6. One-way sync from GitHub/Bitbucket to CSR
    7. Common operations in CSR
    8. Browsing repositories
    9. Performing a universal code search
    10. Detecting security keys
    11. Assigning access controls
    12. Hands-on lab – integrating with Cloud Functions
    13. Adding code to an existing repository through the Cloud Shell Editor
    14. Pushing code from the Cloud Shell Editor (local repository) into CSR
    15. Creating a cloud function and deploying code from the repository in CSR
    16. Summary
    17. Further reading
    18. Practice test
    19. Answers
  13. Chapter 6: Building Code Using Cloud Build, and Pushing to Container Registry
    1. Technical requirements
    2. Key terminology (prerequisites)
    3. Understanding the need for automation
    4. Building and creating container images – Cloud Build
    5. Cloud Build essentials
    6. Building code using Cloud Build
    7. Storing and viewing build logs
    8. Managing access controls
    9. Cloud Build best practices – optimizing builds
    10. Managing build artifacts – Container Registry
    11. Container Registry – key concepts
    12. Hands-on lab – building, creating, pushing, and deploying a container to Cloud Run using Cloud Build triggers
    13. Creating an empty repository in Source Repositories
    14. Creating a Cloud Build trigger
    15. Adding code and pushing it to the master branch
    16. Code walk-through
    17. Viewing the results
    18. Summary
    19. Points to remember
    20. Further reading
    21. Practice test
    22. Answers
  14. Chapter 7: Understanding Kubernetes Essentials to Deploy Containerized Applications
    1. Technical requirements
    2. Kubernetes – a quick introduction
    3. Container orchestration
    4. Kubernetes features
    5. Kubernetes cluster anatomy
    6. Master components – Kubernetes control plane
    7. Node components
    8. Using kubectl
    9. Kubernetes objects
    10. Pod
    11. Deployment
    12. StatefulSets
    13. DaemonSets
    14. Service
    15. Scheduling and interacting with Pods
    16. Summarizing master plane interactions on Pod creation
    17. Critical factors to consider while scheduling Pods
    18. Kubernetes deployment strategies
    19. Recreate strategy
    20. Rolling update strategy
    21. Blue/Green strategy
    22. Canary deployment
    23. Summary
    24. Points to remember
    25. Further reading
    26. Practice test
    27. Answers
  15. Chapter 8: Understanding GKE Essentials to Deploy Containerized Applications
    1. Technical requirements
    2. Google Kubernetes Engine (GKE) – introduction
    3. Creating a GKE cluster
    4. GKE cluster – deploying and exposing an application
    5. GKE Console
    6. GKE – core features
    7. GKE node pools
    8. GKE cluster configuration
    9. AutoScaling in GKE
    10. Networking in GKE
    11. Storage options for GKE
    12. Cloud Operations for GKE
    13. GKE Autopilot – hands-on lab
    14. Summary
    15. Points to remember
    16. Further reading
    17. Practice test
    18. Answers
  16. Chapter 9: Securing the Cluster Using GKE Security Constructs
    1. Technical requirements
    2. Essential security patterns in Kubernetes
    3. Authentication
    4. Authorization
    5. Control plane security
    6. Pod security
    7. Hardening cluster security in GKE
    8. GKE private clusters
    9. Container-optimized OS
    10. Shielded GKE nodes
    11. Network Policies – restricting traffic among pods
    12. Workload Identity
    13. Points to remember
    14. Further reading
    15. Practice test
    16. Answers
  17. Chapter 10: Exploring GCP Cloud Operations
    1. Cloud Monitoring
    2. Workspaces
    3. Dashboards
    4. Metrics explorer
    5. Uptime checks
    6. Alerting
    7. Monitoring agent
    8. Cloud Monitoring access controls
    9. Cloud Logging
    10. Audit Logs
    11. Logs ingestion, routing, and exporting
    12. Summarizing log characteristics across log buckets
    13. Logs Explorer UI
    14. Logs-based metrics
    15. Network-based log types
    16. Logging agent
    17. Cloud Debugger
    18. Setting up Cloud Debugger
    19. Using Cloud Debugger
    20. Access control for Cloud Debugger
    21. Cloud Trace
    22. Trace Overview
    23. Trace List
    24. Analysis Reports
    25. Cloud Profiler
    26. Access control for Cloud Profiler
    27. Binding SRE and Cloud Operations
    28. SLO monitoring
    29. Hands-on lab – tracking service reliability using SLO monitoring
    30. Summary
    31. Points to remember
    32. Further reading
    33. Practice test
    34. Answers
  18. Appendix: Getting Ready for Professional Cloud DevOps Engineer Certification
    1. Cloud Deployment Manager
    2. Cloud Tasks
    3. Spinnaker
  19. Mock Exam 1
    1. Test Duration: 2 hours
    2. Total Number of Questions: 50
    3. Answers
  20. Mock Exam 2
    1. Test Duration: 2 hours
    2. Total Number of Questions: 50
    3. Answers
    4. Why subscribe?
  21. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Leave a review - let other readers know what you think