A comprehensive guide to rolling out Datadog to monitor infrastructure and applications running in both cloud and datacenter environments

Key Features

  • Learn Datadog to proactively monitor your infrastructure and cloud services
  • Use Datadog as a platform for aggregating monitoring efforts in your organization
  • Leverage Datadog's alerting service to implement on-call and site reliability engineering (SRE) processes

Book Description

Datadog is a cloud monitoring and operational analytics tool that enables the monitoring of servers, virtual machines, containers, databases, third-party tools, and application services. IT and DevOps teams can use Datadog to monitor infrastructure and cloud services, and this book will show you how.

The book starts by describing basic monitoring concepts and the types of monitoring that are rolled out in a large-scale IT production engineering environment. Moving on, the book covers how standard monitoring features are implemented on the Datadog platform and how they can be deployed in a real-world production environment. As you advance, you'll discover how Datadog is integrated with popular software components that are used to build cloud platforms. The book also provides details on how to use monitoring standards such as Java Management Extensions (JMX) and StatsD to extend the Datadog platform. Finally, you'll get to grips with monitoring fundamentals, learn how proactive monitoring can be rolled out using Datadog, and find out how to extend and customize the Datadog platform.

By the end of this Datadog book, you will have gained the skills needed to monitor your cloud infrastructure and the software applications running on it using Datadog.

What you will learn

  • Understand monitoring fundamentals, including metrics, monitors, alerts, and thresholds
  • Implement core monitoring requirements using Datadog features
  • Explore Datadog's integration with cloud platforms and tools
  • Extend Datadog using custom scripting and standards such as JMX and StatsD
  • Discover how proactive monitoring can be rolled out using various Datadog features
  • Understand how Datadog can be used to monitor microservices in both Docker and Kubernetes environments
  • Get to grips with advanced Datadog features such as APM and Security Monitoring

Who this book is for

This book is for DevOps engineers, site reliability engineers (SREs), IT production engineers, software developers and architects, cloud engineers, system administrators, and anyone looking to monitor and visualize their infrastructure and applications with Datadog. Basic working knowledge of cloud and infrastructure is useful. Working experience with a Linux distribution and some scripting knowledge are required to take full advantage of the material provided in the book.

Table of Contents

  1. Datadog Cloud Monitoring Quick Start Guide
  2. Contributors
  3. About the author
  4. About the reviewer
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Reviews
  6. Section 1: Getting Started with Datadog
  7. Chapter 1: Introduction to Monitoring
    1. Technical requirements
    2. Why monitoring?
    3. Proactive monitoring
    4. Implementing a comprehensive monitoring solution
    5. Setting up alerts to warn of impending issues
    6. Having a feedback loop
    7. Monitoring use cases
    8. All in a data center
    9. Application in a data center with cloud monitoring
    10. All in the cloud
    11. Monitoring terminology and processes
    12. Host
    13. Agent
    14. Metrics
    15. Up/down status
    16. Check
    17. Threshold
    18. Monitor
    19. Alert
    20. Alert recipient
    21. Severity level
    22. Notification
    23. Downtime
    24. Event
    25. Incident
    26. On call
    27. Runbook
    28. Types of monitoring
    29. Infrastructure monitoring
    30. Platform monitoring
    31. Application monitoring
    32. Business monitoring
    33. Last-mile monitoring
    34. Log aggregation
    35. Meta-monitoring
    36. Noncore monitoring
    37. Overview of monitoring tools
    38. On-premises tools
    39. SaaS solutions
    40. Cloud-native tools
    41. Summary
  8. Chapter 2: Deploying the Datadog Agent
    1. Technical requirements
    2. Installing the Datadog Agent
    3. Runtime configurations
    4. Steps for installing the agent
    5. Agent components
    6. Agent as a container
    7. Deploying the agent – use cases
    8. All on the hosts
    9. Agent on the host monitoring containers
    10. Agent running as a container
    11. Advanced agent configuration
    12. Best practices
    13. Summary
  9. Chapter 3: The Datadog Dashboard
    1. Technical requirements
    2. Infrastructure List
    3. Events
    4. Metrics Explorer
    5. Dashboards
    6. The main Integrations menu
    7. Integrations
    8. APIs
    9. Agent
    10. Embeds
    11. Monitors
    12. Creating a new metric monitor
    13. Advanced features
    14. Summary
  10. Chapter 4: Account Management
    1. Technical requirements
    2. Managing users
    3. Granting custom access using roles
    4. Setting up organizations
    5. Implementing Single Sign-On
    6. Managing API and application keys
    7. Tracking usage
    8. Best practices
    9. Summary
  11. Chapter 5: Metrics, Events, and Tags
    1. Technical requirements
    2. Understanding metrics in Datadog
    3. Metric data
    4. Flush time interval
    5. Metric type
    6. Metric unit
    7. Query
    8. Tagging Datadog resources
    9. Defining tags
    10. Tagging methods
    11. Customizing host tag
    12. Tagging integration metrics
    13. Tags from microservices
    14. Filtering using tags
    15. Defining custom metrics
    16. Monitoring event streams
    17. Searching events
    18. Notifications for events
    19. Generating events
    20. Best practices
    21. Summary
  12. Chapter 6: Monitoring Infrastructure
    1. Technical requirements
    2. Inventorying the hosts
    3. CPU usage
    4. Load averages
    5. Available swap
    6. Disk latency
    7. Memory breakdown
    8. Disk usage
    9. Network traffic
    10. Listing containers
    11. Viewing system processes
    12. Monitoring serverless computing resources
    13. Best practices
    14. Summary
  13. Chapter 7: Monitors and Alerts
    1. Technical requirements
    2. Setting up monitors
    3. Managing monitors
    4. Distributing notifications
    5. Configuring downtime
    6. Best practices
    7. Summary
  14. Section 2: Extending Datadog
  15. Chapter 8: Integrating with Platform Components
    1. Technical requirements
    2. Configuring an integration
    3. Tagging an integration
    4. Reviewing supported integrations
    5. Implementing custom checks
    6. Best practices
    7. Summary
  16. Chapter 9: Using the Datadog REST API
    1. Technical requirements
    2. Scripting Datadog
    3. curl
    4. Python
    5. Reviewing Datadog APIs
    6. Public cloud integration
    7. Dashboards
    8. Downtime
    9. Events
    10. Hosts
    11. Metrics
    12. Monitors
    13. Host tags
    14. Programming with Datadog APIs
    15. The problem
    16. Posting metric data and an event
    17. Creating a monitor
    18. Querying the events stream
    19. Best practices
    20. Summary
  17. Chapter 10: Working with Monitoring Standards
    1. Technical requirements
    2. Monitoring networks using SNMP
    3. Consuming application metrics using JMX
    4. Cassandra as a Java application
    5. Using Cassandra integration
    6. Accessing the Cassandra JMX interface
    7. Working with the DogStatsD interface
    8. Publishing metrics
    9. Posting events
    10. Best practices
    11. Summary
  18. Chapter 11: Integrating with Datadog
    1. Technical requirements
    2. Using client libraries
    3. REST API-based client libraries
    4. DogStatsD client libraries
    5. Evaluating community projects
    6. dog-watcher by Brightcove
    7. kennel
    8. Managing monitors using Terraform
    9. Ansible modules and integration
    10. Developing integrations
    11. Prerequisites
    12. Setting up the tooling
    13. Creating an integration folder
    14. Running tests
    15. Building a configuration file
    16. Building a package
    17. Deploying an integration
    18. Best practices
    19. Summary
  19. Section 3: Advanced Monitoring
  20. Chapter 12: Monitoring Containers
    1. Technical requirements
    2. Collecting Docker logs
    3. Monitoring Kubernetes
    4. Installing the Datadog Agent
    5. Using Live Containers
    6. Viewing logs using Live Tail
    7. Searching container data
    8. Best practices
    9. Summary
  21. Chapter 13: Managing Logs Using Datadog
    1. Technical requirements
    2. Collecting logs
    3. Collecting logs from public cloud services
    4. Shipping logs from containers
    5. Shipping logs from hosts
    6. Filtering logs
    7. Scrubbing sensitive data from logs
    8. Processing logs
    9. Archiving logs
    10. Searching logs
    11. Best practices
    12. Summary
  22. Chapter 14: Miscellaneous Monitoring Topics
    1. Technical requirements
    2. Application Performance Monitoring (APM)
    3. Sending traces to Datadog
    4. Profiling an application
    5. Service Map
    6. Implementing observability
    7. Synthetic monitoring
    8. Security monitoring
    9. Sourcing the logs
    10. Defining security rules
    11. Monitoring security signals
    12. Best practices
    13. Summary
    14. Why subscribe?
  23. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Leave a review - let other readers know what you think