Discover expert techniques for combining machine learning with the analytic capabilities of Elastic Stack and uncover actionable insights from your data

Key Features

  • Integrate machine learning with distributed search and analytics
  • Preprocess and analyze large volumes of search data effortlessly
  • Operationalize machine learning in a scalable, production-worthy way

Book Description

Elastic Stack, previously known as the ELK stack, is a log analysis solution that helps users ingest, process, and analyze search data effectively. With the addition of machine learning, a key commercial feature, the Elastic Stack makes this process even more efficient. This updated second edition of Machine Learning with the Elastic Stack provides a comprehensive overview of Elastic Stack's machine learning features for both time series analysis and classification, regression, and outlier detection.

The book starts by explaining machine learning concepts in an intuitive way. You'll then perform time series analysis on different types of data, such as log files, network flows, application metrics, and financial data. As you progress through the chapters, you'll deploy machine learning within the Elastic Stack for logging, security, and metrics. Finally, you'll discover how data frame analysis opens up a whole new set of use cases that machine learning can help you with.

By the end of this Elastic Stack book, you'll have hands-on machine learning and Elastic Stack experience, along with the knowledge you need to incorporate machine learning in your distributed search and data analysis platform.

What you will learn

  • Find out how to enable the ML commercial feature in the Elastic Stack
  • Understand how Elastic machine learning is used to detect different types of anomalies and make predictions
  • Apply effective anomaly detection to IT operations, security analytics, and other use cases
  • Utilize the results of Elastic ML in custom views, dashboards, and proactive alerting
  • Train and deploy supervised machine learning models for real-time inference
  • Discover various tips and tricks to get the most out of Elastic machine learning

Who this book is for

If you're a data professional looking to gain insights into Elasticsearch data without having to rely on a machine learning specialist or custom development, then this Elastic Stack machine learning book is for you. You'll also find this book useful if you want to integrate machine learning with your observability, security, and analytics applications. Working knowledge of the Elastic Stack is needed to get the most out of this book.

Table of Contents

  1. Machine Learning with the Elastic Stack
  2. Second Edition
  3. Contributors
  4. About the authors
  5. About the reviewers
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Reviews
  7. Section 1 – Getting Started with Machine Learning with Elastic Stack
  8. Chapter 1: Machine Learning for IT
    1. Overcoming the historical challenges in IT
    2. Dealing with the plethora of data
    3. The advent of automated anomaly detection
    4. Unsupervised versus supervised ML
    5. Using unsupervised ML for anomaly detection
    6. Defining unusual
    7. Learning what's normal
    8. Probability models
    9. Learning the models
    10. De-trending
    11. Scoring of unusualness
    12. The element of time
    13. Applying supervised ML to data frame analytics
    14. The process of supervised learning
    15. Summary
  9. Chapter 2: Enabling and Operationalization
    1. Technical requirements
    2. Enabling Elastic ML features
    3. Enabling ML on a self-managed cluster
    4. Enabling ML in the cloud – Elasticsearch Service
    5. Understanding operationalization
    6. ML nodes
    7. Jobs
    8. Bucketing data in a time series analysis
    9. Feeding data to Elastic ML
    10. The supporting indices
    11. Anomaly detection orchestration
    12. Anomaly detection model snapshots
    13. Summary
  10. Section 2 – Time Series Analysis – Anomaly Detection and Forecasting
  11. Chapter 3: Anomaly Detection
    1. Technical requirements
    2. Elastic ML job types
    3. Dissecting the detector
    4. The function
    5. The field
    6. The partition field
    7. The by field
    8. The over field
    9. The "formula"
    10. Exploring the count functions
    11. Other counting functions
    12. Detecting changes in metric values
    13. Metric functions
    14. Understanding the advanced detector functions
    15. rare
    16. Frequency rare
    17. Information content
    18. Geographic
    19. Time
    20. Splitting analysis along categorical features
    21. Setting the split field
    22. The difference between splitting using partition and by_field
    23. Understanding temporal versus population analysis
    24. Categorization analysis of unstructured messages
    25. Types of messages that are good candidates for categorization
    26. The process used by categorization
    27. Analyzing the categories
    28. Categorization job example
    29. When to avoid using categorization
    30. Managing Elastic ML via the API
    31. Summary
  12. Chapter 4: Forecasting
    1. Technical requirements
    2. Contrasting forecasting with prophesying
    3. Forecasting use cases
    4. Forecasting theory of operation
    5. Single time series forecasting
    6. Looking at forecast results
    7. Multiple time series forecasting
    8. Summary
  13. Chapter 5: Interpreting Results
    1. Technical requirements
    2. Viewing the Elastic ML results index
    3. Anomaly scores
    4. Bucket-level scoring
    5. Normalization
    6. Influencer-level scoring
    7. Influencers
    8. Record-level scoring
    9. Results index schema details
    10. Bucket results
    11. Record results
    12. Influencer results
    13. Multi-bucket anomalies
    14. Multi-bucket anomaly example
    15. Multi-bucket scoring
    16. Forecast results
    17. Querying for forecast results
    18. Results API
    19. Results API endpoints
    20. Getting the overall buckets API
    21. Getting the categories API
    22. Custom dashboards and Canvas workpads
    23. Dashboard "embeddables"
    24. Anomalies as annotations in TSVB
    25. Customizing Canvas workpads
    26. Summary
  14. Chapter 6: Alerting on ML Analysis
    1. Technical requirements
    2. Understanding alerting concepts
    3. Anomalies are not necessarily alerts
    4. In real-time alerting, timing matters
    5. Building alerts from the ML UI
    6. Defining sample anomaly detection jobs
    7. Creating alerts against the sample jobs
    8. Simulating some real-time anomalous behavior
    9. Receiving and reviewing the alerts
    10. Creating an alert with a watch
    11. Understanding the anatomy of the legacy default ML watch
    12. Custom watches can offer some unique functionality
    13. Summary
  15. Chapter 7: AIOps and Root Cause Analysis
    1. Technical requirements
    2. Demystifying the term "AIOps"
    3. Understanding the importance and limitations of KPIs
    4. Moving beyond KPIs
    5. Organizing data for better analysis
    6. Custom queries for anomaly detection datafeeds
    7. Data enrichment on ingest
    8. Leveraging the contextual information
    9. Analysis splits
    10. Statistical influencers
    11. Bringing it all together for RCA
    12. Outage background
    13. Correlation and shared influencers
    14. Summary
  16. Chapter 8: Anomaly Detection in Other Elastic Stack Apps
    1. Technical requirements
    2. Anomaly detection in Elastic APM
    3. Enabling anomaly detection for APM
    4. Viewing the anomaly detection job results in the APM UI
    5. Creating ML Jobs via the data recognizer
    6. Anomaly detection in the Logs app
    7. Log categories
    8. Log anomalies
    9. Anomaly detection in the Metrics app
    10. Anomaly detection in the Uptime app
    11. Anomaly detection in the Elastic Security app
    12. Prebuilt anomaly detection jobs
    13. Anomaly detection jobs as detection alerts
    14. Summary
  17. Section 3 – Data Frame Analysis
  18. Chapter 9: Introducing Data Frame Analytics
    1. Technical requirements
    2. Learning how to use transforms
    3. Why are transforms useful?
    4. The anatomy of a transform
    5. Using transforms to analyze e-commerce orders
    6. Exploring more advanced pivot and aggregation configurations
    7. Discovering the difference between batch and continuous transforms
    8. Analyzing social media feeds using continuous transforms
    9. Using Painless for advanced transform configurations
    10. Introducing Painless
    11. Working with Python and Elasticsearch
    12. A brief tour of the Python Elasticsearch clients
    13. Summary
    14. Further reading
  19. Chapter 10: Outlier Detection
    1. Technical requirements
    2. Discovering the four techniques used for outlier detection
    3. Understanding feature influence
    4. How does outlier detection differ from anomaly detection?
    5. Applying outlier detection in practice
    6. Evaluating outlier detection with the Evaluate API
    7. Hyperparameter tuning for outlier detection
    8. Summary
  20. Chapter 11: Classification Analysis
    1. Technical requirements
    2. Classification: from data to a trained model
    3. Feature engineering
    4. Evaluating the model
    5. Taking your first steps with classification
    6. Classification under the hood: gradient boosted decision trees
    7. Introduction to decision trees
    8. Gradient boosted decision trees
    9. Hyperparameters
    10. Interpreting results
    11. Summary
    12. Further reading
  21. Chapter 12: Regression
    1. Technical requirements
    2. Using regression analysis to predict house prices
    3. Using decision trees for regression
    4. Summary
    5. Further reading
  22. Chapter 13: Inference
    1. Technical requirements
    2. Examining, exporting, and importing your trained models with the Trained Models API
    3. A tour of the Trained Models API
    4. Exporting and importing trained models with the Trained Models API and Python
    5. Understanding inference processors and ingest pipelines
    6. Handling missing or corrupted data in ingest pipelines
    7. Using inference processor configuration options to gain more insight into your predictions
    8. Importing external models into Elasticsearch using eland
    9. Learning about supported external models in eland
    10. Training a scikit-learn DecisionTreeClassifier and importing it into Elasticsearch using eland
    11. Summary
  23. Appendix: Anomaly Detection Tips
    1. Technical requirements
    2. Understanding influencers in split versus non-split jobs
    3. Using one-sided functions to your advantage
    4. Ignoring time periods
    5. Ignoring an upcoming (known) window of time
    6. Ignoring an unexpected window of time, after the fact
    7. Using custom rules and filters to your advantage
    8. Creating custom rules
    9. Benefiting from custom rules for a "top-down" alerting philosophy
    10. Anomaly detection job throughput considerations
    11. Avoiding the over-engineering of a use case
    12. Using anomaly detection on runtime fields
    13. Summary
    14. Why subscribe?
  24. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Leave a review - let other readers know what you think