Fighting Churn with Data teaches developers and data scientists proven techniques for stopping churn before it happens. Packed with real-world use cases and examples, this book shows you how to convert raw data into measurable behavior metrics, calculate customer lifetime value, and improve churn forecasting with demographic data. By following Zuora Chief Data Scientist Carl Gold’s methods, you’ll reap the benefits of high customer retention.

Table of Contents

  1. Fighting Churn with Data
  2. Copyright
  3. brief contents
  4. contents
  5. front matter
    1. foreword
    2. preface
    3. acknowledgments
    4. about this book
    5. Who should read this book
    6. How this book is organized: A road map
    7. About the code
    8. liveBook discussion forum
    9. Other online resources
    10. about the author
    11. about the cover illustration
  6. Part 1. Building your arsenal
  7. 1 The world of churn
    1. 1.1 Why you are reading this book
    2. 1.1.1 The typical churn scenario
    3. 1.1.2 What this book is about
    4. 1.2 Fighting churn
    5. 1.2.1 Interventions that reduce churn
    6. 1.2.2 Why churn is hard to fight
    7. 1.2.3 Great customer metrics: Weapons in the fight against churn
    8. 1.3 Why this book is different
    9. 1.3.1 Practical and in-depth
    10. 1.3.2 Simulated case study
    11. 1.4 Products with recurring user interactions
    12. 1.4.1 Paid consumer products
    13. 1.4.2 Business-to-business services
    14. 1.4.3 Ad-supported media and apps
    15. 1.4.4 Consumer feed subscriptions
    16. 1.4.5 Freemium business models
    17. 1.4.6 In-app purchase models
    18. 1.5 Nonsubscription churn scenarios
    19. 1.5.1 Inactivity as churn
    20. 1.5.2 Free trial conversion
    21. 1.5.3 Upsell/down sell
    22. 1.5.4 Other yes/no (binary) customer predictions
    23. 1.5.5 Customer activity predictions
    24. 1.5.6 Use cases that are not like churn
    25. 1.6 Customer behavior data
    26. 1.6.1 Customer events in common product categories
    27. 1.6.2 The most important events
    28. 1.7 Case studies in fighting churn
    29. 1.7.1 Klipfolio
    30. 1.7.2 Broadly
    31. 1.7.3 Versature
    32. 1.7.4 Social network simulation
    33. 1.8 Case studies in great customer metrics
    34. 1.8.1 Utilization
    35. 1.8.2 Success rates
    36. 1.8.3 Unit cost
    37. Summary
  8. 2 Measuring churn
    1. 2.1 Definition of the churn rate
    2. 2.1.1 Calculating the churn rate and retention rate
    3. 2.1.2 The relationship between churn rate and retention rate
    4. 2.2 Subscription databases
    5. 2.3 Basic churn calculation: Net retention
    6. 2.3.1 Net retention calculation
    7. 2.3.2 SQL net retention calculation
    8. 2.3.3 Interpreting net retention
    9. 2.4 Standard account-based churn
    10. 2.4.1 Standard churn rate definition
    11. 2.4.2 Outer joins for churn calculation
    12. 2.4.3 Standard churn calculation with SQL
    13. 2.4.4 When to use the standard churn rate
    14. 2.5 Activity (event-based) churn for nonsubscription products
    15. 2.5.1 Defining an active account and churn from events
    16. 2.5.2 Activity churn calculations with SQL
    17. 2.6 Advanced churn: Monthly recurring revenue (MRR) churn
    18. 2.6.1 MRR churn definition and calculation
    19. 2.6.2 MRR churn calculation with SQL
    20. 2.6.3 MRR churn vs. account churn vs. net (retention) churn
    21. 2.7 Churn rate measurement conversion
    22. 2.7.1 Survivor analysis (advanced)
    23. 2.7.2 Churn rate conversions
    24. 2.7.3 Converting any churn measurement window in SQL
    25. 2.7.4 Picking the churn measurement window
    26. 2.7.5 Seasonality and churn rates
    27. Summary
  9. 3 Measuring customers
    1. 3.1 From events to metrics
    2. 3.2 Event data warehouse schema
    3. 3.3 Counting events in one time period
    4. 3.4 Details of metric period definitions
    5. 3.4.1 Weekly behavioral cycles
    6. 3.4.2 Timestamps for metric measurements
    7. 3.5 Making measurements at different points in time
    8. 3.5.1 Overlapping measurement windows
    9. 3.5.2 Timing metric measurements
    10. 3.5.3 Saving metric measurements
    11. 3.5.4 Saving metrics for the simulation examples
    12. 3.6 Measuring totals and averages of event properties
    13. 3.7 Metric quality assurance
    14. 3.7.1 Testing how metrics change over time
    15. 3.7.2 Metric quality assurance (QA) case studies
    16. 3.7.3 Checking how many accounts receive metrics
    17. 3.8 Event QA
    18. 3.8.1 Checking how events change over time
    19. 3.8.2 Checking events per account
    20. 3.9 Selecting the measurement period for behavioral measurements
    21. 3.10 Measuring account tenure
    22. 3.10.1 Account tenure definition
    23. 3.10.2 Recursive table expressions for account tenure
    24. 3.10.3 Account tenure SQL program
    25. 3.11 Measuring MRR and other subscription metrics
    26. 3.11.1 Calculating MRR as a metric
    27. 3.11.2 Subscriptions for specific amounts
    28. 3.11.3 Calculating subscription unit quantities as metrics
    29. 3.11.4 Calculating the billing period as a metric
    30. Summary
  10. 4 Observing renewal and churn
    1. 4.1 Introduction to datasets
    2. 4.2 How to observe customers
    3. 4.2.1 Observation lead time
    4. 4.2.2 Observing sequences of renewals and a churn
    5. 4.2.3 Overview of creating a dataset from subscriptions
    6. 4.3 Identifying active periods from subscriptions
    7. 4.3.1 Active periods
    8. 4.3.2 Schema for storing active periods
    9. 4.3.3 Finding active periods that are ongoing
    10. 4.3.4 Finding active periods ending in churn
    11. 4.4 Identifying active periods for nonsubscription products
    12. 4.4.1 Active period definition
    13. 4.4.2 Process for forming datasets from events
    14. 4.4.3 SQL for calculating active weeks
    15. 4.5 Picking observation dates
    16. 4.5.1 Balancing churn and nonchurn observations
    17. 4.5.2 Observation date-picking algorithm
    18. 4.5.3 Observation date SQL program
    19. 4.6 Exporting a churn dataset
    20. 4.6.1 Dataset creation SQL program
    21. 4.7 Exporting the current customers for segmentation
    22. 4.7.1 Selecting active accounts and metrics
    23. 4.7.2 Segmenting customers by their metrics
    24. Summary
  11. Part 2. Waging the war
  12. 5 Understanding churn and behavior with metrics
    1. 5.1 Metric cohort analysis
    2. 5.1.1 The idea behind cohort analysis
    3. 5.1.2 Cohort analysis with Python
    4. 5.1.3 Cohorts of product use
    5. 5.1.4 Cohorts of account tenure
    6. 5.1.5 Cohort analysis of billing period
    7. 5.1.6 Minimum cohort size
    8. 5.1.7 Significant and insignificant cohort differences
    9. 5.1.8 Metric cohorts with a majority of zero customer metrics
    10. 5.1.9 Causality: Are the metrics causing churn?
    11. 5.2 Summarizing customer behavior
    12. 5.2.1 Understanding the distribution of the metrics
    13. 5.2.2 Calculating dataset summary statistics in Python
    14. 5.2.3 Screening rare metrics
    15. 5.2.4 Involving the business in data quality assurance
    16. 5.3 Scoring metrics
    17. 5.3.1 The idea behind metric scores
    18. 5.3.2 The metric score algorithm
    19. 5.3.3 Calculating metric scores in Python
    20. 5.3.4 Cohort analysis with scored metrics
    21. 5.3.5 Cohort analysis of monthly recurring revenue
    22. 5.4 Removing unwanted or invalid observations
    23. 5.4.1 Removing nonpaying customers from churn analysis
    24. 5.4.2 Removing observations based on metric thresholds in Python
    25. 5.4.3 Removing zero measurements from rare metric analyses
    26. 5.4.4 Disengaging behaviors: Metrics associated with increasing churn
    27. 5.5 Segmenting customers by using cohort analysis
    28. 5.5.1 Segmenting process
    29. 5.5.2 Choosing segment criteria
    30. Summary
  13. 6 Relationships between customer behaviors
    1. 6.1 Correlation between behaviors
    2. 6.1.1 Correlation between pairs of metrics
    3. 6.1.2 Investigating correlations with Python
    4. 6.1.3 Understanding correlations between sets of metrics with correlation matrices
    5. 6.1.4 Case study correlation matrices
    6. 6.1.5 Calculating correlation matrices in Python
    7. 6.2 Averaging groups of behavioral metrics
    8. 6.2.1 Why you average correlated metric scores
    9. 6.2.2 Averaging scores with a matrix of weights (loading matrix)
    10. 6.2.3 Case study for loading matrices
    11. 6.2.4 Applying a loading matrix in Python
    12. 6.2.5 Churn cohort analysis on metric group average scores
    13. 6.3 Discovering groups of correlated metrics
    14. 6.3.1 Grouping metrics by clustering correlations
    15. 6.3.2 Clustering correlations in Python
    16. 6.3.3 Loading matrix weights that make the average of scores a score
    17. 6.3.4 Running the metric grouping and grouped cohort analysis listings
    18. 6.3.5 Picking the correlation threshold for clustering
    19. 6.4 Explaining correlated metric groups to businesspeople
    20. Summary
  14. 7 Segmenting customers with advanced metrics
    1. 7.1 Ratio metrics
    2. 7.1.1 When to use ratio metrics and why
    3. 7.1.2 How to calculate ratio metrics
    4. 7.1.3 Ratio metric case study examples
    5. 7.1.4 Additional ratio metrics for the simulated social network
    6. 7.2 Percentage of total metrics
    7. 7.2.1 Calculating percentage of total metrics
    8. 7.2.2 Percentage of total metric case study with two metrics
    9. 7.2.3 Percentage of total metrics case study with multiple metrics
    10. 7.3 Metrics that measure change
    11. 7.3.1 Measuring change in the level of activity
    12. 7.3.2 Scores for metrics with extreme outliers (fat tails)
    13. 7.3.3 Measuring the time since the last activity
    14. 7.4 Scaling metric time periods
    15. 7.4.1 Scaling longer metrics to shorter quoting periods
    16. 7.4.2 Estimating metrics for new accounts
    17. 7.5 User metrics
    18. 7.5.1 Measuring active users
    19. 7.5.2 Active user metrics
    20. 7.6 Which ratios to use
    21. 7.6.1 Why use ratios, and what else is there?
    22. 7.6.2 Which ratios to use?
    23. Summary
  15. Part 3. Special weapons and tactics
  16. 8 Forecasting churn
    1. 8.1 Forecasting churn with a model
    2. 8.1.1 Probability forecasts with a model
    3. 8.1.2 Engagement and retention probability
    4. 8.1.3 Engagement and customer behavior
    5. 8.1.4 An offset matches observed churn rates to the S curve
    6. 8.1.5 The logistic regression probability calculation
    7. 8.2 Reviewing data preparation
    8. 8.3 Fitting a churn model
    9. 8.3.1 Results of logistic regression
    10. 8.3.2 Logistic regression code
    11. 8.3.3 Explaining logistic regression results
    12. 8.3.4 Logistic regression case study
    13. 8.3.5 Calibration and historical churn probabilities
    14. 8.4 Forecasting churn probabilities
    15. 8.4.1 Preparing the current customer dataset for forecasting
    16. 8.4.2 Preparing the current customer data for segmenting
    17. 8.4.3 Forecasting with a saved model
    18. 8.4.4 Forecasting case studies
    19. 8.4.5 Forecast calibration and forecast drift
    20. 8.5 Pitfalls of churn forecasting
    21. 8.5.1 Correlated metrics
    22. 8.5.2 Outliers
    23. 8.6 Customer lifetime value
    24. 8.6.1 The meaning(s) of CLV
    25. 8.6.2 From churn to expected customer lifetime
    26. 8.6.3 CLV formulas
    27. Summary
  17. 9 Forecast accuracy and machine learning
    1. 9.1 Measuring the accuracy of churn forecasts
    2. 9.1.1 Why you don’t use the standard accuracy measurement for churn
    3. 9.1.2 Measuring churn forecast accuracy with the AUC
    4. 9.1.3 Measuring churn forecast accuracy with the lift
    5. 9.2 Historical accuracy simulation: Backtesting
    6. 9.2.1 What and why of backtesting
    7. 9.2.2 Backtesting code
    8. 9.2.3 Backtesting considerations and pitfalls
    9. 9.3 The regression control parameter
    10. 9.3.1 Controlling the strength and number of regression weights
    11. 9.3.2 Regression with the control parameter
    12. 9.4 Picking the regression parameter by testing (cross-validation)
    13. 9.4.1 Cross-validation
    14. 9.4.2 Cross-validation code
    15. 9.4.3 Regression cross-validation case studies
    16. 9.5 Forecasting churn risk with machine learning
    17. 9.5.1 The XGBoost learning model
    18. 9.5.2 XGBoost cross-validation
    19. 9.5.3 Comparison of XGBoost accuracy to regression
    20. 9.5.4 Comparison of advanced and basic metrics
    21. 9.6 Segmenting customers with machine learning forecasts
    22. Summary
  18. 10 Churn demographics and firmographics
    1. 10.1 Demographic and firmographic datasets
    2. 10.1.1 Types of demographic and firmographic data
    3. 10.1.2 Account data model for the social network simulation
    4. 10.1.3 Demographic dataset SQL
    5. 10.2 Churn cohorts with demographic and firmographic categories
    6. 10.2.1 Churn rate cohorts for demographic categories
    7. 10.2.2 Churn rate confidence intervals
    8. 10.2.3 Comparing demographic cohorts with confidence intervals
    9. 10.3 Grouping demographic categories
    10. 10.3.1 Representing groups with a mapping dictionary
    11. 10.3.2 Cohort analysis with grouped categories
    12. 10.3.3 Designing category groups
    13. 10.4 Churn analysis for date- and numeric-based demographics
    14. 10.5 Churn forecasting with demographic data
    15. 10.5.1 Converting text fields to dummy variables
    16. 10.5.2 Forecasting churn with categorical dummy variables alone
    17. 10.5.3 Combining dummy variables with numeric data
    18. 10.5.4 Forecasting churn with demographic and metrics combined
    19. 10.6 Segmenting current customers with demographic data
    20. Summary
  19. 11 Leading the fight against churn
    1. 11.1 Planning your own fight against churn
    2. 11.1.1 Data processing and analysis checklist
    3. 11.1.2 Communication to the business checklist
    4. 11.2 Running the book listings on your own data
    5. 11.2.1 Loading your data into this book’s data schema
    6. 11.2.2 Running the listings on your own data
    7. 11.3 Porting this book’s listings to different environments
    8. 11.3.1 Porting the SQL listings
    9. 11.3.2 Porting the Python listings
    10. 11.4 Learning more and keeping in touch
    11. 11.4.1 Author’s blog site and social media
    12. 11.4.2 Sources for churn benchmark information
    13. 11.4.3 Other sources of information about churn
    14. 11.4.4 Products that help with churn
    15. Summary
  20. index