Many enterprises are investing in a next-generation data lake, hoping to democratize data at scale to provide business insights and ultimately make automated, intelligent decisions. In this practical book, author Zhamak Dehghani reveals that, despite the time, money, and effort poured into them, data warehouses and data lakes fail when applied at the scale and speed of today's organizations. A distributed data mesh is a better choice.

Dehghani guides architects, technical leaders, and decision makers on their journey from monolithic big data architecture to a sociotechnical paradigm that draws from modern distributed architecture. A data mesh considers domains as a first-class concern, applies platform thinking to create self-serve data infrastructure, treats data as a product, and introduces a federated and computational model of data governance. This book shows you why and how.

  • Examine the current data landscape from the perspective of business and organizational needs, environmental challenges, and existing architectures
  • Analyze the landscape's underlying characteristics and failure modes
  • Get a complete introduction to data mesh principles and their constituents
  • Learn how to design a data mesh architecture
  • Move beyond a monolithic data lake to a distributed data mesh

Table of Contents

  1. I. Why Data Mesh?
  2. 1. The Inflection Point
    1. Great Expectations of Data
    2. The Great Divide of Data
    3. Operational Data
    4. Analytical Data
    5. Analytical and Operational Data Misintegration
    6. Scale, Encounter of a New Kind
    7. Beyond Order
    8. Approaching the Plateau of Return
    9. Recap
  3. 2. After The Inflection Point
    1. Embrace Change in a Complex, Volatile and Uncertain Business Environment
    2. Align Business, Tech and Now Analytical Data
    3. Close The Gap Between Analytical and Operational Data
    4. Localize Data Change to Business Domains
    5. Reduce Accidental Complexity of Pipelines and Copying Data
    6. Sustain Agility in the Face of Growth
    7. Remove Centralized and Monolithic Bottlenecks of the Lake or the Warehouse
    8. Reduce Coordination of Data Pipelines
    9. Reduce Coordination of Data Governance
    10. Enable Autonomy
    11. Increase the Ratio of Value from Data to Investment
    12. Abstract Technical Complexity with a Data Platform
    13. Embed Product Thinking Everywhere
    14. Go Beyond The Boundaries
    15. Recap
  4. 3. Before The Inflection Point
    1. Evolution of Analytical Data Architectures
    2. First Generation: Data Warehouse Architecture
    3. Second Generation: Data Lake Architecture
    4. Third Generation: Multimodal Cloud Architecture
    5. Characteristics of Analytical Data Architecture
    6. Monolithic
    7. Monolithic Architecture
    8. Monolithic Technology
    9. Monolithic Organization
    10. The Complicated Monolith
    11. Technically-Partitioned Architecture
    12. Activity-oriented Team Decomposition
    13. Recap
  5. II. What Is Data Mesh?
  6. 4. Principle of Domain Ownership
    1. Apply DDD’s Strategic Design to Data
    2. Domain Data Archetypes
    3. Source-aligned Domain Data
    4. Aggregate Domain Data
    5. Consumer-aligned Domain Data
    6. Transition to Domain Ownership
    7. Push Data Ownership Upstream
    8. Define Multiple Connected Models
    9. Embrace the Most Relevant Domain, and Don’t Expect the Single Source of Truth
    10. Hide the Data Pipelines as Domains’ Internal Implementation
    11. Recap
  7. Prospective Table of Contents (Subject to Change)
    1. Part I : Why Data Mesh?
    2. Chapter 1: The Inflection Point
    3. Chapter 2: After the Inflection Point
    4. Chapter 3: Before The Inflection Point
    5. Part II: What Is Data Mesh?
    6. Chapter 4: Principle of Domain Ownership
    7. Chapter 5: Principle of Data as a Product
    8. Chapter 6: Principle of Self-Serve Data Platform
    9. Chapter 7: Principle of Federated Computational Governance
    10. Part III: How to Design Data Mesh Architecture?
    11. Chapter 8: The Logical Architecture
    12. Chapter 9: Data Product Quantum Blueprint
    13. Chapter 10: The Multi-Plane Data Platform
    14. Part IV: How to Get Started With Data Mesh
    15. Chapter 11: Execution Model
    16. Chapter 12: Organization Design
    17. Chapter 13: What Comes Next