Take a dive into data lakes 

“Data lakes” is the latest buzzword in the world of data storage, management, and analysis. Data Lakes For Dummies decodes and demystifies the concept and helps you get a straightforward answer to the question: “What exactly is a data lake and do I need one for my business?” Written for an audience of technology decision makers tasked with keeping up with the latest and greatest data options, this book provides the perfect introductory survey of these novel and growing features of the information landscape. It explains how they can help your business, what they can (and can’t) achieve, and what you need to do to create the lake that best suits your particular needs.

With a minimum of jargon, prolific tech author and business intelligence consultant Alan Simon explains how data lakes differ from other data storage paradigms. Once you’ve got the background picture, he maps out ways you can add a data lake to your business systems; migrate existing information and switch on the fresh data supply; clean up the product; and open channels to the best intelligence software for interpreting what you’ve stored.

  • Understand and build data lake architecture 
  • Store, clean, and synchronize new and existing data 
  • Compare the best data lake vendors 
  • Structure raw data and produce usable analytics  

Whatever your business, data lakes are going to form an ever more prominent part of the information universe that every business should have access to. Dive into this book to start exploring the deep competitive advantage they make possible, and make sure your business isn’t left standing on the shore.

Table of Contents

  1. Cover
  2. Title Page
  3. Copyright
  4. Introduction
    1. About This Book
    2. Foolish Assumptions
    3. Icons Used in This Book
    4. Beyond the Book
    5. Where to Go from Here
  5. Part 1: Getting Started with Data Lakes
    1. Chapter 1: Jumping into the Data Lake
    2. What Is a Data Lake?
    3. The Data Lake Olympics
    4. Data Lakes and Big Data
    5. The Data Lake Water Gets Murky
    6. Chapter 2: Planning Your Day (and the Next Decade) at the Data Lake
    7. Carpe Diem: Seizing the Day with Big Data
    8. Managing Equal Opportunity Data
    9. Building Today’s — and Tomorrow’s — Enterprise Analytical Data Environment
    10. Reducing Existing Stand-Alone Data Marts
    11. Eliminating Future Stand-Alone Data Marts
    12. Establishing a Migration Path for Your Data Warehouses
    13. Aligning Data with Decision Making
    14. Speedboats, Canoes, and Lake Cruises: Traversing the Variable-Speed Data Lake
    15. Managing Overall Analytical Costs
    16. Chapter 3: Break Out the Life Vests: Tackling Data Lake Challenges
    17. That’s Not a Data Lake, This Is a Data Lake!
    18. Exposing Data Lake Myths and Misconceptions
    19. Navigating Your Way through the Storm on the Data Lake
    20. Building the Data Lake of Dreams
    21. Performing Regular Data Lake Tune-ups — Or Else!
    22. Technology Marches Forward
  6. Part 2: Building the Docks, Avoiding the Rocks
    1. Chapter 4: Imprinting Your Data Lake on a Reference Architecture
    2. Playing Follow the Leader
    3. Guiding Principles of a Data Lake Reference Architecture
    4. A Reference Architecture for Your Data Lake Reference Architecture
    5. Incoming! Filling Your Data Lake
    6. Supporting the Fleet Sailing on Your Data Lake
    7. The Old Meets the New at the Data Lake
    8. Bringing Outside Water into Your Data Lake
    9. Playing at the Edge of the Lake
    10. Chapter 5: Anybody Hungry? Ingesting and Storing Raw Data in Your Bronze Zone
    11. Ingesting Data with the Best of Both Worlds
    12. Joining the Data Ingestion Fraternity
    13. Storing Data in Your Bronze Zone
    14. Just Passing Through: The Cross-Zone Express Lane
    15. Taking Inventory at the Data Lake
    16. Bringing Analytics to Your Bronze Zone
    17. Chapter 6: Your Data Lake’s Water Treatment Plant: The Silver Zone
    18. Funneling Data further into the Data Lake
    19. Bringing Master Data into Your Data Lake
    20. Impacting the Bronze Zone
    21. Getting Clever with Your Storage Options
    22. Working Hand-in-Hand with Your Gold Zone
    23. Chapter 7: Bottling Your Data Lake Water in the Gold Zone
    24. Laser-Focusing on the Purpose of the Gold Zone
    25. Looking Inside the Gold Zone
    26. Deciding What Data to Curate in Your Gold Zone
    27. Seeing What Happens When Your Curated Data Becomes Less Useful
    28. Chapter 8: Playing in the Sandbox
    29. Developing New Analytical Models in Your Sandbox
    30. Comparing Different Data Lake Architectural Options
    31. Experimenting and Playing Around with Data
    32. Chapter 9: Fishing in the Data Lake
    33. Starting with the Latest Guidebook
    34. Taking It Easy at the Data Lake
    35. Staying in Your Lane
    36. Doing a Little Bit of Exploring
    37. Putting on Your Gear and Diving Underwater
    38. Chapter 10: Rowing End-to-End across the Data Lake
    39. Keeping versus Discarding Data Components
    40. Getting Started with Your Data Lake
    41. Shifting Your Focus to Data Ingestion
    42. Finishing Up with the Sandbox
  7. Part 3: Evaporating the Data Lake into the Cloud
    1. Chapter 11: A Cloudy Day at the Data Lake
    2. Rushing to the Cloud
    3. Running through Some Cloud Computing Basics
    4. The Big Guys in the Cloud Computing Game
    5. Chapter 12: Building Data Lakes in Amazon Web Services
    6. The Elite Eight: Identifying the Essential Amazon Services
    7. Looking at the Rest of the Amazon Data Lake Lineup
    8. Building Data Pipelines in AWS
    9. Chapter 13: Building Data Lakes in Microsoft Azure
    10. Setting Up the Big Picture in Azure
    11. The Magnificent Seven, Azure Style
    12. Filling Out the Azure Data Lake Lineup
    13. Assembling the Building Blocks
  8. Part 4: Cleaning Up the Polluted Data Lake
    1. Chapter 14: Figuring Out If You Have a Data Swamp Instead of a Data Lake
    2. Designing Your Report Card and Grading System
    3. Looking at the Raw Data Lockbox
    4. Knowing What to Do When Your Data Lake Is Out of Order
    5. Too Fast, Too Slow, Just Right: Dealing with Data Lake Velocity and Latency
    6. Dividing the Work in Your Component Architecture
    7. Tallying Your Scores and Analyzing the Results
    8. Chapter 15: Defining Your Data Lake Remediation Strategy
    9. Setting Your Key Objectives
    10. Doing Your Gap Analysis
    11. Identifying Resolutions
    12. Establishing Timelines
    13. Defining Your Critical Success Factors
    14. Chapter 16: Refilling Your Data Lake
    15. The Three S’s: Setting the Stage for Success
    16. Refining and Enriching Existing Raw Data
    17. Making Better Use of Existing Refined Data
    18. Building New Pipelines with Newly Ingested Raw Data
  9. Part 5: Making Trips to the Data Lake a Tradition
    1. Chapter 17: Checking Your GPS: The Data Lake Road Map
    2. Getting an Overhead View of the Road to the Data Lake
    3. Assessing Your Current State of Data and Analytics
    4. Putting Together a Lofty Vision
    5. Building Your Data Lake Architecture
    6. Deciding on Your Kickoff Activities
    7. Expanding Your Data Lake
    8. Chapter 18: Booking Future Trips to the Data Lake
    9. Searching for the All-in-One Data Lake
    10. Spreading Artificial Intelligence Smarts throughout Your Data Lake
  10. Part 6: The Part of Tens
    1. Chapter 19: Top Ten Reasons to Invest in Building a Data Lake
    2. Supporting the Entire Analytics Continuum
    3. Bringing Order to Your Analytical Data throughout Your Enterprise
    4. Retiring Aging Data Marts
    5. Bringing Unfulfilled Analytics Ideas out of Dry Dock
    6. Laying a Foundation for Future Analytics
    7. Providing a Region for Experimentation
    8. Improving Your Master Data Efforts
    9. Opening Up New Business Possibilities
    10. Keeping Up with the Competition
    11. Getting Your Organization Ready for the Next Big Thing
    12. Chapter 20: Ten Places to Get Help for Your Data Lake
    13. Cloud Provider Professional Services
    14. Major Systems Integrators
    15. Smaller Systems Integrators
    16. Individual Consultants
    17. Training Your Internal Staff
    18. Industry Analysts
    19. Data Lake Bloggers
    20. Data Lake Groups and Forums
    21. Data-Oriented Associations
    22. Academic Resources
    23. Chapter 21: Ten Differences between a Data Warehouse and a Data Lake
    24. Types of Data Supported
    25. Data Volumes
    26. Different Internal Data Models
    27. Architecture and Topology
    28. ETL versus ELT
    29. Data Latency
    30. Analytical Uses
    31. Incorporating New Data Sources
    32. User Communities
    33. Hosting
  11. Index
  12. About the Author
  13. Connect with Dummies
  14. End User License Agreement