
Discover how to build a petabyte-scale, cloud-based data warehouse that is burstable and built to scale for end-to-end analytical solutions

Key Features

  • Discover how to translate familiar data warehousing concepts into Redshift implementation
  • Use Redshift features to optimize development, production deployment, and operations processes
  • Find out how to use advanced features such as concurrency scaling, Redshift Spectrum, and federated queries

Book Description

Amazon Redshift is a fully managed, petabyte-scale AWS cloud data warehousing service. It enables you to build new data warehouse workloads on AWS and migrate on-premises traditional data warehousing platforms to Redshift.

This book on Amazon Redshift starts by focusing on Redshift architecture, showing you how to perform database administration tasks on Redshift. You'll then learn how to optimize your data warehouse to quickly execute complex analytic queries against very large datasets. Because of the massive amount of data involved in data warehousing, designing your database for analytical processing lets you take full advantage of Redshift's columnar architecture and managed services. As you advance, you'll discover how to deploy fully automated and highly scalable extract, transform, and load (ETL) processes, which help minimize the operational effort you have to invest in managing regular ETL pipelines and ensure the timely and accurate refreshing of your data warehouse. Finally, you'll gain a clear understanding of Redshift use cases, data ingestion, data management, security, and scaling so that you can build a scalable data warehouse platform.

By the end of this Redshift book, you'll be able to implement a Redshift-based data analytics solution and will understand best-practice solutions to commonly faced problems.

What you will learn

  • Use Amazon Redshift to build petabyte-scale data warehouses that are agile at scale
  • Integrate your data warehousing solution with a data lake using purpose-built features and services on AWS
  • Build end-to-end analytical solutions from data sourcing to consumption with the help of useful recipes
  • Leverage Redshift's comprehensive security capabilities to meet the most demanding business requirements
  • Focus on architectural insights and rationale when using analytical recipes
  • Discover best practices for working with big data to operate a fully managed solution

Who this book is for

This book is for anyone involved in architecting, implementing, and optimizing an Amazon Redshift data warehouse, such as data warehouse developers, data analysts, database administrators, data engineers, and data scientists. Basic knowledge of data warehousing, database systems, and cloud concepts, as well as familiarity with Redshift, will be beneficial.

Table of Contents

  1. Amazon Redshift Cookbook
  2. Foreword
  3. Contributors
  4. About the authors
  5. About the reviewers
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Share Your Thoughts
  7. Chapter 1: Getting Started with Amazon Redshift
    1. Technical requirements
    2. Creating an Amazon Redshift cluster using the AWS Console
    3. Getting ready
    4. How to do it…
    5. Creating an Amazon Redshift cluster using the AWS CLI
    6. Getting ready
    7. How to do it…
    8. How it works…
    9. Creating an Amazon Redshift cluster using an AWS CloudFormation template
    10. Getting ready
    11. How to do it…
    12. How it works…
    13. Connecting to an Amazon Redshift cluster using the Query Editor
    14. Getting ready
    15. How to do it…
    16. Connecting to an Amazon Redshift cluster using the SQL Workbench/J client
    17. Getting ready
    18. How to do it…
    19. Connecting to an Amazon Redshift cluster using a Jupyter Notebook
    20. Getting ready
    21. How to do it…
    22. Connecting to an Amazon Redshift cluster using Python
    23. Getting ready
    24. How to do it…
    25. Connecting to an Amazon Redshift cluster programmatically using Java
    26. Getting ready
    27. How to do it…
    28. Connecting to an Amazon Redshift cluster programmatically using .NET
    29. Getting ready
    30. How to do it…
    31. Connecting to an Amazon Redshift cluster using the command line
    32. Getting ready
    33. How to do it…
  8. Chapter 2: Data Management
    1. Technical requirements
    2. Managing a database in an Amazon Redshift cluster
    3. Getting ready
    4. How to do it…
    5. Managing a schema in a database
    6. Getting ready
    7. How to do it…
    8. Managing tables
    9. Getting ready
    10. How to do it…
    11. How it works…
    12. Managing views
    13. Getting ready
    14. How to do it…
    15. Managing materialized views
    16. Getting ready
    17. How to do it…
    18. How it works…
    19. Managing stored procedures
    20. Getting ready
    21. How to do it…
    22. How it works…
    23. Managing UDFs
    24. Getting ready
    25. How to do it…
    26. How it works…
  9. Chapter 3: Loading and Unloading Data
    1. Technical requirements
    2. Loading data from Amazon S3 using COPY
    3. Getting ready
    4. How to do it…
    5. How it works…
    6. Loading data from Amazon EMR
    7. Getting ready
    8. How to do it…
    9. Loading data from Amazon DynamoDB
    10. Getting ready
    11. How to do it…
    12. How it works…
    13. Loading data from remote hosts
    14. Getting ready
    15. How to do it…
    16. Updating and inserting data
    17. Getting ready
    18. How to do it…
    19. Unloading data to Amazon S3
    20. Getting ready
    21. How to do it…
  10. Chapter 4: Data Pipelines
    1. Technical requirements
    2. Ingesting data from transactional sources using AWS DMS
    3. Getting ready
    4. How to do it…
    5. How it works…
    6. Streaming data to Amazon Redshift via Amazon Kinesis Firehose
    7. Getting ready
    8. How to do it…
    9. How it works…
    10. Cataloging and ingesting data using AWS Glue
    11. How to do it…
    12. How it works…
  11. Chapter 5: Scalable Data Orchestration for Automation
    1. Technical requirements
    2. Scheduling queries using the Amazon Redshift query editor
    3. Getting ready
    4. How to do it…
    5. How it works…
    6. Event-driven applications using Amazon EventBridge and the Amazon Redshift Data API
    7. Getting ready
    8. How to do it…
    9. How it works…
    10. Event-driven applications using AWS Lambda
    11. Getting ready
    12. How to do it…
    13. How it works…
    14. Orchestrating using AWS Step Functions
    15. Getting ready
    16. How to do it…
    17. How it works…
    18. Orchestrating using Amazon MWAA
    19. Getting ready
    20. How to do it…
    21. How it works…
  12. Chapter 6: Data Authorization and Security
    1. Technical requirements
    2. Managing infrastructure security
    3. Getting ready
    4. How to do it
    5. Data encryption at rest
    6. Getting ready
    7. How to do it
    8. Data encryption in transit
    9. Getting ready
    10. How to do it
    11. Column-level security
    12. Getting ready
    13. How to do it
    14. How it works
    15. Loading and unloading encrypted data
    16. Getting ready
    17. How to do it
    18. Managing superusers
    19. Getting ready
    20. How to do it
    21. Managing users and groups
    22. Getting ready
    23. How to do it
    24. Managing federated authentication
    25. Getting ready
    26. How to do it
    27. How it works
    28. Using IAM authentication to generate database user credentials
    29. Getting ready
    30. How to do it
    31. Managing audit logs
    32. Getting ready
    33. How to do it
    34. How it works
    35. Monitoring Amazon Redshift
    36. Getting ready
    37. How to do it
    38. How it works
  13. Chapter 7: Performance Optimization
    1. Technical requirements
    2. Amazon Redshift Advisor
    3. Getting ready
    4. How to do it…
    5. How it works…
    6. Managing column compression
    7. Getting ready
    8. How to do it…
    9. How it works…
    10. Managing data distribution
    11. Getting ready
    12. How to do it…
    13. How it works…
    14. Managing sort keys
    15. Getting ready
    16. How to do it…
    17. How it works…
    18. Analyzing and improving queries
    19. Getting ready
    20. How to do it…
    21. How it works…
    22. Configuring workload management (WLM)
    23. Getting ready
    24. How to do it…
    25. How it works…
    26. Utilizing Concurrency Scaling
    27. Getting ready
    28. How to do it…
    29. How it works…
    30. Optimizing Spectrum queries
    31. Getting ready
    32. How to do it…
    33. How it works…
  14. Chapter 8: Cost Optimization
    1. Technical requirements
    2. AWS Trusted Advisor
    3. Getting ready
    4. How to do it…
    5. How it works…
    6. Amazon Redshift Reserved Instance pricing
    7. Getting ready
    8. How to do it…
    9. Configuring pause and resume for an Amazon Redshift cluster
    10. Getting ready
    11. How to do it…
    12. Scheduling pause and resume
    13. Getting ready
    14. How to do it…
    15. How it works…
    16. Configuring Elastic Resize for an Amazon Redshift cluster
    17. Getting ready
    18. How to do it…
    19. Scheduling Elastic Resizing
    20. Getting ready
    21. How to do it…
    22. How it works…
    23. Using cost controls to set actions for Redshift Spectrum
    24. Getting ready
    25. How to do it…
    26. Using cost controls to set actions for Concurrency Scaling
    27. Getting ready
    28. How to do it…
  15. Chapter 9: Lake House Architecture
    1. Technical requirements
    2. Building a data lake catalog using AWS Lake Formation
    3. Getting ready
    4. How to do it…
    5. How it works…
    6. Exporting a data lake from Amazon Redshift
    7. Getting ready
    8. How to do it…
    9. Extending a data warehouse using Amazon Redshift Spectrum
    10. Getting ready
    11. How to do it…
    12. Data sharing across multiple Amazon Redshift clusters
    13. Getting ready
    14. How to do it…
    15. How it works…
    16. Querying operational sources using Federated Query
    17. Getting ready
    18. How to do it…
  16. Chapter 10: Extending Redshift's Capabilities
    1. Technical requirements
    2. Managing Amazon Redshift ML
    3. Getting ready
    4. How to do it…
    5. How it works…
    6. Visualizing data using Amazon QuickSight
    7. Getting ready
    8. How to do it…
    9. How it works…
    10. AppFlow for ingesting SaaS data in Redshift
    11. Getting ready
    12. How to do it…
    13. How it works…
    14. Data wrangling using DataBrew
    15. Getting ready
    16. How to do it…
    17. How it works…
    18. Utilizing ElastiCache for sub-second latency
    19. Getting ready
    20. How to do it…
    21. How it works…
    22. Subscribing to third-party data using AWS Data Exchange
    23. Getting ready
    24. How to do it…
    25. How it works…
  17. Appendix
    1. Recipe 1 – Creating an IAM user
    2. Recipe 2 – Storing database credentials using Amazon Secrets Manager
    3. Recipe 3 – Creating an IAM role for an AWS service
    4. Recipe 4 – Attaching an IAM role to the Amazon Redshift cluster
    5. Why subscribe?
  18. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts