Book Description

Azure Storage, Streaming, and Batch Analytics shows you how to build state-of-the-art data solutions with tools from the Microsoft Azure platform. Read along to construct a cloud-native data warehouse, adding features like real-time data processing. Based on the Lambda architecture for big data, the design uses scalable services such as Event Hubs, Stream Analytics, and SQL databases. Along the way, you’ll cover most of the topics needed to earn an Azure data engineering certification.

Table of Contents

  1. Azure Storage, Streaming, and Batch Analytics
  2. Copyright
  3. dedication
  4. brief contents
  5. contents
  6. front matter
    1. preface
    2. acknowledgments
    3. about this book
      1. Who should read this book
      2. How this book is organized: a roadmap
      3. About the code
      4. Author online
    4. about the author
    5. about the cover illustration
  7. 1 What is data engineering?
    1. 1.1 What is data engineering?
    2. 1.2 What do data engineers do?
    3. 1.3 How does Microsoft define data engineering?
      1. 1.3.1 Data acquisition
      2. 1.3.2 Data storage
      3. 1.3.3 Data processing
      4. 1.3.4 Data queries
      5. 1.3.5 Orchestration
      6. 1.3.6 Data retrieval
    4. 1.4 What tools does Azure provide for data engineering?
    5. 1.5 Azure Data Engineers
    6. 1.6 Example application
    7. Summary
  8. 2 Building an analytics system in Azure
    1. 2.1 Fundamentals of Azure architecture
      1. 2.1.1 Azure subscriptions
      2. 2.1.2 Azure regions
      3. 2.1.3 Azure naming conventions
      4. 2.1.4 Resource groups
      5. 2.1.5 Finding resources
    2. 2.2 Lambda architecture
    3. 2.3 Azure cloud services
      1. 2.3.1 Azure analytics system architecture
      2. 2.3.2 Event Hubs
      3. 2.3.3 Stream Analytics
      4. 2.3.4 Data Lake Storage
      5. 2.3.5 Data Lake Analytics
      6. 2.3.6 SQL Database
      7. 2.3.7 Data Factory
      8. 2.3.8 Azure PowerShell
    4. 2.4 Walk-through of processing a series of event data records
      1. 2.4.1 Hot path
      2. 2.4.2 Cold path
      3. 2.4.3 Choosing abstract Azure services
    5. 2.5 Calculating cloud hosting costs
      1. 2.5.1 Event Hubs
      2. 2.5.2 Stream Analytics
      3. 2.5.3 Data Lake Storage
      4. 2.5.4 Data Lake Analytics
      5. 2.5.5 SQL Database
      6. 2.5.6 Data Factory
    6. Summary
  9. 3 General storage with Azure Storage accounts
    1. 3.1 Cloud storage services
      1. 3.1.1 Before you begin
    2. 3.2 Creating an Azure Storage account
      1. 3.2.1 Using Azure portal
      2. 3.2.2 Using Azure PowerShell
      3. 3.2.3 Azure Storage replication
    3. 3.3 Storage account services
      1. 3.3.1 Blob storage
      2. 3.3.2 Creating a Blobs service container
      3. 3.3.3 Blob tiering
      4. 3.3.4 Copy tools
      5. 3.3.5 Queues
      6. 3.3.6 Creating a queue
      7. 3.3.7 Azure Storage queue options
    4. 3.4 Storage account access
      1. 3.4.1 Blob container security
      2. 3.4.2 Designing Storage account access
    5. 3.5 Exercises
      1. 3.5.1 Exercise 1
      2. 3.5.2 Exercise 2
    6. Summary
  10. 4 Azure Data Lake Storage
    1. 4.1 Creating an Azure Data Lake store
      1. 4.1.1 Using Azure portal
      2. 4.1.2 Using Azure PowerShell
    2. 4.2 Data Lake store access
      1. 4.2.1 Access schemes
      2. 4.2.2 Configuring access
      3. 4.2.3 Hierarchy structure in the Data Lake store
    3. 4.3 Storage folder structure and data drift
      1. 4.3.1 Hierarchy structure revisited
      2. 4.3.2 Data drift
    4. 4.4 Copy tools for Data Lake stores
      1. 4.4.1 Data Explorer
      2. 4.4.2 ADLCopy tool
      3. 4.4.3 Azure Storage Explorer tool
    5. 4.5 Exercises
      1. 4.5.1 Exercise 1
      2. 4.5.2 Exercise 2
    6. Summary
  11. 5 Message handling with Event Hubs
    1. 5.1 How does an Event Hub work?
    2. 5.2 Collecting data in Azure
    3. 5.3 Creating an Event Hubs namespace
      1. 5.3.1 Using Azure PowerShell
      2. 5.3.2 Throughput units
      3. 5.3.3 Event Hub geo-disaster recovery
      4. 5.3.4 Failover with geo-disaster recovery
    4. 5.4 Creating an Event Hub
      1. 5.4.1 Using Azure portal
      2. 5.4.2 Using Azure PowerShell
      3. 5.4.3 Shared access policy
    5. 5.5 Event Hub partitions
      1. 5.5.1 Multiple consumers
      2. 5.5.2 Why specify a partition?
      3. 5.5.3 Why not specify a partition?
      4. 5.5.4 Event Hubs message journal
      5. 5.5.5 Partitions and throughput units
    6. 5.6 Configuring Capture
      1. 5.6.1 File name formats
      2. 5.6.2 Secure access for Capture
      3. 5.6.3 Enabling Capture
      4. 5.6.4 The importance of time
    7. 5.7 Securing access to Event Hubs
      1. 5.7.1 Shared Access Signature policies
      2. 5.7.2 Writing to Event Hubs
    8. 5.8 Exercises
      1. 5.8.1 Exercise 1
      2. 5.8.2 Exercise 2
      3. 5.8.3 Exercise 3
    9. Summary
  12. 6 Real-time queries with Azure Stream Analytics
    1. 6.1 Creating a Stream Analytics service
      1. 6.1.1 Elements of a Stream Analytics job
      2. 6.1.2 Creating an ASA job using the Azure portal
      3. 6.1.3 Creating an ASA job using Azure PowerShell
    2. 6.2 Configuring inputs and outputs
      1. 6.2.1 Event Hub job input
      2. 6.2.2 ASA job outputs
    3. 6.3 Creating a job query
      1. 6.3.1 Starting the ASA job
      2. 6.3.2 Failure to start
      3. 6.3.3 Output exceptions
    4. 6.4 Writing job queries
      1. 6.4.1 Window functions
      2. 6.4.2 Machine learning functions
    5. 6.5 Managing performance
      1. 6.5.1 Streaming units
      2. 6.5.2 Event ordering
    6. 6.6 Exercises
      1. 6.6.1 Exercise 1
      2. 6.6.2 Exercise 2
    7. Summary
  13. 7 Batch queries with Azure Data Lake Analytics
    1. 7.1 U-SQL language
      1. 7.1.1 Extractors
      2. 7.1.2 Outputters
      3. 7.1.3 File selectors
      4. 7.1.4 Expressions
    2. 7.2 U-SQL jobs
      1. 7.2.1 Selecting the biometric data files
      2. 7.2.2 Schema extraction
      3. 7.2.3 Aggregation
      4. 7.2.4 Writing files
    3. 7.3 Creating a Data Lake Analytics service
      1. 7.3.1 Using Azure portal
      2. 7.3.2 Using Azure PowerShell
    4. 7.4 Submitting jobs to ADLA
      1. 7.4.1 Using Azure portal
      2. 7.4.2 Using Azure PowerShell
    5. 7.5 Efficient U-SQL job executions
      1. 7.5.1 Monitoring a U-SQL job
      2. 7.5.2 Analytics units
      3. 7.5.3 Vertexes
      4. 7.5.4 Scaling the job execution
    6. 7.6 Using Blob Storage
      1. 7.6.1 Constructing Blob file selectors
      2. 7.6.2 Adding a new data source
      3. 7.6.3 Filtering rowsets
    7. 7.7 Exercises
      1. 7.7.1 Exercise 1
      2. 7.7.2 Exercise 2
    8. Summary
  14. 8 U-SQL for complex analytics
    1. 8.1 Data Lake Analytics Catalog
      1. 8.1.1 Simplifying U-SQL queries
      2. 8.1.2 Simplifying data access
      3. 8.1.3 Loading data for reuse
    2. 8.2 Window functions
    3. 8.3 Local C# functions
    4. 8.4 Exercises
      1. 8.4.1 Exercise 1
      2. 8.4.2 Exercise 2
    5. Summary
  15. 9 Integrating with Azure Data Lake Analytics
    1. 9.1 Processing unstructured data
      1. 9.1.1 Azure Cognitive Services
      2. 9.1.2 Managing assemblies in the Data Lake
      3. 9.1.3 Image data extraction with Advanced Analytics
    2. 9.2 Reading different file types
      1. 9.2.1 Adding custom libraries with a Catalog
      2. 9.2.2 Creating a catalog database
      3. 9.2.3 Building the U-SQL DataFormats solution
      4. 9.2.4 Code folders
      5. 9.2.5 Using custom assemblies
    3. 9.3 Connecting to remote sources
      1. 9.3.1 External databases
      2. 9.3.2 Credentials
      3. 9.3.3 Data Source
      4. 9.3.4 Tables and views
    4. 9.4 Exercises
      1. 9.4.1 Exercise 1
      2. 9.4.2 Exercise 2
    5. Summary
  16. 10 Service integration with Azure Data Factory
    1. 10.1 Creating an Azure Data Factory service
    2. 10.2 Secure authentication
      1. 10.2.1 Azure Active Directory integration
      2. 10.2.2 Azure Key Vault
    3. 10.3 Copying files with ADF
      1. 10.3.1 Creating a Files storage container
      2. 10.3.2 Adding secrets to AKV
      3. 10.3.3 Creating a Files storage linkedservice
      4. 10.3.4 Creating an ADLS linkedservice
      5. 10.3.5 Creating a pipeline and activity
      6. 10.3.6 Creating a scheduled trigger
    4. 10.4 Running an ADLA job
      1. 10.4.1 Creating an ADLA linkedservice
      2. 10.4.2 Creating a pipeline and activity
    5. 10.5 Exercises
      1. 10.5.1 Exercise 1
      2. 10.5.2 Exercise 2
    6. Summary
  17. 11 Managed SQL with Azure SQL Database
    1. 11.1 Creating an Azure SQL Database
      1. 11.1.1 Creating a SQL Server and SQLDB
    2. 11.2 Securing SQLDB
    3. 11.3 Availability and recovery
      1. 11.3.1 Restoring and moving SQLDB
      2. 11.3.2 Database safeguards
      3. 11.3.3 Creating alerts for SQLDB
    4. 11.4 Optimizing costs for SQLDB
      1. 11.4.1 Pricing structure
      2. 11.4.2 Scaling SQLDB
      3. 11.4.3 Serverless
      4. 11.4.4 Elastic Pools
    5. 11.5 Exercises
      1. 11.5.1 Exercise 1
      2. 11.5.2 Exercise 2
      3. 11.5.3 Exercise 3
      4. 11.5.4 Exercise 4
    6. Summary
  18. 12 Integrating Data Factory with SQL Database
    1. 12.1 Before you begin
    2. 12.2 Importing data with external data sources
      1. 12.2.1 Creating a database scoped credential
      2. 12.2.2 Creating an external data source
      3. 12.2.3 Creating an external table
      4. 12.2.4 Importing Blob files
    3. 12.3 Importing file data with ADF
      1. 12.3.1 Authenticating between ADF and SQLDB
      2. 12.3.2 Creating SQL Database linkedservice
      3. 12.3.3 Creating datasets
      4. 12.3.4 Creating a copy activity and pipeline
    4. 12.4 Exercises
      1. 12.4.1 Exercise 1
      2. 12.4.2 Exercise 2
      3. 12.4.3 Exercise 3
    5. Summary
  19. 13 Where to go next
    1. 13.1 Data catalog
      1. 13.1.1 Data Catalog as a service
      2. 13.1.2 Data locations
      3. 13.1.3 Data definitions
      4. 13.1.4 Data frequency
      5. 13.1.5 Business drivers
    2. 13.2 Version control and backups
      1. 13.2.1 Blob Storage
      2. 13.2.2 Data Lake Storage
      3. 13.2.3 Stream Analytics
      4. 13.2.4 Data Lake Analytics
      5. 13.2.5 Data Factory configuration files
      6. 13.2.6 SQL Database
    3. 13.3 Microsoft certifications
    4. 13.4 Signing off
    5. Summary
  20. appendix A. Setting up Azure services through PowerShell
    1. A.1 Setting up Azure PowerShell
    2. A.2 Creating a subscription
    3. A.3 Azure naming conventions
    4. A.4 Setting up common Azure resources using PowerShell
      1. A.4.1 Creating a new resource group
      2. A.4.2 Creating a new Azure Active Directory user
      3. A.4.3 Creating a new Azure Active Directory group
    5. A.5 Setting up Azure services using PowerShell
      1. A.5.1 Creating a new Storage account
      2. A.5.2 Creating a new Data Lake store
      3. A.5.3 Creating a new Event Hub
      4. A.5.4 Creating a new Stream Analytics job
      5. A.5.5 Creating a new Data Lake Analytics account
      6. A.5.6 Creating a new SQL Server and Database
      7. A.5.7 Creating a new Data Factory service
      8. A.5.8 Creating a new App registration
      9. A.5.9 Creating a new key vault
      10. A.5.10 Creating a new SQL Server and Database with lookup data
  21. appendix B. Configuring the Jonestown Sluggers analytics system
    1. B.1 Solution design
      1. B.1.1 Hot path
      2. B.1.2 Cold path
    2. B.2 Naming convention
    3. B.3 Creation script
    4. B.4 Configure Azure services using PowerShell
      1. B.4.1 Stream Analytics Managed Identity
      2. B.4.2 Data Lake store
      3. B.4.3 Stream Analytics job configuration
      4. B.4.4 SQL Database
      5. B.4.5 Data Factory
    5. B.5 Load event data
    6. B.6 Output of batch and stream processing
    7. B.7 Removing services
  22. index