Table of Contents for
Data Science on the Google Cloud Platform, 2nd Edition
by Valliappa Lakshmanan
1. Making Better Decisions Based on Data
Many Similar Decisions
The Role of Data Scientists
Scrappy Environment
Full Stack Cloud Data Scientists
Collaboration
Target Audience for the Book
Best Practices
Simple to Complex Solutions
Cloud Computing
Serverless
A Probabilistic Decision
Probabilistic Approach
Probability Density Function
Cumulative Distribution Function
Data and Tools
Getting Started with the Code
Summary
2. Ingesting Data into the Cloud
Airline On-Time Performance Data
Knowability
Training–Serving Skew
Downloading Data
Hub and Spoke Architecture
Dataset Fields
Separation of Compute and Storage
Scaling Up
Scaling Out with Sharded Data
Scaling Out with Data in Situ
Ingesting Data
Reverse Engineering a Web Form
Dataset Download
Exploration and Cleanup
Uploading Data to Google Cloud Storage
Loading Data into Google BigQuery
Advantages of a Serverless Columnar Database
Staging on Cloud Storage
Access Control
Ingesting CSV Files
Partitioning
Scheduling Monthly Downloads
Ingesting in Python
Cloud Run
Securing Cloud Run
Deploying and Invoking Cloud Run
Scheduling Cloud Run
Summary
Code Break
3. Creating Compelling Dashboards
Explain Your Model with Dashboards
Why Build a Dashboard First?
Accuracy, Honesty, and Good Design
Loading Data into Cloud SQL
Create a Google Cloud SQL Instance
Create Table of Data
Interacting with the Database
Querying Using BigQuery
Schema Exploration
Using Preview
Using Table Explorer
Creating BigQuery View
Building Our First Model
Contingency Table
Threshold Optimization
Building a Dashboard
Getting Started with Data Studio
Creating Charts
Adding End-User Controls
Showing Proportions with a Pie Chart
Explaining a Contingency Table
Summary
4. Streaming Data: Publication and Ingest with Pub/Sub and Dataflow
Designing the Event Feed
Transformations Needed
Architecture
Getting Airport Information
Sharing Data
Time Correction
Apache Beam/Cloud Dataflow
Parsing Airports Data
Adding Time Zone Information
Converting Times to UTC
Correcting Dates
Creating Events
Reading and Writing to the Cloud
Running the Pipeline in the Cloud
Publishing an Event Stream to Cloud Pub/Sub
Speed-up Factor
Get Records to Publish
Iterating Through Records
Building a Batch of Events
Publishing a Batch of Events
Real-Time Stream Processing
Streaming in Dataflow
Windowing a Pipeline
Streaming Aggregation
Using Event Timestamps
Executing the Stream Processing
Analyzing Streaming Data in BigQuery
Real-Time Dashboard
Summary