References
by Rajanarayanan Thottuvaikkatumana
Apache Spark 2 for Beginners
References
For more information, visit the following links:
http://lambda-architecture.net/
https://www.dre.vanderbilt.edu/~schmidt/PDF/Context-Object-Pattern.pdf