Chapter 7. Going Further

In this chapter, we will cover the following recipes:

  • Using Spark Streaming to subscribe to a Twitter stream
  • Using Spark as an ETL tool (pulling data from Elasticsearch and publishing it to Kafka)
  • Using StreamingLogisticRegression to classify a Twitter stream using Kafka as a training stream
  • Using GraphX to analyze Twitter data
  • Watching other Scala libraries of interest

Introduction

So far, this book has focused a little on Breeze and a lot on Spark, specifically DataFrames and machine learning. However, there are many other libraries, in both Java and Scala, that can be leveraged when analyzing data from Scala. This chapter goes a little further into Spark's other components, namely Spark Streaming and GraphX. Note that each recipe in this chapter feeds into the next recipe.

Note

All the code related to this chapter can be downloaded from https://github.com/arunma/ScalaDataAnalysisCookbook/tree/master/chapter7-goingfurther.
