Learning Machine Learning - Spark MLlib and Spark ML

"Each of us, actually every animal, is a data scientist. We collect data from our sensors, and then we process the data to get abstract rules to perceive our environment and control our actions in that environment to minimize pain and/or maximize pleasure. We have memory to store those rules in our brains, and then we recall and use them when needed. Learning is lifelong; we forget rules when they no longer apply or revise them when the environment changes."

- Ethem Alpaydin, Machine Learning: The New AI

This chapter gives a conceptual introduction to statistical machine learning (ML) techniques for readers who were not exposed to such approaches in their standard statistical training. It also aims to take a newcomer from minimal knowledge of machine learning to being a knowledgeable practitioner in a few steps. We will focus on Spark's machine learning APIs, Spark MLlib and Spark ML, from both theoretical and practical angles. We will also work through examples covering feature extraction and transformation, dimensionality reduction, regression, and classification analysis. In a nutshell, this chapter covers the following topics:

  • Introduction to machine learning
  • Spark machine learning APIs
  • Feature extraction and transformation
  • Dimensionality reduction using PCA for regression
  • Binary and multiclass classification