Spark ML pipelines

MLlib's goal is to make practical machine learning (ML) scalable and easy. To that end, Spark introduced the pipeline API, which simplifies the creation and tuning of practical ML pipelines. As discussed previously, extracting meaningful knowledge through feature engineering requires a pipeline built from a sequence of stages: data collection, preprocessing, feature extraction, feature selection, model fitting, validation, and model evaluation. For example, classifying text documents might involve segmenting and cleaning the text, extracting features, and training a classification model whose parameters are tuned with cross-validation. Most ML libraries are either not designed for distributed computation or do not provide native support for pipeline creation and tuning.
