Dimensionality reduction

Dimensionality reduction is the process of reducing the number of variables under consideration. It can be used to extract latent features from raw, noisy features, or to compress data while preserving its structure. Spark MLlib supports dimensionality reduction on the RowMatrix class. The two most commonly used algorithms for this purpose are principal component analysis (PCA) and singular value decomposition (SVD). To keep the discussion concrete, this section covers PCA only.
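To make the idea concrete before turning to Spark, here is a minimal sketch in plain Python of what PCA computes. It does not use MLlib. The data values and the helper names (center, covariance, first_principal_component, project) are assumptions chosen purely for illustration. The sketch centers a small two-column dataset, builds its covariance matrix, approximates the leading principal component by power iteration, and projects each row onto that component, so two variables are reduced to one.

```python
import math

def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (divides by n - 1) of already-centered rows."""
    n = len(centered)
    dims = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1)
             for j in range(dims)]
            for i in range(dims)]

def first_principal_component(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration.

    Assumes the starting vector is not orthogonal to the dominant
    eigenvector, which holds for the illustrative data below.
    """
    dims = len(cov)
    v = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance: no direction of variance to find
        v = [x / norm for x in w]
    return v

def project(centered, component):
    """Project each centered row onto a single unit-length component."""
    return [sum(a * b for a, b in zip(r, component)) for r in centered]

# Illustrative data (an assumption): the second column is roughly twice the first.
rows = [[2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8], [6.0, 12.0]]

centered = center(rows)
pc1 = first_principal_component(covariance(centered))
scores = project(centered, pc1)  # one value per row instead of two

print("first principal component:", pc1)
print("1-D representation:", scores)
```

Because the two columns are almost perfectly correlated, the single projected value per row retains nearly all of the variance in the original data. In Spark, the equivalent steps would typically go through RowMatrix (for example, its computePrincipalComponents method followed by multiply) rather than hand-written loops like these.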
