Dimensionality reduction to improve performance

When we handle large volumes of data, certain problems arise naturally. How does one build a representative model from a set of hundreds of variables? How does one visualize data across countless dimensions? To address these problems, we turn to a family of techniques called dimensionality reduction. Dimensionality reduction is the process of converting a dataset with many variables into one with fewer dimensions while preserving most of the information. The aim is to reduce the number of dimensions in a dataset through either feature selection or feature extraction without significant loss of detail. Feature selection approaches try to find a subset of the original variables. Feature extraction reduces the dimensionality of the data by transforming it into a smaller set of new features derived from the original ones.
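The difference between the two approaches can be sketched in a few lines. The example below is only an illustration under stated assumptions: the dataset is synthetic, the helper names `select_features` and `extract_features` are hypothetical, variance ranking stands in for feature selection, and a principal component projection (computed with NumPy's SVD) stands in for feature extraction. Other criteria and transformations are equally valid choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 200 samples, 5 variables (all values are synthetic).
n_samples = 200
base = rng.normal(size=(n_samples, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 1],
    base[:, 0] + 0.05 * rng.normal(size=n_samples),        # near-copy of column 0
    0.01 * rng.normal(size=n_samples),                      # almost constant column
    2.0 * base[:, 1] + 0.05 * rng.normal(size=n_samples),   # rescaled near-copy of column 1
])


def select_features(X, k):
    """Feature selection: keep k of the ORIGINAL columns (here, the k with largest variance)."""
    variances = X.var(axis=0)
    keep = np.sort(np.argsort(variances)[::-1][:k])
    return X[:, keep], keep


def extract_features(X, k):
    """Feature extraction: build k NEW columns by projecting onto principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


X_sel, kept = select_features(X, 2)   # columns copied unchanged from X
X_ext = extract_features(X, 2)        # linear combinations of all columns of X

print("selected column indices:", kept)
print("shapes:", X_sel.shape, X_ext.shape)
```

Both results have two columns, but `X_sel` contains untouched original variables, whereas each column of `X_ext` mixes all five originals. Variance ranking alone does not detect redundancy, so the selected columns may still be correlated with each other; the extracted ones are uncorrelated by construction.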

Dimensionality reduction techniques are used to reduce two undesirable characteristics in data, namely noise (variation that does not reflect the underlying signal) and redundancy (highly correlated variables). These techniques help identify sets of uncorrelated predictive variables that can be used in subsequent analyses. Reducing a high-dimensional dataset, that is, a dataset with many predictive variables, to one with fewer dimensions makes the data easier to conceptualize. Beyond three dimensions, visualizing the data becomes difficult or impossible.
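Redundancy of this kind can be checked numerically before any reduction is applied. The following sketch assumes a synthetic three-variable dataset in which the third variable is nearly a copy of the first; the helper `redundant_pairs` and the 0.9 threshold are hypothetical choices made for illustration. It flags highly correlated pairs and then uses the eigenvalues of the covariance matrix to estimate how many dimensions carry most of the variance.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: x3 is almost a scaled copy of x1, so it adds little new information.
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.9 * x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])


def redundant_pairs(corr, threshold=0.9):
    """Return index pairs (i, j), i < j, whose absolute correlation exceeds threshold."""
    p = corr.shape[0]
    return [(i, j)
            for i in range(p)
            for j in range(i + 1, p)
            if abs(corr[i, j]) > threshold]


# Correlation matrix between variables (columns).
corr = np.corrcoef(X, rowvar=False)
pairs = redundant_pairs(corr)

# Eigenvalues of the covariance matrix, largest first, give the variance
# captured along each principal direction.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
explained = eigvals / eigvals.sum()
cumulative = np.cumsum(explained)

print("highly correlated pairs:", pairs)
print("cumulative explained variance:", np.round(cumulative, 3))
```

In this constructed case two directions account for nearly all of the variance, which suggests that the three variables can be reduced to two with little loss. On real data the threshold and the acceptable amount of retained variance are judgment calls.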
