Normalizing with Shogun

The shogun::CRescaleFeatures class in the Shogun library implements min-max normalization (or rescaling). We can reuse an object of this class to scale different data with the same learned statistics. This is useful when we train a machine learning algorithm on data with rescaling applied and then use the algorithm for predictions on new data: to make the algorithm work as expected, we have to rescale the new data in the same way as we did during training, as follows:

#include <shogun/base/some.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/preprocessor/RescaleFeatures.h>
...
auto features = shogun::some<shogun::CDenseFeatures<DataType>>(inputs);
...
auto scaler = shogun::wrap(new shogun::CRescaleFeatures());
scaler->fit(features);       // learn statistics - the min and max values
scaler->transform(features); // apply scaling to the features

To learn the statistics, we use the fit() method, and to modify the features, we use the transform() method of the CRescaleFeatures class.
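Because the scaler object keeps the statistics learned by fit(), we can later apply the same transformation to unseen data without refitting. The following sketch assumes a hypothetical new_inputs matrix that has the same feature layout as the training data:

// new_inputs is a hypothetical SGMatrix<DataType> holding unseen samples,
// laid out the same way as the training data
auto new_features = shogun::some<shogun::CDenseFeatures<DataType>>(new_inputs);
// reuse the min and max values learned from the training data - do not call fit() again
scaler->transform(new_features);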

We can print the updated features with the display_vector() method of the SGVector class, as follows:

auto features_matrix = features->get_feature_matrix();
// n is the number of samples in the dataset
for (int i = 0; i < n; ++i) {
  std::cout << "Sample idx " << i << " ";
  features_matrix.get_column(i).display_vector();
}
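As a quick sanity check (a sketch, not part of the original example), we can verify that the transformed values lie in the [0, 1] range, which is what min-max normalization produces. The snippet below assumes the raw data pointer and size members of SGMatrix and requires the <algorithm> header:

#include <algorithm> // for std::minmax_element
...
// scan all stored feature values and report the overall min and max
auto total = features_matrix.num_rows * features_matrix.num_cols;
auto min_max = std::minmax_element(features_matrix.matrix, features_matrix.matrix + total);
std::cout << "min = " << *min_max.first << ", max = " << *min_max.second << std::endl;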

Some algorithms in the Shogun library can perform normalization of input data as an internal step of their implementation, so we should read the documentation to determine if manual normalization is required.
