Shark-ML example

The Shark-ML library also contains classes for the grid search algorithm. However, it does not have an implementation of the polynomial regression model, so we implemented the model ourselves for this example. First, we define the partitions of our dataset, which are five chunks of the same size. The following example shows how to use the createCVSameSize function for this purpose:

const unsigned int num_folds = 5;
CVFolds<RegressionDataset> folds =
    createCVSameSize<RealVector, RealVector>(train_data, num_folds);

As a result, we have the CVFolds<RegressionDataset> object, which contains our partition. Then, we initialize and configure our model. In the Shark-ML library, the model parameters are usually configured with trainer objects, which pass them to the model objects:

double regularization_factor = 0.0;
double polynomial_degree = 8;
int num_epochs = 300;
PolynomialModel<> model;
PolynomialRegression trainer(regularization_factor, polynomial_degree,
                             num_epochs);

Now that we have the trainer, model, and folds objects, we initialize the CrossValidationError object. As a performance metric, we used the AbsoluteLoss object, which implements the MAE metric:

AbsoluteLoss<> loss;
CrossValidationError<PolynomialModel<>, RealVector> cv_error(
    folds, &trainer, &model, &trainer, &loss);

The Shark-ML library provides the GridSearch class, which we can use to perform the grid search algorithm. An object of this class should be configured with the parameter ranges. Its configure() method takes three containers as arguments: the first specifies the minimum value of each parameter range, the second specifies the maximum value, and the third specifies the number of values in each range. Note that the order of the parameters in the range containers must match the order in which they are defined in the trainer class; this order can be found in the appropriate documentation or source code:

GridSearch grid;
std::vector<double> min(2);
std::vector<double> max(2);
std::vector<size_t> sections(2);
// regularization factor
min[0] = 0.0;
max[0] = 0.00001;
sections[0] = 6;
// polynomial degree
min[1] = 4;
max[1] = 10.0;
sections[1] = 6;
grid.configure(min, max, sections);

After initializing the grid, we can use the step() method to perform the grid search for the best hyperparameter values; a single call performs the entire search, so this method should be called only once. As in the previous example for the Shogun library, we then have to retrain our model with the parameters we found:

grid.step(cv_error);

trainer.setParameterVector(grid.solution().point);
trainer.train(model, train_data);