Model serialization with Shark-ML

The Shark-ML library has a unified API for serializing models of all kinds. Every model provides write and read methods for saving and loading model parameters, respectively. These methods take an instance of a boost::archive object as an input parameter.

Let's look at an example of model parameter serialization with the Shark-ML library. First, we generate training data for the linear regression model, as we did in the previous examples:

std::vector<RealVector> x_data(n);
std::vector<RealVector> y_data(n);

std::random_device rd;
std::mt19937 re(rd());
std::uniform_real_distribution<double> dist(-1.5, 1.5);

RealVector x_v(1);
RealVector y_v(1);
for (size_t i = 0; i < n; ++i) {
  x_v(0) = i;
  x_data[i] = x_v;

  y_v(0) = func(i) + dist(re);  // add noise to the target value
  y_data[i] = y_v;
}

Data<RealVector> x = createDataFromRange(x_data);
Data<RealVector> y = createDataFromRange(y_data);
RegressionDataset data(x, y);

Here, we created two vectors, x_data and y_data, which contain predictor and target value objects of the RealVector type. Then, we created the x and y objects of the Data<RealVector> type and combined them into the data object of the RegressionDataset type.

The following code shows how to train a linear model object with the dataset object we initialized previously:

LinearModel<> model;
LinearRegression trainer;
trainer.train(model, data);

Here, we trained the LinearModel object with the trainer of the LinearRegression type.

Now that we've trained the model, we can save its parameters in a file using the boost::archive::polymorphic_binary_oarchive object. The following code shows how to do this:

std::ofstream ofs("shark-linear.dat", std::ios::binary);
boost::archive::polymorphic_binary_oarchive oa(ofs);
model.write(oa);

The archive object, oa, was initialized with the ofs object of the std::ofstream type; an output stream is used because we are writing the model parameters out to a file. Note that the stream should be opened in binary mode when it is used with a binary archive.

The following code shows how to load saved model parameters:

std::ifstream ifs("shark-linear.dat", std::ios::binary);
boost::archive::polymorphic_binary_iarchive ia(ifs);
LinearModel<> model;
model.read(ia);

We loaded the model parameters with the read method, which takes a boost::archive::polymorphic_binary_iarchive object initialized with the std::ifstream object. Notice that we created a new LinearModel object rather than reusing the trained one.

Instead of using binary serialization, the Shark-ML library allows us to use the boost::archive::polymorphic_text_oarchive and boost::archive::polymorphic_text_iarchive types to serialize to an ASCII text file.
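As a sketch of this alternative, switching to text serialization only changes the archive types; the surrounding save and load code stays the same. The following assumes the same trained LinearModel<> object and the Shark-ML and Boost headers used above, and the "shark-linear.txt" file name is an illustrative choice:

```cpp
#include <fstream>
#include <boost/archive/polymorphic_text_iarchive.hpp>
#include <boost/archive/polymorphic_text_oarchive.hpp>
#include <shark/Models/LinearModel.h>

using namespace shark;

// Save model parameters to a human-readable ASCII file.
// Assumes `model` has already been trained, as in the example above.
void save_as_text(const LinearModel<>& model) {
  std::ofstream ofs("shark-linear.txt");  // hypothetical file name
  boost::archive::polymorphic_text_oarchive oa(ofs);
  model.write(oa);
}

// Load the parameters back into a fresh model object.
LinearModel<> load_from_text() {
  std::ifstream ifs("shark-linear.txt");
  boost::archive::polymorphic_text_iarchive ia(ifs);
  LinearModel<> model;
  model.read(ia);
  return model;
}
```

A text archive is useful when you want to inspect or diff the saved parameters, at the cost of larger files and slower I/O than the binary format.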

The following code shows how to generate new test values so that we can check the model:

std::vector<RealVector> new_x_data;
for (size_t i = 0; i < 5; ++i) {
  new_x_data.push_back({static_cast<double>(i)});
  std::cout << func(i) << std::endl;  // print the ground-truth value for comparison
}

The following code shows how to use the model for prediction purposes:

auto prediction = model(createDataFromRange(new_x_data));
std::cout << "Predictions: " << prediction << std::endl;

The prediction was made by invoking the model object's function call operator, operator(), on the new data.

In this section, we saw that the Shark-ML library provides an API for saving and loading model parameters, but it lacks functions for saving and loading the model architecture itself.

In the next section, we will look at the PyTorch library's serialization API.
