R squared

The R squared metric, also known as the coefficient of determination, measures how well the independent variables (the features from the training set) explain the variability of the dependent variable (the target values that the model predicts). Values close to 1 tell us that the model explains the data well, while lower values tell us that the model leaves much of the variability unexplained, that is, it makes many errors. This is given by the following equations:

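$$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$$

$$SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

$$SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

Here, $n$ is the number of predictions and ground truth items, $y_i$ is the ground truth value for the $i$th item, and $\hat{y}_i$ is the prediction value for the $i$th item.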
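
As a quick sanity check of the definition, consider a small made-up example (the numbers are purely illustrative and not taken from any dataset). Here $SS_{tot}$ denotes the sum of squared deviations of the ground truth values from their mean, and $SS_{res}$ denotes the sum of squared differences between the ground truth values and the predictions. Take the ground truth values $y = (1, 2, 3, 4)$, whose mean is $2.5$, and the predictions $\hat{y} = (1.1, 1.9, 3.2, 3.8)$:

$$SS_{tot} = (1 - 2.5)^2 + (2 - 2.5)^2 + (3 - 2.5)^2 + (4 - 2.5)^2 = 5$$

$$SS_{res} = (1 - 1.1)^2 + (2 - 1.9)^2 + (3 - 3.2)^2 + (4 - 3.8)^2 = 0.1$$

$$R^2 = 1 - \frac{0.1}{5} = 0.98$$

Dividing both sums by $n = 4$ turns $SS_{res}$ into the mean squared error ($0.025$) and $SS_{tot}$ into the population variance of the ground truth values ($1.25$). Their ratio is unchanged, so $R^2$ can equally be written as $1 - MSE / Var(y)$, provided the variance is computed with a divisor of $n$ rather than $n - 1$.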

The main problem with this metric is that its value never decreases as we add new independent variables to the model, even when those variables carry no useful information. It may seem that the model begins to explain the data better, but this isn't necessarily true: the value grows simply because the model has more variables with which to fit the training items.
There are no out-of-the-box functions for calculating this metric in the libraries we use. However, it is simple to implement with linear algebra functions.

The following example shows how to calculate R squared with the Shark-ML library, given an already computed MSE value, mse:

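// mse is assumed to hold the mean squared error computed earlier for the same data
auto var = shark::variance(train_data.labels());  // variance of the labels, one entry per label dimension
auto r_squared = 1 - mse / var(0);                // use the first (and only) label dimension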