With logistic regression

The following example shows how to implement multi-class classification with Shark-ML and the logistic regression algorithm. The following code snippet declares a function for this task:

void LRClassification(const ClassificationDataset& train,
                      const ClassificationDataset& test,
                      unsigned int num_classes) {
  ...
}

The following code snippet shows how to configure an object for multi-class classification:

OneVersusOneClassifier<RealVector> ovo;
unsigned int pairs = num_classes * (num_classes - 1) / 2;
std::vector<LinearClassifier<RealVector> > lr(pairs);

for (std::size_t n = 0, cls1 = 1; cls1 < num_classes; cls1++) {
  using BinaryClassifierType =
      OneVersusOneClassifier<RealVector>::binary_classifier_type;
  std::vector<BinaryClassifierType*> ovo_classifiers;
  for (std::size_t cls2 = 0; cls2 < cls1; cls2++, n++) {
    // get the binary subproblem
    ClassificationDataset binary_cls_data =
        binarySubProblem(train, cls2, cls1);

    // train the binary machine
    LogisticRegression<RealVector> trainer;
    trainer.train(lr[n], binary_cls_data);
    ovo_classifiers.push_back(&lr[n]);
  }
  ovo.addClass(ovo_classifiers);
}

The previous code snippet configures an object for multi-class classification in the following steps:

  1. First, we defined the ovo object of the OneVersusOneClassifier class, which encapsulates the whole multi-class classifier.
  2. Then, we allocated one binary classifier for every pair of classes, as the one-versus-one strategy requires, and stored them in the lr container object of the std::vector<LinearClassifier<RealVector>> type. For num_classes classes, there are num_classes * (num_classes - 1) / 2 such pairs.
  3. We trained each binary classifier in the lr container with a trainer object of the LogisticRegression type.
  4. The training runs in nested loops over all pairs of classes. Notice that the lr container holds the instances of the classifiers, but the ovo object needs pointers to these instances to perform the final classification, so the ovo_classifiers object collects the pointers to the binary classifiers. Each binary classifier separates exactly two classes: cls2 is treated as the zero (negative) class and cls1 as the one (positive) class. Samples of all other classes are not part of that classifier's subproblem.
  5. After each pass of the inner loop, we used the ovo_classifiers object to populate the ovo object with the addClass method, so addClass is called once for every value of cls1.
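To make the indexing in the nested loops easier to follow, here is a minimal sketch in plain C++ (no Shark-ML types) that lists the class pairs in the same order in which the training loops visit them. The EnumeratePairs function is a hypothetical helper introduced only for this illustration; the position of a pair in the returned vector corresponds to the index n used for lr[n].

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Shark-ML): lists the
// (zero_class, one_class) pairs in the order used by the training loops,
// that is, cls1 = 1..k-1 in the outer loop and cls2 = 0..cls1-1 in the
// inner loop.
std::vector<std::pair<std::size_t, std::size_t>> EnumeratePairs(
    std::size_t num_classes) {
  std::vector<std::pair<std::size_t, std::size_t>> pairs;
  for (std::size_t cls1 = 1; cls1 < num_classes; ++cls1) {
    for (std::size_t cls2 = 0; cls2 < cls1; ++cls2) {
      // cls2 plays the role of the zero class, cls1 of the one class
      pairs.emplace_back(cls2, cls1);
    }
  }
  return pairs;
}
```

For four classes, this yields six pairs in the order (0,1), (0,2), (1,2), (0,3), (1,3), (2,3), which matches num_classes * (num_classes - 1) / 2 = 6.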

Another important detail is how we select the data needed for training a single binary classifier. The Shark-ML library has a dedicated function for this task called binarySubProblem, which takes an object of the ClassificationDataset type and builds from it a dataset that is suitable for binary classification, even if the original dataset is a multi-class one. The second and third arguments of this function are the label of the class that becomes class 0 and the label of the class that becomes class 1, respectively.
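The idea behind such a split can be sketched in plain C++ as follows. This is not Shark-ML's implementation: the Sample struct and the MakeBinarySubProblem function are hypothetical names used only for illustration, and the sketch assumes that samples belonging to neither of the two classes are simply dropped.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one labeled sample; not a Shark-ML type.
struct Sample {
  std::vector<double> features;
  unsigned int label;
};

// Keeps only the samples of the two requested classes and relabels them:
// zero_class becomes 0 and one_class becomes 1. Samples of any other
// class are skipped (an assumption of this sketch).
std::vector<Sample> MakeBinarySubProblem(const std::vector<Sample>& data,
                                         unsigned int zero_class,
                                         unsigned int one_class) {
  std::vector<Sample> result;
  for (const Sample& s : data) {
    if (s.label == zero_class) {
      result.push_back(Sample{s.features, 0u});
    } else if (s.label == one_class) {
      result.push_back(Sample{s.features, 1u});
    }
  }
  return result;
}
```

Calling MakeBinarySubProblem(data, 1, 2) on a three-class dataset would return only the samples of classes 1 and 2, relabeled as 0 and 1.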

After we trained all binary classifiers and configured the OneVersusOneClassifier object, we used it for model evaluation on a test set. This object can be used as a functor to classify the set of test examples, but they need to have the UnlabeledData type. In our example, the test dataset has the ClassificationDataset type, so it is labeled. We used the inputs() method to retrieve unlabeled samples from it. The result of the classification has the Data<unsigned int> type.
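A one-versus-one ensemble is commonly turned into a multi-class decision by letting every binary classifier vote for one of its two classes and picking the class with the most votes. The following plain C++ sketch illustrates that voting scheme under this assumption; it does not reproduce the internals of OneVersusOneClassifier, and the PairwiseClassifier struct and the PredictByVoting function are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical binary decision for the pair (zero_class, one_class):
// predict returns 0 for zero_class and 1 for one_class.
struct PairwiseClassifier {
  std::size_t zero_class;
  std::size_t one_class;
  std::function<unsigned int(const std::vector<double>&)> predict;
};

// Lets each pairwise classifier cast one vote and returns the class with
// the most votes. Ties go to the lowest class index, which is a choice
// made for this sketch only.
std::size_t PredictByVoting(const std::vector<PairwiseClassifier>& classifiers,
                            std::size_t num_classes,
                            const std::vector<double>& x) {
  std::vector<std::size_t> votes(num_classes, 0);
  for (const PairwiseClassifier& c : classifiers) {
    const std::size_t winner =
        (c.predict(x) == 0u) ? c.zero_class : c.one_class;
    ++votes[winner];
  }
  const auto best = std::max_element(votes.begin(), votes.end());
  return static_cast<std::size_t>(std::distance(votes.begin(), best));
}
```

With three classes, each class can collect at most two votes, so a sample that wins both of the comparisons involving its class is assigned to that class.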

The following code snippet shows how to use a trained ovo object for evaluation:

// estimate accuracy
ZeroOneLoss<unsigned int> loss;
Data<unsigned int> output = ovo(test.inputs());
double accuracy = 1. - loss.eval(test.labels(), output);

// process results
for (std::size_t i = 0; i != test.numberOfElements(); i++) {
  auto cluster_idx = output.element(i);
  auto element = test.inputs().element(i);
  ...
}

For an evaluation metric, we used an object of the ZeroOneLoss type. It returns the fraction of misclassified samples (the error rate), which is the complement of the accuracy, so we subtracted it from 1 to get the accuracy.
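The relationship between the zero-one loss and the accuracy can be checked with a small plain C++ sketch. The ZeroOneError function is a hypothetical helper and not the Shark-ML ZeroOneLoss class; it assumes the loss is averaged over all samples, which is consistent with how the value is subtracted from 1 in the evaluation code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: the mean zero-one loss, that is, the fraction of
// predictions that differ from the true labels.
double ZeroOneError(const std::vector<unsigned int>& labels,
                    const std::vector<unsigned int>& predictions) {
  if (labels.empty() || labels.size() != predictions.size()) {
    throw std::invalid_argument(
        "labels and predictions must be non-empty and of equal size");
  }
  std::size_t errors = 0;
  for (std::size_t i = 0; i != labels.size(); ++i) {
    if (labels[i] != predictions[i]) {
      ++errors;
    }
  }
  return static_cast<double>(errors) / static_cast<double>(labels.size());
}
```

For the labels {0, 1, 2, 1} and the predictions {0, 1, 1, 1}, one of the four samples is misclassified, so the error is 0.25 and the accuracy is 1 - 0.25 = 0.75.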

The following screenshot shows the results of applying the Shark-ML implementation of the logistic regression algorithm to our datasets:

You can see that the Shark-ML implementation of multi-class logistic regression performs better than the implementation in the Shogun library. However, it made some errors on Dataset 0 and Dataset 1.
