LDA

In the Shark-ML library, the LDA algorithm is implemented in the LDA class. First, we have to train the algorithm with the train() method, which takes two parameters: the first is a reference to an object of the LinearClassifier class, and the second is a reference to the dataset. LDA trains a LinearClassifier object because, in the Shark-ML library, LDA is used mostly for classification. Because this is a supervised algorithm, we also have to provide labels for the data. We can do this by initializing an object of the LabeledData<RealVector, unsigned int> class. The listing later in this section shows how to combine two UnlabeledData<RealVector> datasets, one holding the inputs and one holding the labels, into a single labeled dataset. Note that labels should start from 0.
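The pairing of inputs with zero-based labels can be illustrated without Shark-ML. The following is a minimal sketch in plain C++, assuming the raw labels are stored as floating-point values that start from 1. The LabeledSample struct and the MakeLabeled() function are hypothetical names introduced for illustration; they are not part of the Shark-ML API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a labeled sample; not a Shark-ML type.
struct LabeledSample {
  std::vector<double> input;
  unsigned int label;
};

// Pairs each input with its label and shifts the labels so that they
// start from 0. Assumption: raw labels are whole numbers starting from 1.
std::vector<LabeledSample> MakeLabeled(
    const std::vector<std::vector<double>>& inputs,
    const std::vector<double>& raw_labels) {
  assert(inputs.size() == raw_labels.size());
  std::vector<LabeledSample> samples;
  samples.reserve(inputs.size());
  for (std::size_t i = 0; i < inputs.size(); ++i) {
    assert(raw_labels[i] >= 1.0);
    const unsigned int label = static_cast<unsigned int>(raw_labels[i]) - 1;
    samples.push_back({inputs[i], label});
  }
  return samples;
}
```

The subtraction happens after the cast to an unsigned type, so a raw label below 1 would wrap around to a very large value; the assertion guards against that case in debug builds.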

After the LinearClassifier object has been trained, we can use it as a function object to classify data; calling it returns a dataset of predicted labels. For dimensionality reduction, we need the decision function, which can be retrieved with the decisionFunction() method of the LinearClassifier class. Applying the decision function object to the input data produces the projection computed by LDA. Once we have the predicted labels and the projected data, we can use them to obtain the dimensionality-reduced data. In the following example, we keep, for each sample, only the component of the projection that is indexed by its predicted label, so that we can visualize the result. This means we reduce each sample to a single feature (component):

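 #include <shark/Algorithms/Trainers/LDA.h>
 #include <shark/Data/Dataset.h>
 #include <shark/Models/LinearClassifier.h>

 using namespace shark;

 // Note: target_dim is not used in this listing.
 void LDAReduction(const UnlabeledData<RealVector> &data,
                   const UnlabeledData<RealVector> &labels,
                   size_t target_dim) {
   LinearClassifier<> encoder;
   LDA lda;

   // allocate a labeled dataset with one placeholder element per sample
   LabeledData<RealVector, unsigned int> dataset(
       labels.numberOfElements(), InputLabelPair<RealVector, unsigned int>(
                                   RealVector(data.element(0).size()), 0));

   for (size_t i = 0; i < labels.numberOfElements(); ++i) {
     // the source labels start from 1, but Shark-ML expects them to
     // start from 0
     dataset.element(i).label =
         static_cast<unsigned int>(labels.element(i)[0]) - 1;
     dataset.element(i).input = data.element(i);
   }
   lda.train(encoder, dataset);

   // predict labels and project the data with the decision function
   auto new_labels = encoder(data);
   auto dc = encoder.decisionFunction();
   auto new_data = dc(data);

   for (size_t i = 0; i < new_data.numberOfElements(); ++i) {
     auto l = new_labels.element(i);
     // x and y hold the same projected component, selected by the
     // predicted label
     auto x = new_data.element(i)[l];
     auto y = new_data.element(i)[l];
   }
 }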
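The relationship between the decision function and the predicted label can be sketched in plain C++. This sketch assumes that the decision function is a linear map that yields one score per class and that the classifier picks the class with the largest score. The LinearDecision struct and the ArgMax() function are hypothetical names introduced for illustration; they are not Shark-ML types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a trained linear decision function:
// one row of weights and one offset per class. Not a Shark-ML type.
struct LinearDecision {
  std::vector<std::vector<double>> weights;  // weights[c][j]
  std::vector<double> offsets;               // offsets[c]

  // Returns one score per class: dot(weights[c], x) + offsets[c].
  std::vector<double> Scores(const std::vector<double>& x) const {
    assert(weights.size() == offsets.size());
    std::vector<double> scores(offsets);
    for (std::size_t c = 0; c < weights.size(); ++c) {
      assert(weights[c].size() == x.size());
      for (std::size_t j = 0; j < x.size(); ++j) {
        scores[c] += weights[c][j] * x[j];
      }
    }
    return scores;
  }
};

// Returns the index of the largest score, that is, the predicted class.
std::size_t ArgMax(const std::vector<double>& scores) {
  assert(!scores.empty());
  std::size_t best = 0;
  for (std::size_t c = 1; c < scores.size(); ++c) {
    if (scores[c] > scores[best]) best = c;
  }
  return best;
}
```

Under these assumptions, the value scores[ArgMax(scores)] plays the role of new_data.element(i)[l] in the listing: it is the projected component that belongs to the predicted class.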

The following graph shows the result of applying Shark-ML LDA dimensionality reduction to our data, projected onto a single feature:
