Training the classifier

Creating a logistic regression classifier involves pretty much the same steps as setting up k-NN:

In [14]: lr = cv2.ml.LogisticRegression_create()

We then have to specify the desired training method. Here, we can choose between cv2.ml.LogisticRegression_BATCH and cv2.ml.LogisticRegression_MINI_BATCH; the former updates the weights only after it has seen the whole training set, whereas the latter updates them after every mini-batch of data points. For now, all we need to know is that we want to update the model after every data point, which we can achieve by choosing mini-batch training with a mini-batch size of 1:

In [15]: lr.setTrainMethod(cv2.ml.LogisticRegression_MINI_BATCH)
... lr.setMiniBatchSize(1)
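
For comparison, if we instead wanted to update the weights only after seeing the whole training set, we would pick the batch method mentioned above (shown here as an aside; we will not use it in this session):

lr.setTrainMethod(cv2.ml.LogisticRegression_BATCH)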

We also want to specify the number of iterations the algorithm should run before it terminates:

In [16]: lr.setIterations(100)

We can then call the train method of the object (in the exact same way as we did earlier), which will return True upon success:

In [17]: lr.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
Out[17]: True

As we just saw, the goal of the training phase is to find the set of weights that best transform the feature values into an output label. A single data point is given by its four feature values (f0, f1, f2, and f3). Since we have four features, we should also get four weights, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3 and ŷ = σ(x). However, as discussed previously, the algorithm adds an extra weight w4 that acts as an offset or bias, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3 + w4. We can retrieve these weights as follows:

In [18]: lr.get_learnt_thetas()
Out[18]: array([[-0.04090132, -0.01910266, -0.16340332, 0.28743777, 0.11909772]], dtype=float32)

This means that the input to the logistic function is x = -0.0409 f0 - 0.0191 f1 - 0.163 f2 + 0.287 f3 + 0.119. Then, when we feed in a new data point (f0, f1, f2, f3) that belongs to class 1, the output ŷ = σ(x) should be close to 1.
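
To make this concrete, we can compute ŷ by hand for a single data point. The following is a minimal sketch, assuming the weight ordering described above (bias last); the feature vector x_new is made up purely for illustration:

In [19]: import numpy as np
...      theta = lr.get_learnt_thetas()[0]        # [w0, w1, w2, w3, w4]
...      x_new = np.array([5.0, 3.5, 1.5, 0.2])   # hypothetical feature values (f0, f1, f2, f3)
...      x = np.dot(theta[:4], x_new) + theta[4]  # weighted sum plus the bias w4
...      y_hat = 1.0 / (1.0 + np.exp(-x))         # the logistic function sigma(x)

A y_hat above 0.5 would assign the data point to class 1. But how well does that actually work?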
