Training the classifier

Creating a logistic regression classifier involves pretty much the same steps as setting up k-NN:

In [14]: lr = cv2.ml.LogisticRegression_create()

We then have to specify the desired training method. Here, we can choose between cv2.ml.LogisticRegression_BATCH and cv2.ml.LogisticRegression_MINI_BATCH; the former updates the weights only after it has seen the whole training set, whereas the latter updates them after every mini-batch of data points. For now, all we need to know is that we want to update the model after every data point, which we can achieve by choosing mini-batch training with a mini-batch size of 1:

In [15]: lr.setTrainMethod(cv2.ml.LogisticRegression_MINI_BATCH)
... lr.setMiniBatchSize(1)
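
For comparison, if we instead wanted to update the weights only after seeing the whole training set, we would pick the batch method mentioned above (shown here as an aside; we will not use it in this session):

lr.setTrainMethod(cv2.ml.LogisticRegression_BATCH)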

We also want to specify the number of iterations the algorithm should run before it terminates:

In [16]: lr.setIterations(100)

We can then call the train method of the object (in the exact same way as we did earlier), which will return True upon success:

In [17]: lr.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
Out[17]: True

As we just saw, the goal of the training phase is to find the set of weights that best transform the feature values into an output label. A single data point is given by its four feature values (f0, f1, f2, and f3). Since we have four features, we should also get four weights, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3 and ŷ = σ(x). However, as discussed previously, the algorithm adds an extra weight w4 that acts as an offset or bias, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3 + w4. We can retrieve these weights as follows:

In [18]: lr.get_learnt_thetas()
Out[18]: array([[-0.04090132, -0.01910266, -0.16340332, 0.28743777, 0.11909772]], dtype=float32)

This means that the input to the logistic function is x = -0.0409 f0 - 0.0191 f1 - 0.163 f2 + 0.287 f3 + 0.119. Then, when we feed in a new data point (f0, f1, f2, f3) that belongs to class 1, the output ŷ = σ(x) should be close to 1.
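
To make this concrete, we can compute ŷ by hand for a single data point. The following is a minimal sketch, assuming the weight ordering described above (bias last); the feature vector x_new is made up purely for illustration:

In [19]: import numpy as np
...      theta = lr.get_learnt_thetas()[0]        # [w0, w1, w2, w3, w4]
...      x_new = np.array([5.0, 3.5, 1.5, 0.2])   # hypothetical feature values (f0, f1, f2, f3)
...      x = np.dot(theta[:4], x_new) + theta[4]  # weighted sum plus the bias w4
...      y_hat = 1.0 / (1.0 + np.exp(-x))         # the logistic function sigma(x)

A y_hat above 0.5 would assign the data point to class 1. But how well does that actually work?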
