Customizing the MLP classifier

Before we move on to training the classifier, we can customize the MLP classifier via a number of optional settings:

  • mlp.setActivationFunction: This defines the activation function to be used for every neuron in the network.
  • mlp.setTrainMethod: This defines a suitable training method.
  • mlp.setTermCriteria: This sets the termination criteria of the training phase.

Whereas our home-brewed perceptron classifier used a linear activation function, OpenCV provides two additional options:

  • cv2.ml.ANN_MLP_IDENTITY: This is the linear activation function, f(x) = x.
  • cv2.ml.ANN_MLP_SIGMOID_SYM: This is the symmetrical sigmoid function (also known as hyperbolic tangent), f(x) = β (1 - exp(-α x)) / (1 + exp(-α x)). Whereas α controls the slope of the function, β defines the upper and lower bounds of the output.
  • cv2.ml.ANN_MLP_GAUSSIAN: This is the Gaussian function (also known as the bell curve), f(x) = β exp(-α x²). Whereas α controls the slope of the function, β defines the upper bound of the output.

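To get a feel for these formulas, here is a short NumPy sketch (our own reimplementation, not an OpenCV call; the function names are ours) that evaluates each of the three activation functions:

```python
import numpy as np

def identity(x):
    # cv2.ml.ANN_MLP_IDENTITY: f(x) = x
    return x

def sigmoid_sym(x, alpha=2.5, beta=1.0):
    # cv2.ml.ANN_MLP_SIGMOID_SYM:
    # f(x) = beta * (1 - exp(-alpha*x)) / (1 + exp(-alpha*x)),
    # which is algebraically the same as beta * tanh(alpha * x / 2)
    return beta * (1.0 - np.exp(-alpha * x)) / (1.0 + np.exp(-alpha * x))

def gaussian(x, alpha=2.5, beta=1.0):
    # cv2.ml.ANN_MLP_GAUSSIAN: f(x) = beta * exp(-alpha * x**2)
    return beta * np.exp(-alpha * x ** 2)
```

Evaluating sigmoid_sym on a dense grid confirms that β bounds its output, and that it really is a scaled hyperbolic tangent, while gaussian peaks at β for x = 0.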
Careful, OpenCV is being really confusing here! What they call a symmetrical sigmoid function is in reality the hyperbolic tangent. In other software, a sigmoid function usually has responses in the range [0, 1], whereas the hyperbolic tangent has responses in the range [-1, 1]. Even worse, if you use the symmetrical sigmoid function with its default parameters, the output will range in [-1.7159, 1.7159]!
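To see the caveat in numbers: according to the OpenCV documentation, leaving both parameters at their zero defaults makes the library use f(x) = 1.7159 tanh(2x/3). The following NumPy reimplementation (ours, not an OpenCV call) shows the saturation limits:

```python
import numpy as np

def default_sigmoid_sym(x):
    # Documented OpenCV default for ANN_MLP_SIGMOID_SYM when both
    # parameters are left at zero: f(x) = 1.7159 * tanh(2x / 3)
    return 1.7159 * np.tanh(2.0 * x / 3.0)

x = np.linspace(-20.0, 20.0, 1001)
y = default_sigmoid_sym(x)
# y saturates near +/- 1.7159, not +/- 1
```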

In this example, we will use a symmetrical sigmoid function that squashes the input values into the range [-1, 1]. We do this by choosing α = 2.5 and β = 1.0. These are the two parameters of the activation function; if they are left at their default value of zero, OpenCV substitutes its own internal defaults:

In [6]: mlp.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM, 2.5, 1.0)

If you are curious about what this activation function looks like, we can take a short excursion with matplotlib:

In [7]: import numpy as np
... import matplotlib.pyplot as plt
... %matplotlib inline
... plt.style.use('ggplot')

In order to see what the activation function looks like, we can create a NumPy array that densely samples x values in the range [-1, 1], and then calculate the corresponding y values using the preceding mathematical expression:

In [8]: alpha = 2.5
... beta = 1.0
... x_sig = np.linspace(-1.0, 1.0, 100)
... y_sig = beta * (1.0 - np.exp(-alpha * x_sig))
... y_sig /= (1 + np.exp(-alpha * x_sig))
... plt.plot(x_sig, y_sig, linewidth=3);
... plt.xlabel('x')
... plt.ylabel('y')

As you can see in the following graph, this will create a nice squashing function whose output values will lie in the range [-1, 1]:

The preceding graph shows an example of the symmetrical sigmoid function for α = 2.5 and β = 1.0. As mentioned earlier, a training method can be set via mlp.setTrainMethod. The following methods are available:

  • cv2.ml.ANN_MLP_BACKPROP: This is the backpropagation algorithm we talked about previously. You can set additional scaling factors via mlp.setBackpropMomentumScale and mlp.setBackpropWeightScale.
  • cv2.ml.ANN_MLP_RPROP: This is the Rprop algorithm, which is short for resilient backpropagation. We won't have time to discuss this algorithm, but you can set additional parameters of this algorithm via mlp.setRpropDW0, mlp.setRpropDWMax, mlp.setRpropDWMin, mlp.setRpropDWMinus, and mlp.setRpropDWPlus.

In this example, we will choose backpropagation:

In [9]: mlp.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP)

Lastly, we can specify the criteria that must be met for training to end via mlp.setTermCriteria. This works the same for every classifier in OpenCV and is closely tied to the underlying C++ functionality. We first tell OpenCV which criteria we are going to specify (for example, the maximum number of iterations). Then we specify the value for this criterion. All values must be delivered in a tuple.
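Combining two criteria means that training stops as soon as either condition is met. A minimal pure-Python sketch of that stopping rule (our own illustration, not OpenCV code) could look like this:

```python
def should_stop(iteration, error_change, max_iter=300, eps=0.01):
    """Stop once either termination criterion fires: the iteration
    budget is used up (TERM_CRITERIA_MAX_ITER), or the error changed
    by less than eps since the last epoch (TERM_CRITERIA_EPS)."""
    return iteration >= max_iter or abs(error_change) < eps
```

For example, should_stop(300, 0.5) and should_stop(10, 0.001) both return True, while should_stop(10, 0.5) returns False.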

Hence, in order to run our MLP classifier until we either reach 300 iterations or the error no longer changes by more than some small value between epochs, we would write the following:

In [10]: term_mode = cv2.TERM_CRITERIA_MAX_ITER + cv2.TERM_CRITERIA_EPS
... term_max_iter = 300
... term_eps = 0.01
... mlp.setTermCriteria((term_mode, term_max_iter, term_eps))

Then we are ready to train the classifier!
