The backpropagation (BP) algorithm learns the classification model by training a multilayer feed-forward neural network. The generic architecture of the neural network for BP is shown in the following diagrams, with one input layer, some hidden layers, and one output layer. Each layer contains some units or perceptrons. Each unit might be linked to others by weighted connections. The values of the weights are initialized before the training. The number of units in each layer, the number of hidden layers, and the connections are defined empirically at the very start.
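To make the topology and initialization concrete, the following small R snippet (purely illustrative; the names and the 2-4-1 layout are invented for this example, and are not taken from the book's code bundle) builds one weight matrix and one bias vector per layer and fills them with small random starting values:

```r
# Illustrative initialization of a 2-4-1 topology (names made up for this sketch):
# one weight matrix and one bias vector per layer, small random starting values.
set.seed(42)
init.layer <- function(n.in, n.out) {
  list(w = matrix(runif(n.in * n.out, -0.5, 0.5), n.in, n.out),
       b = runif(n.out, -0.5, 0.5))
}
net <- list(hidden = init.layer(2, 4), output = init.layer(4, 1))
```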
The training tuples are assigned to the input layer. Each unit in the input layer computes its result from the input attributes of the training tuple with a certain function, and its output then serves as the input for the next (hidden) layer; the calculation proceeds in this way layer by layer. As a consequence, the output of the network is composed of the outputs of the units in the output layer. The training is performed iteratively by updating the weights of the connections with the feedback of errors, that is, by back propagation of the errors.
The prototype for a unit in the hidden/output layer is shown here; a unit in the input layer takes only one input and has other minor differences compared to this. $w_{pq}$ denotes the weight related to the link or connection from unit $p$ to unit $q$. Each unit is bound with a bias, denoted $\theta_q$, and with a threshold or activation function, denoted $f$.
For a certain unit or perceptron $q$ in the hidden or output layer, the net input $I_q$ is a linear combination of the outputs $O_p$ of the units in the previous layer that connect to it, plus the bias. Let $k$ denote the number of input connections of the unit $q$:

$$I_q = \sum_{p=1}^{k} w_{pq} O_p + \theta_q$$
The output of this unit $q$ is $O_q = f(I_q)$; with the commonly used logistic activation, $O_q = \frac{1}{1 + e^{-I_q}}$.
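As a small illustration of the last two formulas (the function and argument names here are invented for this sketch, not taken from the book's code bundle), the net input and logistic output of a single unit q can be computed in R as follows:

```r
# Net input and logistic output of a single unit q (illustrative sketch).
# o.prev : outputs O_p of the k units feeding into q
# w      : weights w_pq of the k incoming connections
# theta  : bias theta_q of unit q
unit.output <- function(o.prev, w, theta) {
  i.q <- sum(w * o.prev) + theta   # I_q = sum_p w_pq * O_p + theta_q
  1 / (1 + exp(-i.q))              # O_q = f(I_q) with the logistic function
}

unit.output(o.prev = c(0.2, 0.7), w = c(0.5, -0.3), theta = 0.1)
```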
If the unit $q$ is in the output layer and $T_q$ denotes the expected or known output value in the training tuple, then its error can be calculated as follows:

$$Err_q = O_q (1 - O_q)(T_q - O_q)$$
If the unit $q$ is in a hidden layer, then let $w_{qm}$ denote the weight of the connection from unit $q$ to a unit $m$ in the next layer, and let $Err_m$ denote the error already computed for that unit. Let $M$ denote the number of output connections of the unit $q$; we can then calculate its error as:

$$Err_q = O_q (1 - O_q) \sum_{m=1}^{M} Err_m w_{qm}$$
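Both error formulas translate directly into R. The following is an illustrative sketch with invented names, not code from the book's bundle:

```r
# Error of an output-layer unit: Err_q = O_q * (1 - O_q) * (T_q - O_q)
err.output <- function(o.q, t.q) {
  o.q * (1 - o.q) * (t.q - o.q)
}

# Error of a hidden-layer unit: Err_q = O_q * (1 - O_q) * sum_m(Err_m * w_qm)
# err.next : errors Err_m of the M units that q feeds in the next layer
# w.next   : weights w_qm of those M outgoing connections
err.hidden <- function(o.q, err.next, w.next) {
  o.q * (1 - o.q) * sum(err.next * w.next)
}
```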
After this preprocessing or preparation, the weights and biases can be updated accordingly with the backpropagation strategy. Let $l$ denote the learning rate; it is an empirical parameter chosen from the interval $(0, 1)$:

$$\Delta w_{pq} = l \, Err_q O_p, \quad w_{pq} = w_{pq} + \Delta w_{pq}$$

$$\Delta \theta_q = l \, Err_q, \quad \theta_q = \theta_q + \Delta \theta_q$$
The weights and biases are updated per tuple of the training dataset.
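An illustrative R sketch of one such update for a single unit q, with an invented helper name, follows directly from the formulas above:

```r
# One backpropagation update of the weights and bias of unit q
# (illustrative sketch; applied once per training tuple).
# w      : incoming weights w_pq
# theta  : bias theta_q
# err.q  : error term Err_q of unit q
# o.prev : outputs O_p of the units feeding into q
# l      : learning rate in (0, 1)
update.unit <- function(w, theta, err.q, o.prev, l = 0.1) {
  list(w     = w + l * err.q * o.prev,   # w_pq <- w_pq + l * Err_q * O_p
       theta = theta + l * err.q)        # theta_q <- theta_q + l * Err_q
}
```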
The input parameters for the BP algorithm, that is, the topology of the neural network, the number of hidden layers, and the connections, are defined before the training starts:
The output of the algorithm is the trained topology structure of the BP network, which consists of:
Here is the pseudocode for the training of the backpropagation network:
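For readers who prefer runnable code, the following is a minimal, self-contained R sketch of such a training loop, with one hidden layer, logistic units, and per-tuple updates; it is an illustration only, not the book's ch_04_bp.R:

```r
# Minimal BP training sketch: one hidden layer, logistic units, per-tuple
# (online) updates. Illustrative toy code, not the book's ch_04_bp.R.
sigmoid <- function(z) 1 / (1 + exp(-z))

train.bp <- function(x, y, n.hidden = 4, l = 0.5, epochs = 5000) {
  n.in  <- ncol(x)
  n.out <- ncol(y)
  # Initialize all weights and biases with small random values
  w.h <- matrix(runif(n.in * n.hidden, -0.5, 0.5), n.in, n.hidden)
  b.h <- runif(n.hidden, -0.5, 0.5)
  w.o <- matrix(runif(n.hidden * n.out, -0.5, 0.5), n.hidden, n.out)
  b.o <- runif(n.out, -0.5, 0.5)

  for (epoch in seq_len(epochs)) {
    for (i in seq_len(nrow(x))) {                # one update per training tuple
      # Forward propagation, layer by layer
      o.h <- sigmoid(drop(x[i, ] %*% w.h) + b.h)
      o.o <- sigmoid(drop(o.h %*% w.o) + b.o)
      # Back-propagate the errors
      err.o <- o.o * (1 - o.o) * (y[i, ] - o.o)
      err.h <- o.h * (1 - o.h) * drop(w.o %*% err.o)
      # Update weights and biases with learning rate l
      w.o <- w.o + l * outer(o.h, err.o)
      b.o <- b.o + l * err.o
      w.h <- w.h + l * outer(x[i, ], err.h)
      b.h <- b.h + l * err.h
    }
  }
  list(w.h = w.h, b.h = b.h, w.o = w.o, b.o = b.o)
}

# Toy usage: learn the XOR function
set.seed(1)
x <- matrix(c(0, 0, 0, 1, 1, 0, 1, 1), ncol = 2, byrow = TRUE)
y <- matrix(c(0, 1, 1, 0), ncol = 1)
model <- train.bp(x, y)
```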
Please look up the R code file ch_04_bp.R from the bundle of R code for the BP algorithm. The code can be tested with the following command:
> source("ch_04_bp.R")
Massive data makes the training process of BP dramatically slow, and parallelized versions of BP have been shown to speed it up considerably. There are many parallelized BPNN algorithms implemented on the MapReduce architecture. Here is one implementation, the MapReduce-based Backpropagation Neural Network (MBNN) algorithm.
Given the training dataset, each mapper is fed with a single training item. During the mapper tasks stage, new values for the weights are calculated from that item. Within the reducer tasks stage, the proposed new values for each weight are collected and averaged, and the average is output as that weight's new value. Once the new values are available, all the weights are updated batch-wise. These steps are executed iteratively until the termination condition is met.
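As an illustration of this map, average, and update cycle, the following R sketch emulates an MBNN-style run on a single logistic unit. The names and the plain-R driver loop are invented for this example; a real MBNN implementation would run the mapper and reducer tasks on a MapReduce cluster rather than in a local loop:

```r
# Sketch of MBNN-style iterations on a single logistic unit (illustrative only).
sigmoid <- function(z) 1 / (1 + exp(-z))

# Mapper: one training item in, one proposed weight/bias set out
mbnn.map <- function(x, t, w, b, l = 0.5) {
  o   <- sigmoid(sum(w * x) + b)      # forward propagation
  err <- o * (1 - o) * (t - o)        # error of the output unit
  list(w = w + l * err * x,           # locally updated weights
       b = b + l * err)               # locally updated bias
}

# Reducer: collect the proposed values for each weight and average them
mbnn.reduce <- function(proposals) {
  list(w = Reduce(`+`, lapply(proposals, `[[`, "w")) / length(proposals),
       b = mean(sapply(proposals, `[[`, "b")))
}

# Driver: iterate the map and reduce phases until the termination condition
set.seed(1)
x <- matrix(c(0, 0, 0, 1, 1, 0, 1, 1), ncol = 2, byrow = TRUE)
t <- c(0, 0, 0, 1)                    # toy AND-like target
w <- runif(2, -0.5, 0.5); b <- 0
for (iter in 1:200) {
  proposals <- lapply(seq_len(nrow(x)),
                      function(i) mbnn.map(x[i, ], t[i], w, b))
  new <- mbnn.reduce(proposals)       # average the new weight values
  w <- new$w                          # batch-wise update of all weights
  b <- new$b
}
```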
The backpropagation main algorithm is listed as follows:
Four steps compose the backpropagation mapper algorithm. They are listed here:
Five steps compose the backpropagation reducer algorithm. They are listed here: