Classification using the backpropagation algorithm

The backpropagation (BP) algorithm learns a classification model by training a multilayer feed-forward neural network. The generic architecture of the neural network for BP is shown in the following diagram, with one input layer, some hidden layers, and one output layer. Each layer contains a number of units (perceptrons). Each unit may be linked to others by weighted connections. The values of the weights are initialized before the training. The number of units in each layer, the number of hidden layers, and the connections between units are defined empirically before the training starts.

[Figure: the generic architecture of a multilayer feed-forward neural network for BP, with one input layer, hidden layers, and one output layer]
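As a minimal sketch of such a topology in R, the weights and biases of a one-hidden-layer network can be held in matrices and vectors; the layer sizes and the uniform initialization range below are illustrative assumptions, not values prescribed by the text:

# A one-hidden-layer topology; layer sizes and the runif() range are
# illustrative assumptions.
set.seed(1)
n_in <- 4; n_hidden <- 3; n_out <- 2

# w_ih[i, q]: weight of the connection from input unit i to hidden unit q
w_ih <- matrix(runif(n_in * n_hidden, -0.5, 0.5), n_in, n_hidden)
# w_ho[q, j]: weight of the connection from hidden unit q to output unit j
w_ho <- matrix(runif(n_hidden * n_out, -0.5, 0.5), n_hidden, n_out)

# One bias per hidden and output unit
theta_h <- runif(n_hidden, -0.5, 0.5)
theta_o <- runif(n_out, -0.5, 0.5)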

The training tuples are assigned to the input layer. Each unit in the input layer computes a result from the input attributes of the training tuple with a certain function, and its output then serves as input to the hidden layer; the computation proceeds layer by layer in this way. As a consequence, the output of the network is the combined output of all the units in the output layer. The training is performed iteratively by updating the weights of the connections with the feedback of the errors, that is, the backpropagation of errors.

The prototype of a unit in the hidden/output layer is shown here; a unit in the input layer has only one input and some other minor differences. $w_{iq}$ denotes the weight of the link or connection from unit $i$ to unit $q$. Each unit is bound with a bias, denoted by $\theta_q$. The threshold or activation function bound with each unit is denoted by $f$.

[Figure: the prototype of a unit q, with its weighted inputs, bias, and activation function]

For a certain unit or perceptron $q$ in the hidden or output layer, the net input $I_q$ is a linear combination of its inputs, that is, of the outputs $O_i$ of the connected units in the previous layer. Let $k$ denote the number of input connections of the unit $q$:

$$I_q = \sum_{i=1}^{k} w_{iq} O_i + \theta_q$$

The output $O_q$ of this unit $q$ is obtained by applying the activation function, here the sigmoid, to the net input:

$$O_q = \frac{1}{1 + e^{-I_q}}$$
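Continuing the sketch above, the net input and output of every unit in a layer can be computed at once with a matrix product; the input tuple x below is an assumed example:

sigmoid <- function(x) 1 / (1 + exp(-x))

# x: the input attributes of one training tuple (length n_in)
x <- c(0.1, 0.9, 0.3, 0.5)

# I_q = sum_i w_iq * O_i + theta_q, for all hidden units at once
I_h <- as.vector(t(w_ih) %*% x) + theta_h
O_h <- sigmoid(I_h)   # outputs of the hidden layer

# The hidden outputs feed the output layer in the same way
I_o <- as.vector(t(w_ho) %*% O_h) + theta_o
O_o <- sigmoid(I_o)   # outputs of the network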

If the unit $q$ is in the output layer and $T_q$ denotes the expected or known output value in the training tuple, then its error $Err_q$ can be calculated as follows:

$$Err_q = O_q (1 - O_q)(T_q - O_q)$$

If the unit $q$ is in the hidden layer, then let $w_{qj}$ denote the weight of the connection from unit $q$ to a unit $j$ in the next layer, and $Err_j$ the error of that unit. We can then calculate the error of $q$ as follows. Let $M$ denote the number of output connections of the unit $q$:

$$Err_q = O_q (1 - O_q) \sum_{j=1}^{M} Err_j \, w_{qj}$$
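Continuing the sketch, the two error formulas translate directly to R; the target vector t_q here is an assumed example:

# Assumed known target values for the two output units of this tuple
t_q <- c(1, 0)

# Output layer: Err_q = O_q * (1 - O_q) * (T_q - O_q)
err_o <- O_o * (1 - O_o) * (t_q - O_o)

# Hidden layer: Err_q = O_q * (1 - O_q) * sum_j Err_j * w_qj
err_h <- O_h * (1 - O_h) * as.vector(w_ho %*% err_o)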

After this preparation, the weights and biases can be updated accordingly with the backpropagation strategy. Here $l$ denotes the learning rate, an empirical parameter in the interval $(0, 1)$:

$$\Delta w_{iq} = l \cdot Err_q \cdot O_i$$
$$w_{iq} = w_{iq} + \Delta w_{iq}$$
$$\Delta \theta_q = l \cdot Err_q$$
$$\theta_q = \theta_q + \Delta \theta_q$$

The weights and biases are updated after each tuple of the training dataset is processed; this is known as case updating.
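Continuing the sketch, one such case update of the weights and biases; the learning rate value is an illustrative assumption:

l <- 0.1   # learning rate, an assumed value in (0, 1)

# Delta w_iq = l * Err_q * O_i, applied to both weight matrices at once
w_ho <- w_ho + l * outer(O_h, err_o)   # hidden -> output connections
w_ih <- w_ih + l * outer(x, err_h)     # input  -> hidden connections

# Delta theta_q = l * Err_q
theta_o <- theta_o + l * err_o
theta_h <- theta_h + l * err_h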

The BP algorithm

The topology of the neural network, the number of hidden layers, and the connections are defined before the start of the training. The input parameters for the BP algorithm are as follows:

  • D, which is the set of training tuples
  • W, which is the initial values for all the weights
  • $\theta$, which is the set of biases, one for each unit
  • l, which is the learning rate

The output of the algorithm is the trained network, which consists of the following:

  • BPNN, which is the trained BP neural network
  • W, which is a set of weights of the connections in the neural network

Here is the pseudocode for the training of the backpropagation network:

[Pseudocode: the training procedure of the backpropagation network]
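A minimal, self-contained R sketch of the complete training procedure under the equations above (one hidden layer, sigmoid activation, case updating; the layer sizes, data, and epoch count are illustrative assumptions, not the book's ch_04_bp.R implementation):

sigmoid <- function(x) 1 / (1 + exp(-x))

train_bp <- function(X, Y, n_hidden = 3, l = 0.1, epochs = 1000) {
  n_in <- ncol(X); n_out <- ncol(Y)
  # Initialize weights and biases with small random values
  w_ih <- matrix(runif(n_in * n_hidden, -0.5, 0.5), n_in, n_hidden)
  w_ho <- matrix(runif(n_hidden * n_out, -0.5, 0.5), n_hidden, n_out)
  theta_h <- runif(n_hidden, -0.5, 0.5)
  theta_o <- runif(n_out, -0.5, 0.5)

  for (epoch in seq_len(epochs)) {
    for (i in seq_len(nrow(X))) {
      x <- X[i, ]; t_q <- Y[i, ]
      # Forward pass, layer by layer
      O_h <- sigmoid(as.vector(t(w_ih) %*% x) + theta_h)
      O_o <- sigmoid(as.vector(t(w_ho) %*% O_h) + theta_o)
      # Backpropagate the errors
      err_o <- O_o * (1 - O_o) * (t_q - O_o)
      err_h <- O_h * (1 - O_h) * as.vector(w_ho %*% err_o)
      # Case update of the weights and biases
      w_ho <- w_ho + l * outer(O_h, err_o)
      w_ih <- w_ih + l * outer(x, err_h)
      theta_o <- theta_o + l * err_o
      theta_h <- theta_h + l * err_h
    }
  }
  list(w_ih = w_ih, w_ho = w_ho, theta_h = theta_h, theta_o = theta_o)
}

# Example: learning XOR, a classic test case for a multilayer network
X <- matrix(c(0, 0, 0, 1, 1, 0, 1, 1), ncol = 2, byrow = TRUE)
Y <- matrix(c(0, 1, 1, 0), ncol = 1)
model <- train_bp(X, Y, n_hidden = 3, l = 0.5, epochs = 5000)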

The R implementation

Please look up the R code file ch_04_bp.R from the bundle of R code files for the BP algorithm. The code can be tested with the following command:

> source("ch_04_bp.R")

Parallel version with MapReduce

Massive data makes the training process of BP dramatically slow, and parallelization has been shown to speed it up considerably. There are many versions of parallelized BPNN algorithms implemented on the MapReduce architecture. Here is one implementation, the MapReduce-based Backpropagation Neural Network (MBNN) algorithm.

Given the training dataset, each mapper is fed a single training item. During the mapper stage, new values for the weights are calculated from that item. Within the reducer stage, all the proposed new values for each weight are collected and averaged, and the average is output as that weight's new value. Once the new values are available, all the weights are updated in a batch. These steps are executed iteratively until the termination condition is met.
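A rough, sequentially simulated sketch of this map/average/update cycle in R; the helper compute_new_weights(), which would perform one tuple's forward and backward pass, is hypothetical, and the weights are assumed to be flattened into a single numeric vector so they can be averaged:

# One MBNN iteration, simulated sequentially. Each "mapper" proposes new
# weight values from a single training item; the "reducer" averages the
# proposals. compute_new_weights() is a hypothetical stand-in for one
# forward/backward pass; weights is a single numeric vector here.
map_step <- function(item, weights) {
  compute_new_weights(weights, item$x, item$target)
}

reduce_step <- function(proposals) {
  # Element-wise average of all proposed weight vectors
  Reduce(`+`, proposals) / length(proposals)
}

mbnn_iteration <- function(items, weights) {
  proposals <- lapply(items, map_step, weights = weights)  # mapper stage
  reduce_step(proposals)                                   # reducer stage
}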

The backpropagation main algorithm is listed as follows:

[Pseudocode: the MBNN main algorithm]

The backpropagation mapper algorithm comprises four steps, listed here:

[Pseudocode: the four steps of the MBNN mapper algorithm]

The backpropagation reducer algorithm comprises five steps, listed here:

[Pseudocode: the five steps of the MBNN reducer algorithm]