Non-linear neural units

A linear neuron is simple but computationally limited. Even if we use a deep stack of multiple layers of linear units, we still have a linear network capable of learning only linear transformations. To design networks that can learn much richer, non-linear sets of transformations, we need a way to introduce non-linearity into the design of neural nets. By passing the weighted sum of the inputs through a non-linear function, we can induce non-linearity in the neural unit.
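
To see this limitation concretely, the following sketch (a minimal NumPy example; the layer sizes and random weights are arbitrary illustrations, not values from the text) shows that two stacked linear layers compute exactly the same transformation as a single linear layer whose weight matrix is the product of the two:

import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=(4,))        # arbitrary input vector
W1 = rng.normal(size=(5, 4))     # weights of the first linear layer
W2 = rng.normal(size=(3, 5))     # weights of the second linear layer

# Output of the two-layer linear "network"
deep_output = W2 @ (W1 @ x)

# A single linear layer with the collapsed weight matrix W2 @ W1
collapsed_output = (W2 @ W1) @ x

# Both paths give the same result, so the extra layer adds no expressive power
print(np.allclose(deep_output, collapsed_output))   # True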

Although the non-linear function itself is fixed, it can adapt to the data through the weights of the linear part, whose weighted sum is the argument passed to this function. This non-linear function is called the activation function of a non-linear neuron. One simple example is the binary threshold activation, and the corresponding non-linear unit is called a McCulloch-Pitts unit. This is a step function, so it is not differentiable at zero, and at all non-zero points its derivative is zero. Other popular activation functions are the sigmoid, tanh, and ReLU. The definitions and plots of these functions are provided in the following figure:

[Figure: Activation function plots]

Here are the activation function definitions:

Function name        Definition
Binary threshold     f(z) = 1 if z >= 0; 0 otherwise
Sigmoid              f(z) = 1 / (1 + e^(-z))
Tanh                 f(z) = (e^z - e^(-z)) / (e^z + e^(-z))
ReLU                 f(z) = max(0, z), or equivalently f(z) = z if z > 0; 0 otherwise
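
The following sketch (a minimal NumPy version of these definitions; the function and variable names are illustrative choices, not from the text) implements the four activation functions and a non-linear unit that applies one of them to the weighted sum of its inputs:

import numpy as np

def binary_threshold(z):
    # McCulloch-Pitts style step function: 1 if z >= 0, else 0
    return (z >= 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0.0, z)

def nonlinear_unit(x, w, b, activation):
    # Weighted sum of the inputs followed by a fixed non-linear activation
    z = np.dot(w, x) + b
    return activation(z)

x = np.array([0.5, -1.2, 3.0])   # example inputs
w = np.array([0.1, 0.4, -0.2])   # example weights
print(nonlinear_unit(x, w, b=0.1, activation=relu))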

If we have a k-class (k > 2) classification problem, we basically want to learn the conditional probability distribution P(y|x). The output layer should therefore have k neurons, and their values should sum to 1. To give the network the constraint that the outputs of all k units must sum to 1, the softmax activation function is used. This is a generalization of the sigmoid activation. Like the sigmoid, the softmax function squashes the output of each unit to be between 0 and 1; in addition, it divides each output by the sum of all the outputs, so that they total 1.

Mathematically, the softmax function is defined as follows, where z is the vector of inputs to the output layer (if you have 10 output units, then there are 10 elements in z) and j indexes the output units, so j = 1, 2, ..., K:

softmax(z)_j = e^(z_j) / (e^(z_1) + e^(z_2) + ... + e^(z_K)),  for j = 1, 2, ..., K
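
A minimal NumPy sketch of this computation (the function name softmax and the shift by max(z) for numerical stability are illustrative choices, not part of the formula above):

import numpy as np

def softmax(z):
    # Subtracting max(z) avoids overflow in exp; it does not change the result
    exp_z = np.exp(z - np.max(z))
    return exp_z / np.sum(exp_z)

z = np.array([2.0, 1.0, 0.1])    # example inputs to a 3-unit output layer
p = softmax(z)
print(p)          # each value lies between 0 and 1
print(p.sum())    # the outputs sum to 1.0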
