Softmax activation

From the preceding section, note that when predicting a categorical variable, the number of units in the output layer equals the number of distinct values (classes) in the dependent variable.

Also, note that the predicted value for each unit in the output layer must lie between 0 and 1, and the values across all output nodes must sum to 1.

For example, let's say the two output nodes produce the raw values -1 and 5. Because each output should be a probability (a value between 0 and 1), we pass the raw values through softmax activation, as follows:

Pass the values through an exponential function:

exp(-1) ≈ 0.368

exp(5) ≈ 148.413

Normalize the exponentiated values by dividing each by their sum, so that each result is a probability between 0 and 1 and the probabilities of the two output nodes sum to 1:

0.368/(0.368+148.413) ≈ 0.0025

148.413/(0.368+148.413) ≈ 0.9975

Thus, softmax activation converts the raw output values into probabilities.
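
The two steps above (exponentiate, then normalize) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used by any particular library; the function name softmax and the variable names are chosen here for the example. One detail goes beyond the text: the sketch subtracts the largest input before exponentiating, a common trick that leaves the result unchanged but avoids overflow for large values.

```python
import math

def softmax(values):
    """Return softmax probabilities for a list of raw output values."""
    # Assumption beyond the text: shift by the maximum for numerical
    # stability. The shift cancels out in the normalization step.
    shift = max(values)
    # Step 1: pass each (shifted) value through the exponential function.
    exps = [math.exp(v - shift) for v in values]
    # Step 2: divide by the sum so that the results add up to 1.
    total = sum(exps)
    return [e / total for e in exps]

# Raw outputs of the two nodes from the example above.
probs = softmax([-1, 5])
print([round(p, 4) for p in probs])  # [0.0025, 0.9975]
```

The printed values match the hand calculation, and they sum to 1 regardless of how many output nodes are passed in.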
