Nonlinear activation – ReLU

It is common, and generally considered best practice, to add a nonlinear layer after a convolution or max-pooling layer. Most network architectures use ReLU or one of its variants. Whichever nonlinear function we choose, it is applied to each element of the feature maps independently. ReLU, for instance, replaces every negative value with zero and leaves all other values unchanged. To make this more intuitive, let's look at an example where we apply ReLU to the same feature map to which we applied max-pooling and average pooling:
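
As a minimal sketch of this element-wise behavior, the following plain-Python snippet applies ReLU to a small feature map. The 4x4 values and the `relu` helper are assumptions invented for illustration, not the feature map from the pooling example; in a real network, a framework function such as `torch.nn.functional.relu` would do the same job on tensors.

```python
def relu(feature_map):
    """Apply ReLU element-wise: negatives become 0, other values are unchanged."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical 4x4 feature map (values made up for illustration).
feature_map = [
    [3, -1, 4, -2],
    [-5, 6, -7, 8],
    [9, -3, 0, 1],
    [-4, 2, -6, 5],
]

activated = relu(feature_map)

# Print the activated feature map row by row.
for row in activated:
    print(row)
```

Note that, unlike pooling, ReLU does not change the size of the feature map: the output has the same shape as the input, and only the negative entries are altered.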
