Convolutional neural networks

Deep learning neural networks share the same foundations as classical neural networks; in the case of image analysis, the main difference is the input layer. With a classical machine learning algorithm, the researcher has to identify the features that best describe the target to classify. For example, if we want to classify handwritten digits, we could extract the borders and lines of each digit, or measure the area of the object in each image, and feed these features to the neural network, or to any other machine learning algorithm. In deep learning, however, you don't have to work out what the features are; instead, you use the whole image directly as the input of the neural network. A deep neural network (DNN) learns which features are the most important on its own and uses them to recognize the input.
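To make this contrast concrete, the following is a minimal sketch in Python with NumPy; the handcrafted features shown here are illustrative assumptions, not a recommended feature set:

    import numpy as np

    # A toy 100 x 100 grayscale image with values in [0, 1].
    image = np.random.rand(100, 100).astype(np.float32)

    # Classical pipeline: the researcher designs the features by hand.
    gy, gx = np.gradient(image)                # simple edge responses
    features = np.array([
        image.mean(),                          # average intensity
        image.std(),                           # contrast
        np.hypot(gx, gy).sum(),                # total edge strength
        (image > 0.5).mean(),                  # fraction of bright pixels (area)
    ])

    # Deep learning pipeline: the raw pixels themselves are the input.
    raw_input = image.reshape(1, 100, 100, 1)  # batch, height, width, channels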

To learn what these features are, we use one of the most important layers in deep learning and neural networks: the convolutional layer. A convolutional layer works like a convolution operator, where a kernel filter is applied to the whole previous layer, giving us a new filtered image, much as a Sobel operator does.
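As a minimal sketch, the following Python snippet (assuming NumPy and SciPy are available) applies a Sobel kernel to an image, which is exactly the operation a single convolutional filter performs:

    import numpy as np
    from scipy.signal import convolve2d

    # The horizontal Sobel kernel: a fixed, hand-designed 3 x 3 filter.
    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=np.float32)

    image = np.random.rand(100, 100).astype(np.float32)

    # Sliding the kernel over every pixel yields a new filtered image;
    # this is what one channel of a convolutional layer computes
    # (before adding a bias and a nonlinearity).
    filtered = convolve2d(image, sobel_x, mode='same')  # shape (100, 100)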

In a convolutional layer, however, we can set several parameters, among them the number of filters and the filter size to apply to the previous layer or image. The filter values themselves are calculated in the learning step, just like the weights of a classical neural network. This is the magic of deep learning: it can extract the most significant features from labeled images on its own.
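For instance, with the Keras API such a layer can be declared in one line; the filter count and size below mirror the discussion, while the padding and activation choices are illustrative assumptions:

    import tensorflow as tf

    # 64 learnable 3 x 3 filters. Unlike the fixed Sobel kernel above,
    # these filter values start out random and are adjusted during
    # training, just like the weights of a classical neural network.
    conv = tf.keras.layers.Conv2D(filters=64, kernel_size=(3, 3),
                                  padding='same', activation='relu')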

These convolutional layers are the main reason behind the name deep, as the following basic example shows. Imagine we have a 100 x 100 image. In a classical neural network, we would extract the most relevant features we can think of from the input image, typically around 1,000 of them; each hidden layer can then increase or decrease this number, but the number of weights to calculate remains reasonable for a normal computer. In deep learning, however, we normally start by applying a convolutional layer with, say, 64 filter kernels of size 3 x 3. This generates a new layer of 100 x 100 x 64 neurons, with 3 x 3 x 64 weights to learn, and every one of those 640,000 outputs must be computed at each training step. If we continue adding more layers, each new convolution sees all 64 channels of the previous one, so these numbers quickly increase and require huge computing power to learn good weights and parameters for our deep learning architecture.
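The following back-of-the-envelope Python calculation, under the single-channel input assumption above, makes these numbers explicit:

    # Counts for the example above: a 100 x 100 single-channel image
    # passed through a convolutional layer with 64 filters of size 3 x 3.
    h, w, in_ch, n_filters, k = 100, 100, 1, 64, 3

    weights = k * k * in_ch * n_filters      # 3*3*1*64   = 576 weights
    outputs = h * w * n_filters              # 100*100*64 = 640,000 neurons
    macs = outputs * k * k * in_ch           # ~5.8 million multiply-adds

    # A second 64-filter layer sees 64 input channels, so both counts grow:
    weights2 = k * k * 64 * n_filters        # 3*3*64*64  = 36,864 weights
    macs2 = h * w * n_filters * k * k * 64   # ~369 million multiply-adds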

Convolutional layers are one of the most important parts of a deep learning architecture, but there are other important layers too, such as Pooling, Dropout, Flatten, and Softmax. A basic deep learning architecture stacks several convolutional and pooling layers.
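Here is a minimal sketch of such a stack using the Keras API; the specific layer sizes and the 10-class softmax output are illustrative assumptions:

    import tensorflow as tf
    from tensorflow.keras import layers

    model = tf.keras.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu',
                      input_shape=(100, 100, 1)),
        layers.MaxPooling2D((2, 2)),    # pooling: downsample the feature maps
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),               # flatten: 3D feature maps -> 1D vector
        layers.Dropout(0.5),            # dropout: randomly zero units (regularization)
        layers.Dense(10, activation='softmax'),  # softmax: class probabilities
    ])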

There is one more thing that is crucial to getting the best results from deep learning: the amount of labeled data. With a small dataset, a deep learning algorithm will not help your classification, because there is not enough data to learn the features (that is, the weights and parameters of your deep learning architecture). With huge amounts of data, however, you can get very good results, though be aware that learning the weights and parameters of your architecture will take a lot of computation time. This is why deep learning was not widely used in its early days: computation simply took too long. Thanks to new parallel architectures such as NVIDIA GPUs, however, we can now run the backpropagation step efficiently and speed up the learning task.
