Understanding multilayer perceptrons

In order to create nonlinear decision boundaries, we can combine multiple perceptrons to form a larger network. This is also known as a multilayer perceptron (MLP). MLPs usually consist of at least three layers, where the first layer has a node (or neuron) for every input feature of the dataset, and the last layer has a node for every class label. The layer in between is called the hidden layer.

An example of this feedforward neural network architecture is shown in the following diagram:

In this network, every circle is an artificial neuron (or, essentially, a perceptron), and the output of one artificial neuron might serve as input to the next artificial neuron, much like how real biological neurons are wired up in the brain. By placing perceptrons side by side, we get a single one-layer neural network. Analogously, by stacking one one-layer neural network upon the other, we get a multilayer neural network—an MLP.
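
To make the layer structure concrete, here is a minimal sketch of a forward pass through a one-hidden-layer MLP written in plain NumPy. The layer sizes (four input features, five hidden neurons, three output classes) are arbitrary choices for illustration, not values taken from the diagram:

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed layer sizes: one input node per feature, one output node per class
n_features, n_hidden, n_classes = 4, 5, 3

# One weight matrix and bias vector per layer of connections
W1 = rng.normal(size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """Propagate a single sample through the network."""
    hidden = sigmoid(x @ W1 + b1)       # hidden-layer activations
    output = sigmoid(hidden @ W2 + b2)  # one score per class label
    return output

x = rng.normal(size=n_features)  # a single input sample
print(forward(x))                # three class scores
```

Each neuron computes a weighted sum of its inputs, adds a bias, and passes the result through a nonlinear activation function, exactly as a single perceptron does; the network is simply many such units wired layer to layer.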

One remarkable property of MLPs is that, given enough neurons, they can approximate any mathematical function to arbitrary accuracy. This is known as the universal approximation property. For example, an MLP with a single hidden layer (such as the one shown in the preceding diagram) can learn the following:

  • Any Boolean function (such as AND, OR, NOT, NAND, and so on), which it can represent exactly
  • Any bounded continuous function (such as the sine or cosine functions), which it can approximate to arbitrary accuracy

Even better, if you add another hidden layer, an MLP can approximate arbitrary functions, including discontinuous ones. You can see why these neural networks are so powerful: give them enough neurons and enough layers, and they can, in principle, learn almost any input-output function!
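
A classic illustration of this power is the XOR function, which is not linearly separable and therefore cannot be learned by a single perceptron, but is learned easily once a hidden layer is added. The sketch below uses scikit-learn's MLPClassifier purely for demonstration; it is not necessarily the library used elsewhere in this chapter, and the hidden-layer size and solver are arbitrary choices:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# XOR truth table: no single straight line separates the two classes
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# One hidden layer with 8 neurons; lbfgs works well on tiny datasets
mlp = MLPClassifier(hidden_layer_sizes=(8,), activation='tanh',
                    solver='lbfgs', max_iter=2000, random_state=0)
mlp.fit(X, y)

print(mlp.predict(X))  # expected: [0 1 1 0]
```

With the hidden layer removed (a plain perceptron), at least one of the four XOR points is always misclassified, which is exactly the limitation that motivated stacking perceptrons into MLPs in the first place.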
