What makes a neuron recurrent?

Recurrent neural networks have loops, which allow information to persist from one prediction to the next. This means that the output of each neuron depends on both the current input and the previous outputs of the network, as shown in the following image:

If we were to unroll this diagram across time, it would look more like the following graph. This idea of the network feeding information back into itself is where the term recurrent comes from, although as a CS major I always think of it as a recursive neural network.

In the preceding diagram, we can see that neuron A takes in input xt0 and outputs ht0 at time step 0. Then at time step 1, the neuron uses input xt1, together with a signal from its previous time step, to output ht1. At time step 2, it considers its input xt2 and the signal from the previous time step, which may still contain information from time step 0. We continue this way until we reach the final time step in the sequence, the network growing its memory from step to step.
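To make the unrolling concrete, here is a minimal Python sketch of that loop over time. The step function is a hypothetical stand-in for the cell defined by the equations that follow; the point is simply that the same cell is applied at every time step, and the hidden state h is the piece of information carried forward:

```python
def run_rnn(step, inputs, h0):
    """Apply the same cell at every time step.

    step(h_prev, x_t) -> h_t   (hypothetical cell function)
    inputs: sequence of input vectors x_0, x_1, x_2, ...
    h0:     initial hidden state
    """
    h = h0
    outputs = []
    for x_t in inputs:      # t = 0, 1, 2, ...
        h = step(h, x_t)    # h_t depends on x_t AND on h_{t-1}
        outputs.append(h)
    return outputs
```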

Standard RNNs use one weight matrix to mix in the previous time step's signal and another to transform the current time step's input. The two products are summed, along with a bias, before being fed through a non-linear function, most often a hyperbolic tangent. For each time step, this looks like the following:
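Assuming a bias term b on the hidden pre-activation (only the output bias, c, is named explicitly below), the per-time-step update can be written as:

$$a_t = b + W h_{t-1} + U x_t$$
$$h_t = \tanh(a_t)$$
$$o_t = c + V h_t$$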

Here, at is a linear combination of the previous time step's output and the current time step's input, parameterized by the weight matrices W and U respectively. Once at has been calculated, it's passed through a non-linear function, most often the hyperbolic tangent, to produce ht. Finally, the neuron's output ot combines ht with a weight matrix, V, and a bias, c.
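Here is a small NumPy sketch of one such step. The dimensions, the random weights, and the name rnn_step are illustrative choices, not something fixed by the math:

```python
import numpy as np

def rnn_step(x_t, h_prev, W, U, V, b, c):
    """One vanilla-RNN time step, matching the a_t / h_t / o_t description above."""
    a_t = b + W @ h_prev + U @ x_t   # linear combination of old state and new input
    h_t = np.tanh(a_t)               # squashing non-linearity
    o_t = c + V @ h_t                # output projection
    return h_t, o_t

# Toy usage: 4 time steps of 3-dimensional inputs, a 5-unit hidden state, 2 outputs.
rng = np.random.default_rng(0)
W, U = rng.normal(size=(5, 5)), rng.normal(size=(5, 3))
V, b, c = rng.normal(size=(2, 5)), np.zeros(5), np.zeros(2)

h = np.zeros(5)
for x in rng.normal(size=(4, 3)):
    h, o = rnn_step(x, h, W, U, V, b, c)
```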

As you look at this structure, try to imagine a situation where you have some very important information very early in the sequence. The longer the sequence gets, the more likely it is that this important early information will be forgotten, because new signals easily overpower old ones. Mathematically, the gradient of the unit is multiplied by roughly the same weight matrix at every step of backpropagation through time, so it will either vanish or explode.
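A quick numerical sketch of why that happens (illustrative only; the real gradient also includes the tanh derivatives, which are at most 1 and so make vanishing even more likely). Backpropagating through many steps repeatedly multiplies the gradient by the same matrix, so its norm shrinks or grows exponentially with the number of steps:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(5, 5))
W /= np.max(np.abs(np.linalg.eigvals(W)))   # rescale so W's spectral radius is 1

grad = np.ones(5)
for scale, label in [(0.9, "slightly small weights"), (1.1, "slightly large weights")]:
    g = grad.copy()
    for _ in range(50):                      # 50 time steps of backpropagation
        g = (scale * W).T @ g
    print(f"{label}: gradient norm after 50 steps = {np.linalg.norm(g):.6f}")
```

With weights just a little too small the gradient all but disappears after 50 steps, and with weights just a little too large it blows up, which is exactly the vanishing/exploding behavior described above.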

This is a major shortcoming of standard RNNs. In practice, traditional RNNs struggle to learn really long-term interactions in a sequence. They're forgetful!

Next, let's take a look at Long Short-Term Memory (LSTM) networks, which can overcome this limitation.
