Encoders and decoders

Sequence-to-sequence models are composed of two separate components, an encoder and a decoder:

  • Encoder: The encoder portion of the model takes an input sequence and returns both an output and the network's internal state. We don't actually care about the output here; we only keep the encoder's state, which serves as a memory of the input sequence.
  • Decoder: The decoder portion of the model takes the encoder's state, called the context (or conditioning), as its input. It then predicts the next element of the target sequence at each time step, given the output of the previous time step. (Both components are sketched in code after this list.)

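To make this concrete, here is a minimal sketch of the two components in PyTorch. The names (Encoder, Decoder, vocab_size, hidden_size) and the choice of a GRU are illustrative assumptions, not details taken from the text:

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        def __init__(self, vocab_size, hidden_size):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden_size)
            self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)

        def forward(self, src):
            # output holds the per-step activations; hidden is the final
            # internal state, the "memory" of the input sequence.
            output, hidden = self.rnn(self.embed(src))
            return hidden  # discard the output, keep only the state

    class Decoder(nn.Module):
        def __init__(self, vocab_size, hidden_size):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden_size)
            self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
            self.out = nn.Linear(hidden_size, vocab_size)

        def forward(self, token, hidden):
            # One decoding step: the previous token plus the current state
            # produce logits for the next token and an updated state.
            output, hidden = self.rnn(self.embed(token), hidden)
            return self.out(output), hidden
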
The encoder and decoder then work together as pictured below, taking an input sequence and generating an output sequence. As you can see, we use special characters to represent the start and end of the sequence.

We know to stop generating output once the end-of-sequence character, which I'll call <EOS>, is generated.
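
Tying the two pieces together, the greedy decoding loop below runs the decoder one step at a time and stops as soon as <EOS> is produced. It assumes the hypothetical Encoder and Decoder sketched above; sos_id, eos_id, and max_len are likewise illustrative parameters:

    def generate(encoder, decoder, src, sos_id, eos_id, max_len=50):
        hidden = encoder(src)                 # context from the input sequence
        token = torch.tensor([[sos_id]])      # start-of-sequence token
        result = []
        for _ in range(max_len):
            logits, hidden = decoder(token, hidden)
            token = logits.argmax(dim=-1)     # greedy choice of next token
            if token.item() == eos_id:        # stop once <EOS> is generated
                break
            result.append(token.item())
        return result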

While this example covers machine translation, other applications of sequence-to-sequence learning work exactly the same way.
