Keras embedding layer

The Keras embedding layer lets us learn a vector space representation of each input word as we train our model, much as we did with word2vec. When we use the functional API, the embedding layer is normally the second layer in the network, coming directly after the input layer.

We will configure the embedding layer with the following three arguments:

  • input_dim: The size of the vocabulary of the corpus, that is, the largest integer word index plus one.
  • output_dim: The size of the vector space we want to learn. This corresponds to the number of neurons in the word2vec hidden layer.
  • input_length: The number of words in each observation. In the examples that follow, we will use a fixed length based on the longest text we need to send, and we will pad shorter documents with 0s.
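
To make the padding idea concrete, here is a minimal sketch in plain Python. The helper pad_to_length and the word IDs are hypothetical and only for illustration; Keras also ships a pad_sequences utility, which pads at the front of each sequence by default, whereas this sketch pads at the end. Since 0 is used for padding, we assume real word IDs start at 1.

```python
def pad_to_length(token_ids, length, pad_value=0):
    """Truncate or right-pad a list of integer word IDs to exactly `length`."""
    clipped = token_ids[:length]
    return clipped + [pad_value] * (length - len(clipped))


# Hypothetical documents, already converted to integer word IDs.
# ID 0 is reserved for padding, so real words start at 1.
docs = [[12, 7, 431], [5, 88, 9, 2041, 17], [3]]

# Use the longest document to fix the sequence length for every observation.
input_length = max(len(doc) for doc in docs)

padded = [pad_to_length(doc, input_length) for doc in docs]
for row in padded:
    print(row)
```

Every row in padded now has the same length, which is the value we would use as the sequence length of the network's input.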

For each input document, an embedding layer outputs a 2D matrix that contains one vector for each word in the sequence, so the matrix has input_length rows and output_dim columns.

As an example, we may have an embedding layer that looks like this:

Embedding(input_dim=10000, output_dim=128, input_length=10)

In this case, the output of this layer would be a 2D matrix of shape 10 x 128, in which each of the document's 10 words has its own 128-element vector.
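
We can check that shape with a short sketch. It assumes TensorFlow's bundled Keras is installed and uses a made-up batch of two random documents. I've left out input_length here, because the layer can read the sequence length from the input itself and some newer Keras releases treat that argument as deprecated. Note the extra leading dimension for the batch.

```python
import numpy as np
from tensorflow import keras

# Assumed setup: a batch of 2 documents, each already padded to 10 integer word IDs.
batch = np.random.randint(low=1, high=10000, size=(2, 10))

# Same vocabulary size and vector size as the example above.
embedding = keras.layers.Embedding(input_dim=10000, output_dim=128)

# Each word ID is replaced by its 128-element vector.
vectors = embedding(batch)

print(vectors.shape)  # (2, 10, 128): one 10 x 128 matrix per document
```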

Sequences of words like this serve as excellent inputs to LSTMs. An LSTM layer can immediately follow an embedding layer. We can treat these 10 rows from the embedding layer as sequenced input for an LSTM, exactly like we did in the previous chapter. I'll be using an LSTM in the first example for this chapter, so if you arrived here without reading Chapter 9, Training an RNN from scratch, you might want to take a moment to review how LSTMs operate, which is covered there.
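
As a rough sketch of that arrangement, the following builds a tiny functional API model with an input layer, an embedding layer, and an LSTM. The 32 LSTM units, the single sigmoid output, and the random input data are all assumptions made for illustration, not the network we will build later in the chapter.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Functional API: input layer first, embedding layer second, LSTM third.
inputs = keras.Input(shape=(10,), dtype="int32", name="word_ids")
x = layers.Embedding(input_dim=10000, output_dim=128)(inputs)  # (batch, 10, 128)
x = layers.LSTM(32)(x)  # (batch, 32); 32 units is an arbitrary choice
outputs = layers.Dense(1, activation="sigmoid")(x)  # hypothetical binary target
model = keras.Model(inputs, outputs)

# Four made-up documents of 10 word IDs each, just to push data through the model.
fake_docs = np.random.randint(1, 10000, size=(4, 10))
predictions = model.predict(fake_docs, verbose=0)

print(predictions.shape)  # (4, 1): one prediction per document
```

The LSTM consumes the 10 word vectors one timestep at a time and returns a single 32-element summary per document, which is why no flattening is needed before the dense layer.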

If we wanted to connect an embedding layer directly to a dense layer, we would need to flatten its output first, but you probably don't want to do that. An LSTM is usually a better choice for sequenced text. There is one other interesting option we should explore, though.
