Now, let's train the document CNN model on the full IMDB dataset by transferring the learned Word2vec embeddings.
The code for this is very similar to the preceding IMDB training code; you just need to exclude the weight-loading part from the Amazon model. The code is available in the repository, in the module named imdb_model.py. Here are the model parameters:
{
    "embedding_dim": 50,
    "train_embedding": true,
    "embedding_regularizer_l2": 0.0,
    "sentence_len": 30,
    "num_sentences": 20,
    "word_kernel_size": 5,
    "word_filters": 30,
    "sent_kernel_size": 5,
    "sent_filters": 16,
    "sent_k_maxpool": 3,
    "input_dropout": 0.4,
    "doc_k_maxpool": 5,
    "sent_dropout": 0.2,
    "hidden_dims": 64,
    "conv_activation": "relu",
    "hidden_activation": "relu",
    "hidden_dropout": 0,
    "num_hidden_layers": 1,
    "hidden_gaussian_noise_sd": 0.3,
    "final_layer_kernel_regularizer": 0.04,
    "hidden_layer_kernel_regularizer": 0.0,
    "learn_word_conv": true,
    "learn_sent_conv": true,
    "num_units_final_layer": 1
}
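Transferring the learned Word2vec embeddings amounts to building an initial weight matrix for the embedding layer from the pretrained vectors. Here is a minimal sketch of that step; the `word_index` vocabulary and the `w2v` vector lookup are hypothetical stand-ins for your tokenizer's vocabulary and the trained Word2vec model:

```python
import numpy as np

embedding_dim = 50  # matches "embedding_dim" in the parameters above

# Hypothetical stand-ins: a tokenizer vocabulary (word -> integer id)
# and a lookup of pretrained Word2vec vectors
word_index = {"movie": 1, "great": 2, "boring": 3}
w2v = {"movie": np.ones(embedding_dim), "great": np.full(embedding_dim, 0.5)}

# Row i of the matrix initializes the embedding for word id i; words absent
# from the Word2vec vocabulary (and id 0, reserved for padding) stay zero
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, idx in word_index.items():
    if word in w2v:
        embedding_matrix[idx] = w2v[word]

# In Keras, this matrix would seed the embedding layer, for example:
# Embedding(input_dim=embedding_matrix.shape[0], output_dim=embedding_dim,
#           weights=[embedding_matrix], trainable=True)
```

With `train_embedding` set to true, the transferred vectors serve only as an initialization and continue to be fine-tuned on IMDB.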
While training, we used another trick to avoid overfitting: we freeze the embedding layers (that is, set train_embedding=False) after the first 10 epochs and train only the remaining layers. After 50 epochs, we achieved 89% accuracy on the IMDB dataset, which is the result claimed in the paper. We also observed that if we don't initialize the embedding weights with the pretrained Word2vec vectors before training, the model starts overfitting and cannot push validation accuracy beyond 80%.
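In Keras, this freezing trick boils down to two training phases with the embedding layer's trainable flag flipped in between, recompiling so the change takes effect. A sketch with a toy stand-in model (the real model lives in imdb_model.py; the layer names, data, and single-epoch fit calls here are illustrative only):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, GlobalAveragePooling1D, Dense

# Toy stand-in for the document CNN: only the freezing pattern matters here
model = Sequential([
    Embedding(input_dim=100, output_dim=50, name="embedding"),
    GlobalAveragePooling1D(),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Synthetic data in place of the IMDB word-id sequences
x = np.random.randint(1, 100, size=(32, 30))
y = np.random.randint(0, 2, size=(32, 1))

# Phase 1: embeddings trainable (train_embedding=True);
# in the text this phase runs for the first 10 epochs
model.fit(x, y, epochs=1, verbose=0)

# Phase 2: freeze the embedding layer, recompile, and continue training
# only the remaining layers (through epoch 50 in the text)
model.get_layer("embedding").trainable = False
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(x, y, epochs=1, verbose=0)
```

After the recompile, only the convolutional and dense weights receive gradient updates, which is what curbs the overfitting observed when the embeddings keep training.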