Now, let's train the document CNN model on the full IMDB dataset by transferring the learned Word2vec embeddings.
The code for this is very similar to the preceding IMDB training code; you just need to exclude the weight-loading part from the Amazon model. The code is available in the repository, in the module named imdb_model.py. Here are the model parameters:
{
    "embedding_dim": 50,
    "train_embedding": true,
    "embedding_regularizer_l2": 0.0,
    "sentence_len": 30,
    "num_sentences": 20,
    "word_kernel_size": 5,
    "word_filters": 30,
    "sent_kernel_size": 5,
    "sent_filters": 16,
    "sent_k_maxpool": 3,
    "input_dropout": 0.4,
    "doc_k_maxpool": 5,
    "sent_dropout": 0.2,
    "hidden_dims": 64,
    "conv_activation": "relu",
    "hidden_activation": "relu",
    "hidden_dropout": 0,
    "num_hidden_layers": 1,
    "hidden_gaussian_noise_sd": 0.3,
    "final_layer_kernel_regularizer": 0.04,
    "hidden_layer_kernel_regularizer": 0.0,
    "learn_word_conv": true,
    "learn_sent_conv": true,
    "num_units_final_layer": 1
}
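Transferring the learned Word2vec embeddings amounts to building an initial weight matrix for the embedding layer from the pretrained vectors. Here is a minimal sketch of that step; the `word_index` vocabulary and the `w2v` vector lookup are hypothetical stand-ins for your tokenizer's vocabulary and the trained Word2vec model:

```python
import numpy as np

embedding_dim = 50  # matches "embedding_dim" in the parameters above

# Hypothetical stand-ins: a tokenizer vocabulary (word -> integer id)
# and a lookup of pretrained Word2vec vectors
word_index = {"movie": 1, "great": 2, "boring": 3}
w2v = {"movie": np.ones(embedding_dim), "great": np.full(embedding_dim, 0.5)}

# Row i of the matrix initializes the embedding for word id i; words absent
# from the Word2vec vocabulary (and id 0, reserved for padding) stay zero
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, idx in word_index.items():
    if word in w2v:
        embedding_matrix[idx] = w2v[word]

# In Keras, this matrix would seed the embedding layer, for example:
# Embedding(input_dim=embedding_matrix.shape[0], output_dim=embedding_dim,
#           weights=[embedding_matrix], trainable=True)
```

With `train_embedding` set to true, the transferred vectors serve only as an initialization and continue to be fine-tuned on IMDB.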
While training, we used another trick to avoid overfitting: we freeze the embedding layers (that is, set train_embedding=False) after the first 10 epochs and train only the remaining layers. After 50 epochs, we achieved 89% accuracy on the IMDB dataset, which is the result claimed in the paper. We also observed that if we don't initialize the embedding weights with the pretrained Word2vec vectors before training, the model starts overfitting and cannot push validation accuracy beyond 80%.
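In Keras, this freezing trick boils down to two training phases with the embedding layer's trainable flag flipped in between, recompiling so the change takes effect. A sketch with a toy stand-in model (the real model lives in imdb_model.py; the layer names, data, and single-epoch fit calls here are illustrative only):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, GlobalAveragePooling1D, Dense

# Toy stand-in for the document CNN: only the freezing pattern matters here
model = Sequential([
    Embedding(input_dim=100, output_dim=50, name="embedding"),
    GlobalAveragePooling1D(),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Synthetic data in place of the IMDB word-id sequences
x = np.random.randint(1, 100, size=(32, 30))
y = np.random.randint(0, 2, size=(32, 1))

# Phase 1: embeddings trainable (train_embedding=True);
# in the text this phase runs for the first 10 epochs
model.fit(x, y, epochs=1, verbose=0)

# Phase 2: freeze the embedding layer, recompile, and continue training
# only the remaining layers (through epoch 50 in the text)
model.get_layer("embedding").trainable = False
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(x, y, epochs=1, verbose=0)
```

After the recompile, only the convolutional and dense weights receive gradient updates, which is what curbs the overfitting observed when the embeddings keep training.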