There is more...

Avi Singh has published further experiments (https://avisingh599.github.io/deeplearning/visual-qa/) that compare several models: a simple bag-of-words representation for the question combined with a CNN for the image, an LSTM-only model, and an LSTM+CNN model similar to the one discussed in this recipe. The blog post also discusses different training strategies for each model.

In addition, interested readers can try a GUI (https://github.com/anujshah1003/VQA-Demo-GUI) built on top of Avi Singh's demo, which lets you load images interactively and ask questions about them. A YouTube video is also available (https://www.youtube.com/watch?v=7FB9PvzOuQY).
