How neural language models learn usage in context

Word embeddings result from training a shallow neural network to predict a word given its context. Whereas traditional language models define context as the words preceding the target, word-embedding models use the words contained in a symmetric window surrounding the target. In contrast, the bag-of-words model uses entire documents as context and captures the co-occurrence of words with (weighted) counts rather than predictive vectors.
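To make the distinction concrete, the following minimal sketch (with a hypothetical example sentence and a window size of 2) shows how a symmetric context window turns a token sequence into (target, context) training pairs, whereas the bag-of-words view reduces the same document to unordered counts:

```python
from collections import Counter

def context_pairs(tokens, window=2):
    """Yield (target, context_word) pairs from a symmetric window around each target."""
    for i, target in enumerate(tokens):
        start, stop = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                yield target, tokens[j]

tokens = "the central bank raised interest rates again".split()

# Window-based context: the pairs a word-embedding model learns to predict
pairs = list(context_pairs(tokens, window=2))
print(pairs[:5])

# Bag-of-words context: unordered (weighted) counts over the whole document
print(Counter(tokens))
```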

Earlier neural language models included nonlinear hidden layers that increased computational complexity. Word2vec and its extensions simplified the architecture to enable training on large datasets (Wikipedia, for example, contains over two billion tokens; see Chapter 17, Deep Learning, for additional details on feed-forward networks).
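As a rough illustration of what training such a simplified model looks like in practice, the sketch below uses the gensim library's Word2Vec class on a toy corpus (gensim 4.x parameter names are assumed; the three example sentences are hypothetical, and a real run would stream a large corpus such as a Wikipedia dump instead):

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (hypothetical examples)
sentences = [
    "the central bank raised interest rates".split(),
    "the bank lowered interest rates last year".split(),
    "rising rates weighed on equity markets".split(),
]

# sg=1 selects the skip-gram architecture; sg=0 would use CBOW.
# window controls the symmetric context window described above.
model = Word2Vec(
    sentences,
    vector_size=50,   # dimensionality of the learned embeddings
    window=2,         # symmetric context window size
    min_count=1,      # keep every token in this tiny example
    sg=1,
    epochs=50,
)

print(model.wv["rates"][:5])                  # first few embedding components
print(model.wv.most_similar("rates", topn=3))  # nearest neighbors in embedding space
```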
