Bag of Words models

Bag of Words (BoW) models are NLP models that disregard sentence structure and word order. In a Bag of Words model, we treat each document as exactly that: a bag of words. Each document is a container holding a multiset of words. We ignore sentences, structure, and which words come first or last. We care that the document contains the words very, good, and bad, but we do not care that very comes before good but not before bad.
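To make this concrete, here is a minimal sketch (not from the text) of a bag-of-words representation using Python's standard-library Counter. Two documents with the same words in a different order produce the same bag:

```python
from collections import Counter

def bag_of_words(document: str) -> Counter:
    """Lowercase, split on whitespace, and count word occurrences.

    Word order is discarded entirely: only the counts survive.
    """
    return Counter(document.lower().split())

doc_a = "very good but not bad"
doc_b = "not bad but very good"

# Both documents yield the same bag, because order is ignored.
assert bag_of_words(doc_a) == bag_of_words(doc_b)
```

Note that the bag keeps counts, not just membership, so a document repeating a word twice is distinguishable from one mentioning it once.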

Bag of Words models are simple, require relatively little data, and work amazingly well considering the naivety of the representation.

Note that the use of model here means representation. I'm not referring to a deep learning or machine learning model in the specific sense. Rather, a model in this context is a way to represent text.

Given a document consisting of a set of words, we need to define a strategy for converting each word to a number. We will look at a few strategies in a moment, but first we need to briefly discuss stemming, lemmatization, and stop words.
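As a preview of the simplest such strategy (a sketch of my own, not one of the strategies discussed later), a document can be mapped to a vector of raw counts over a fixed, assumed vocabulary:

```python
def count_vector(document: str, vocabulary: list[str]) -> list[int]:
    """Map a document to a vector of raw word counts.

    Each position in the vector corresponds to one vocabulary word;
    any out-of-vocabulary word is simply ignored.
    """
    words = document.lower().split()
    return [words.count(term) for term in vocabulary]

# Hypothetical three-word vocabulary for illustration.
vocab = ["very", "good", "bad"]
print(count_vector("very very good", vocab))  # [2, 1, 0]
```

Once every document is a fixed-length vector like this, standard machine learning methods can be applied directly.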
