NLP terminology

Let's start by defining a few common terms, so that we remove any ambiguity their use might cause. I know that, since you can read, you likely have some understanding of these terms. I apologize if this seems pedantic, but I promise it relates directly to the models we talk about next:

  • Word: The atomic element of most of the systems we will be using. While some character-level models do exist, we won't be talking about them today.
  • Sentence: A collection of words that expresses a statement, question, and so on. 
  • Document: A collection of sentences. It might be a single sentence or, more likely, several sentences.
  • Corpus: A collection of documents.
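
To make the nesting of these terms concrete, here is a minimal sketch in plain Python. The helper names split_sentences and split_words are hypothetical, and the regular-expression rules are deliberately naive assumptions made for illustration; a real tokenizer would handle abbreviations, punctuation, and other edge cases far more carefully.

```python
import re

def split_sentences(document):
    # Naive assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]

def split_words(sentence):
    # Naive assumption: a word is a lowercase run of letters, digits, or apostrophes.
    return re.findall(r"[a-z0-9']+", sentence.lower())

# A corpus is a collection of documents (here, two short strings).
corpus = [
    "The cat sat on the mat. It was happy.",
    "Dogs bark.",
]

# corpus -> documents -> sentences -> words
tokenized = [
    [split_words(sentence) for sentence in split_sentences(document)]
    for document in corpus
]

for doc_index, document in enumerate(tokenized):
    print(f"document {doc_index}: {len(document)} sentence(s)")
    for sentence in document:
        print("   ", sentence)
```

The result is a list of documents, each of which is a list of sentences, each of which is a list of words. Note that the second document contains only one sentence, which is still a perfectly valid document under the definition above.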