
Get well-versed in both traditional and modern natural language processing concepts and techniques

Key Features

  • Perform various NLP tasks to build linguistic applications using Python libraries
  • Understand, analyze, and generate text to provide accurate results
  • Interpret human language using various NLP concepts, methodologies, and tools

Book Description

Natural Language Processing (NLP) is a subfield of computational linguistics that enables computers to understand, process, and analyze text. This book meets the demand for hands-on training in NLP concepts and provides exposure to real-world applications along with a solid theoretical grounding.

This book starts by introducing you to the field of NLP and its applications, along with the modern Python libraries you'll use to build your NLP-powered apps. With the help of practical examples, you'll learn how to build reasonably sophisticated NLP applications and explore the methodologies and challenges involved in deploying them in the real world. You'll work through key NLP tasks such as text classification, semantic embedding, sentiment analysis, machine translation, and chatbot development using machine learning and deep learning techniques. The book also shows how machine learning techniques play a vital role in making your linguistic apps smart. Every chapter is accompanied by examples of real-world applications to help you build impressive NLP applications of your own.

By the end of this NLP book, you'll be able to work with language data, use machine learning to identify patterns in text, and be up to speed with recent advancements in NLP.

What you will learn

  • Understand how NLP powers modern applications
  • Explore key NLP techniques to build your natural language vocabulary
  • Transform text data into mathematical data structures and learn how to improve text mining models (see the short sketch after this list)
  • Discover how various neural network architectures work with natural language data
  • Get the hang of building sophisticated text processing models using machine learning and deep learning
  • Check out state-of-the-art architectures that have revolutionized research in the NLP domain
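
To give a flavor of the "Transform text data into mathematical data structures" point above, here is a minimal, illustrative sketch (not taken from the book) that turns two toy sentences into Bag-of-Words and TF-IDF vectors with scikit-learn, the kind of transformation covered in the Transforming Text into Data Structures chapter:

    # Minimal sketch (not from the book): Bag-of-Words and TF-IDF vectors
    # built with scikit-learn's CountVectorizer and TfidfVectorizer.
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    docs = [
        "Natural language processing with Python is fun",
        "Python makes processing natural language text approachable",
    ]

    # Bag-of-Words: each document becomes a vector of raw word counts.
    bow = CountVectorizer()
    counts = bow.fit_transform(docs)
    print(bow.get_feature_names_out())  # vocabulary learned from the corpus
    print(counts.toarray())             # document-term count matrix

    # TF-IDF: the same counts, re-weighted so that words shared by
    # every document contribute less to each vector.
    tfidf = TfidfVectorizer()
    print(tfidf.fit_transform(docs).toarray())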

Who this book is for

This NLP Python book is for anyone looking to learn both the theoretical and practical aspects of NLP. It starts with the basics and gradually covers advanced concepts, making it easy to follow for readers with varying levels of NLP proficiency. This comprehensive guide will help you develop a thorough understanding of NLP methodologies for building linguistic applications; however, a working knowledge of the Python programming language and high-school-level mathematics is expected.

Table of Contents

  1. Title Page
  2. Copyright and Credits
    1. Hands-On Python Natural Language Processing
  3. About Packt
    1. Why subscribe?
  4. Contributors
    1. About the authors
    2. About the reviewers
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Reviews
  6. Section 1: Introduction
  7. Understanding the Basics of NLP
    1. Programming languages versus natural languages
    2. Understanding NLP
    3. Why should I learn NLP?
    4. Current applications of NLP
    5. Chatbots
    6. Sentiment analysis
    7. Machine translation
    8. Named-entity recognition
    9. Future applications of NLP
    10. Summary
  8. NLP Using Python
    1. Technical requirements
    2. Understanding Python with NLP 
    3. Python's utility in NLP
    4. Important Python libraries
    5. NLTK
    6. NLTK corpora
    7. Text processing
    8. Part-of-speech tagging
    9. TextBlob
    10. Sentiment analysis
    11. Machine translation
    12. Part-of-speech tagging
    13. VADER
    14. Web scraping libraries and methodology
    15. Overview of Jupyter Notebook
    16. Summary
  9. Section 2: Natural Language Representation and Mathematics
  10. Building Your NLP Vocabulary
    1. Technical requirements
    2. Lexicons
    3. Phonemes, graphemes, and morphemes
    4. Tokenization 
    5. Issues with tokenization
    6. Different types of tokenizers
    7. Regular expressions 
    8. Regular expressions-based tokenizers
    9. Treebank tokenizer
    10. TweetTokenizer 
    11. Understanding word normalization
    12. Stemming
    13. Over-stemming and under-stemming
    14. Lemmatization 
    15. WordNet lemmatizer
    16. spaCy lemmatizer
    17. Stopword removal
    18. Case folding
    19. N-grams
    20. Taking care of HTML tags
    21. How does all this fit into my NLP pipeline?
    22. Summary
  11. Transforming Text into Data Structures
    1. Technical requirements
    2. Understanding vectors and matrices
    3. Vectors
    4. Matrices
    5. Exploring the Bag-of-Words architecture
    6. Understanding a basic CountVectorizer
    7. Out-of-the-box features offered by CountVectorizer
    8. Prebuilt dictionary and support for n-grams
    9. max_features
    10. min_df and max_df thresholds
    11. Limitations of the BoW representation
    12. TF-IDF vectors
    13. Building a basic TF-IDF vectorizer
    14. N-grams and maximum features in the TF-IDF vectorizer 
    15. Limitations of the TF-IDF vectorizer's representation
    16. Distance/similarity calculation between document vectors
    17. Cosine similarity
    18. Solving Cosine math
    19. Cosine similarity on vectors developed using CountVectorizer
    20. Cosine similarity on vectors developed using the TfidfVectorizer tool
    21. One-hot vectorization
    22. Building a basic chatbot
    23. Summary 
  12. Word Embeddings and Distance Measurements for Text
    1. Technical requirements
    2. Understanding word embeddings
    3. Demystifying Word2vec
    4. Supervised and unsupervised learning
    5. Word2vec – supervised or unsupervised?
    6. Pretrained Word2vec 
    7. Exploring the pretrained Word2vec model using gensim
    8. The Word2vec architecture
    9. The Skip-gram method
    10. How do you define target and context words?
    11. Exploring the components of a Skip-gram model
    12. Input vector
    13. Embedding matrix
    14. Context matrix
    15. Output vector
    16. Softmax
    17. Loss calculation and backpropagation
    18. Inference
    19. The CBOW method
    20. Computational limitations of the methods discussed and how to overcome them
    21. Subsampling
    22. Negative sampling
    23. How to select negative samples
    24. Training a Word2vec model 
    25. Building a basic Word2vec model
    26. Modifying the min_count parameter 
    27. Playing with the vector size
    28. Other important configurable parameters
    29. Limitations of Word2vec
    30. Applications of the Word2vec model 
    31. Word mover’s distance
    32. Summary
  13. Exploring Sentence-, Document-, and Character-Level Embeddings
    1. Technical requirements
    2. Venturing into Doc2Vec
    3. Building a Doc2Vec model
    4. Changing vector size and min_count 
    5. The dm parameter for switching between modeling approaches
    6. The dm_concat parameter
    7. The dm_mean parameter
    8. Window size
    9. Learning rate
    10. Exploring fastText 
    11. Building a fastText model
    12. Building a spelling corrector/word suggestion module using fastText
    13. fastText and document distances
    14. Understanding Sent2Vec and the Universal Sentence Encoder
    15. Sent2Vec
    16. The Universal Sentence Encoder
    17. Summary 
  14. Section 3: NLP and Learning
  15. Identifying Patterns in Text Using Machine Learning
    1. Technical requirements
    2. Introduction to ML
    3. Data preprocessing
    4. NaN values
    5. Label encoding and one-hot encoding
    6. Data standardization
    7. Min-max standardization
    8. Z-score standardization
    9. The Naive Bayes algorithm
    10. Building a sentiment analyzer using the Naive Bayes algorithm
    11. The SVM algorithm
    12. Building a sentiment analyzer using SVM
    13. Productionizing a trained sentiment analyzer
    14. Summary 
  16. From Human Neurons to Artificial Neurons for Understanding Text
    1. Technical requirements
    2. Exploring the biology behind neural networks
    3. Neurons
    4. Activation functions
    5. Sigmoid
    6. Tanh activation
    7. Rectified linear unit
    8. Layers in an ANN
    9. How does a neural network learn?
    10. How does the network get better at making predictions?
    11. Understanding regularization
    12. Dropout
    13. Let's talk Keras
    14. Building a question classifier using neural networks
    15. Summary
  17. Applying Convolutions to Text
    1. Technical requirements
    2. What is a CNN?
    3. Understanding convolutions
    4. Let's pad our data
    5. Understanding strides in a CNN
    6. What is pooling?
    7. The fully connected layer
    8. Detecting sarcasm in text using CNNs
    9. Loading the libraries and the dataset
    10. Performing basic data analysis and preprocessing our data
    11. Loading the Word2Vec model and vectorizing our data
    12. Splitting our dataset into train and test sets
    13. Building the model
    14. Evaluating and saving our model
    15. Summary
  18. Capturing Temporal Relationships in Text
    1. Technical requirements
    2. Baby steps toward understanding RNNs
    3. Forward propagation in an RNN
    4. Backpropagation through time in an RNN
    5. Vanishing and exploding gradients
    6. Architectural forms of RNNs
    7. Different flavors of RNN
    8. Carrying relationships both ways using bidirectional RNNs
    9. Going deep with RNNs
    10. Giving memory to our networks – LSTMs
    11. Understanding an LSTM cell
    12. Forget gate
    13. Input gate
    14. Output gate
    15. Backpropagation through time in LSTMs
    16. Building a text generator using LSTMs
    17. Exploring memory-based variants of the RNN architecture
    18. GRUs
    19. Stacked LSTMs
    20. Summary
  19. State of the Art in NLP
    1. Technical requirements
    2. Seq2Seq modeling
    3. Encoders
    4. Decoders
    5. The training phase
    6. The inference phase
    7. Translating between languages using Seq2Seq modeling 
    8. Let's pay some attention
    9. Transformers 
    10. Understanding the architecture of Transformers
    11. Encoders 
    12. Decoders
    13. Self-attention
    14. How does self-attention work mathematically?
    15. A small note on masked self-attention
    16. Feedforward neural networks
    17. Residuals and layer normalization
    18. Positional embeddings
    19. How the decoder works
    20. The linear layer and the softmax function
    21. Transformer model summary
    22. BERT 
    23. The BERT architecture
    24. The BERT model input and output
    25. How did the BERT pre-training happen?
    26. The masked language model
    27. Next-sentence prediction 
    28. BERT fine-tuning
    29. Summary
  20. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think