
One-stop solution for NLP practitioners, ML developers, and data scientists to build effective NLP systems that can perform complex real-world tasks

Key Features

  • Apply deep learning algorithms and techniques such as BiLSTMs, CRFs, BPE, and more using TensorFlow 2
  • Explore applications such as text generation, summarization, weakly supervised labeling, and more
  • Study cutting-edge material, with seminal papers and full working code provided in the GitHub repository

Book Description

Recently, there have been tremendous advances in NLP, and these advances are now moving from research labs into practical applications. This book offers a blend of the theoretical and practical aspects of trending and complex NLP techniques.

The book is focused on innovative applications in the field of NLP, language generation, and dialogue systems. It helps you apply the concepts of pre-processing text using techniques such as tokenization, part-of-speech tagging, and lemmatization with popular libraries such as Stanford NLP and spaCy. You will build Named Entity Recognition (NER) from scratch using Conditional Random Fields and Viterbi decoding on top of RNNs.
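As a flavor of this kind of pre-processing, the minimal sketch below uses spaCy for tokenization, part-of-speech tagging, and lemmatization. It is an illustrative example in the spirit of the book's approach, not code taken from it, and it assumes the small English model has been installed with python -m spacy download en_core_web_sm.

    import spacy

    # Load the small English pipeline (tokenizer, tagger, lemmatizer).
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("The striped bats were hanging on their feet.")

    # Each token exposes its surface form, part-of-speech tag, and lemma.
    for token in doc:
        print(f"{token.text:10} {token.pos_:6} {token.lemma_}")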

The book covers key emerging areas such as generating text for use in sentence completion and text summarization, bridging images and text by generating captions for images, and managing dialogue aspects of chatbots. You will learn how to apply transfer learning and fine-tuning using TensorFlow 2.
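The transfer learning and fine-tuning workflow mentioned above follows the standard Keras freeze-then-unfreeze pattern sketched below. The encoder here is a deliberately simple stand-in (the book itself works with GloVe embeddings and BERT); treat this as a sketch of the pattern, not the book's model.

    import tensorflow as tf

    # Stand-in for a pre-trained encoder (e.g. pre-trained embeddings or BERT).
    encoder = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=20000, output_dim=128),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    ])
    encoder.trainable = False  # Phase 1: feature extraction only

    model = tf.keras.Sequential([
        encoder,
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. sentiment
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="binary_crossentropy", metrics=["accuracy"])
    # model.fit(train_ds, epochs=3)

    # Phase 2: fine-tuning - unfreeze the encoder and train at a lower rate.
    encoder.trainable = True
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                  loss="binary_crossentropy", metrics=["accuracy"])
    # model.fit(train_ds, epochs=2)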

Further, it covers practical techniques that can simplify the labeling of textual data. Each topic is accompanied by working code that you can adapt to your own use cases.
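One such labeling technique, covered later in the book, is weak supervision with Snorkel labeling functions. The toy example below sketches the general flow: write heuristic labeling functions, apply them to the data, and combine their noisy votes with a label model. The specific functions and data are invented purely for illustration.

    import pandas as pd
    from snorkel.labeling import labeling_function, PandasLFApplier
    from snorkel.labeling.model import LabelModel

    ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

    # Simple keyword heuristics; real labeling functions would be richer.
    @labeling_function()
    def lf_contains_great(x):
        return POSITIVE if "great" in x.text.lower() else ABSTAIN

    @labeling_function()
    def lf_contains_terrible(x):
        return NEGATIVE if "terrible" in x.text.lower() else ABSTAIN

    df_train = pd.DataFrame({"text": ["A great movie", "Terrible acting", "It was fine"]})

    # Apply the labeling functions and combine their noisy votes.
    applier = PandasLFApplier(lfs=[lf_contains_great, lf_contains_terrible])
    L_train = applier.apply(df_train)

    label_model = LabelModel(cardinality=2, verbose=False)
    label_model.fit(L_train, n_epochs=100)
    print(label_model.predict(L_train))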

By the end of the book, you will have advanced knowledge of the tools, techniques, and deep learning architectures used to solve complex NLP problems.

What you will learn

  • Grasp important pre-processing steps in building NLP applications, such as POS tagging
  • Apply transfer learning and weakly supervised learning using libraries such as Snorkel
  • Perform sentiment analysis using BERT
  • Apply encoder-decoder neural network architectures and beam search for summarizing texts
  • Use Transformer models with attention to bring images and text together
  • Build apps that generate captions and answer questions about images using custom Transformers
  • Use advanced TensorFlow techniques like learning rate annealing, custom layers, and custom loss functions to build the latest DeepNLP models (a minimal sketch of these techniques follows this list)
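To make the last item concrete, here is a brief, self-contained sketch of a custom learning rate schedule (annealing) and a custom masked loss function in TensorFlow 2. The exact schedule and loss are illustrative assumptions rather than the book's implementations.

    import tensorflow as tf

    class WarmupThenDecay(tf.keras.optimizers.schedules.LearningRateSchedule):
        """Linear warmup followed by inverse square-root decay (annealing)."""
        def __init__(self, peak_lr=1e-3, warmup_steps=1000):
            super().__init__()
            self.peak_lr = peak_lr
            self.warmup_steps = warmup_steps

        def __call__(self, step):
            step = tf.cast(step, tf.float32)
            warmup = self.peak_lr * step / self.warmup_steps
            decay = self.peak_lr * tf.math.rsqrt(
                tf.maximum(step, 1.0) / self.warmup_steps)
            return tf.minimum(warmup, decay)

    def masked_sparse_ce(y_true, y_pred):
        """Cross-entropy that ignores padded (zero) token positions."""
        loss = tf.keras.losses.sparse_categorical_crossentropy(
            y_true, y_pred, from_logits=True)
        mask = tf.cast(tf.not_equal(y_true, 0), loss.dtype)
        return tf.reduce_sum(loss * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)

    optimizer = tf.keras.optimizers.Adam(learning_rate=WarmupThenDecay())
    # model.compile(optimizer=optimizer, loss=masked_sparse_ce)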

Who this book is for

This is not an introductory book; it assumes the reader is familiar with the basics of NLP and has fundamental Python skills, as well as basic knowledge of machine learning and undergraduate-level calculus and linear algebra.

The readers who will benefit most from this book are intermediate ML developers who are familiar with the basics of supervised learning and deep learning techniques, and professionals who already use TensorFlow/Python for data science, ML, research, or analysis.

Table of Contents

  1. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Reviews
  2. Essentials of NLP
    1. A typical text processing workflow
    2. Data collection and labeling
    3. Collecting labeled data
    4. Development environment setup
    5. Enabling GPUs on Google Colab
    6. Text normalization
    7. Modeling normalized data
    8. Tokenization
    9. Segmentation in Japanese
    10. Modeling tokenized data
    11. Stop word removal
    12. Modeling data with stop words removed
    13. Part-of-speech tagging
    14. Modeling data with POS tagging
    15. Stemming and lemmatization
    16. Vectorizing text
    17. Count-based vectorization
    18. Modeling after count-based vectorization
    19. Term Frequency-Inverse Document Frequency (TF-IDF)
    20. Modeling using TF-IDF features
    21. Word vectors
    22. Pretrained models using Word2Vec embeddings
    23. Summary
  3. Understanding Sentiment in Natural Language with BiLSTMs
    1. Natural language understanding
    2. Bi-directional LSTMs – BiLSTMs
    3. RNN building blocks
    4. Long short-term memory (LSTM) networks
    5. Gated recurrent units (GRUs)
    6. Sentiment classification with LSTMs
    7. Loading the data
    8. Normalization and vectorization
    9. LSTM model with embeddings
    10. BiLSTM model
    11. Summary
  4. Named Entity Recognition (NER) with BiLSTMs, CRFs, and Viterbi Decoding
    1. Named Entity Recognition
    2. The GMB data set
    3. Loading the data
    4. Normalizing and vectorizing data
    5. A BiLSTM model
    6. Conditional random fields (CRFs)
    7. NER with BiLSTM and CRFs
    8. Implementing the custom CRF layer, loss, and model
    9. A custom CRF model
    10. A custom loss function for NER using a CRF
    11. Implementing custom training
    12. Viterbi decoding
    13. The probability of the first word label
    14. Summary
  5. Transfer Learning with BERT
    1. Transfer learning overview
    2. Types of transfer learning
    3. Domain adaptation
    4. Multi-task learning
    5. Sequential learning
    6. IMDb sentiment analysis with GloVe embeddings
    7. GloVe embeddings
    8. Loading IMDb training data
    9. Loading pre-trained GloVe embeddings
    10. Creating a pre-trained embedding matrix using GloVe
    11. Feature extraction model
    12. Fine-tuning model
    13. BERT-based transfer learning
    14. Encoder-decoder networks
    15. Attention model
    16. Transformer model
    17. The bidirectional encoder representations from transformers (BERT) model
    18. Tokenization and normalization with BERT
    19. Pre-built BERT classification model
    20. Custom model with BERT
    21. Summary
  6. Generating Text with RNNs and GPT-2
    1. Generating text – one character at a time
    2. Data loading and pre-processing
    3. Data normalization and tokenization
    4. Training the model
    5. Implementing learning rate decay as custom callback
    6. Generating text with greedy search
    7. Generative Pre-Training (GPT-2) model
    8. Generating text with GPT-2
    9. Summary
  7. Text Summarization with Seq2seq Attention and Transformer Networks
    1. Overview of text summarization
    2. Data loading and pre-processing
    3. Data tokenization and vectorization
    4. Seq2seq model with attention
    5. Encoder model
    6. Bahdanau attention layer
    7. Decoder model
    8. Training the model
    9. Generating summaries
    10. Greedy search
    11. Beam search
    12. Decoding penalties with beam search
    13. Evaluating summaries
    14. ROUGE metric evaluation
    15. Summarization – state of the art
    16. Summary
  8. Multi-Modal Networks and Image Captioning with ResNets and Transformer Networks
    1. Multi-modal deep learning
    2. Vision and language tasks
    3. Image captioning
    4. MS-COCO dataset for image captioning
    5. Image processing with CNNs and ResNet50
    6. CNNs
    7. Convolutions
    8. Pooling
    9. Regularization with dropout
    10. Residual connections and ResNets
    11. Image feature extraction with ResNet50
    12. The Transformer model
    13. Positional encoding and masks
    14. Scaled dot-product and multi-head attention
    15. VisualEncoder
    16. Decoder
    17. Transformer
    18. Training the Transformer model with VisualEncoder
    19. Loading training data
    20. Instantiating the Transformer model
    21. Custom learning rate schedule
    22. Loss and metrics
    23. Checkpoints and masks
    24. Custom training
    25. Generating captions
    26. Improving performance and state-of-the-art models
    27. Summary
  9. Weakly Supervised Learning for Classification with Snorkel
    1. Weak supervision
    2. Inner workings of weak supervision with labeling functions
    3. Using weakly supervised labels to improve IMDb sentiment analysis
    4. Pre-processing the IMDb dataset
    5. Learning a subword tokenizer
    6. A BiLSTM baseline model
    7. Tokenization and vectorizing data
    8. Training using a BiLSTM model
    9. Weakly supervised labeling with Snorkel
    10. Iterating on labeling functions
    11. Naïve-Bayes model for finding keywords
    12. Evaluating weakly supervised labels on the training set
    13. Generating unsupervised labels for unlabeled data
    14. Training BiLSTM on weakly supervised data from Snorkel
    15. Summary
  10. Building Conversational AI Applications with Deep Learning
    1. Overview of conversational agents
    2. Task-oriented or slot-filling systems
    3. Question-answering and MRC conversational agents
    4. General conversational agents
    5. Summary
    6. Epilogue
  11. Installation and Setup Instructions for Code
    1. GitHub location
    2. Chapter 1 installation instructions
    3. Chapter 2 installation instructions
    4. Chapter 3 installation instructions
    5. Chapter 4 installation instructions
    6. Chapter 5 installation instructions
    7. Chapter 6 installation instructions
    8. Chapter 7 installation instructions
    9. Chapter 8 installation instructions
    10. Chapter 9 installation instructions
  12. Other Books You May Enjoy
  13. Index