contents

preface

acknowledgments

about this book

about the author

about the cover illustration

Part 1 Basics

1 Introduction to natural language processing

1.1 What is natural language processing (NLP)?

What is NLP?

What is not NLP?

AI, ML, DL, and NLP

Why NLP?

1.2 How NLP is used

NLP applications

NLP tasks

1.3 Building NLP applications

Development of NLP applications

Structure of NLP applications

2 Your first NLP application

2.1 Introducing sentiment analysis

2.2 Working with NLP datasets

What is a dataset?

Stanford Sentiment Treebank

Train, validation, and test sets

Loading SST datasets using AllenNLP

2.3 Using word embeddings

What are word embeddings?

Using word embeddings for sentiment analysis

2.4 Neural networks

What are neural networks?

Recurrent neural networks (RNNs) and linear layers

Architecture for sentiment analysis

2.5 Loss functions and optimization

2.6 Training your own classifier

Batching

Putting everything together

2.7 Evaluating your classifier

2.8 Deploying your application

Making predictions

Serving predictions

3 Word and document embeddings

3.1 Introducing embeddings

What are embeddings?

Why are embeddings important?

3.2 Building blocks of language: Characters, words, and phrases

Characters

Words, tokens, morphemes, and phrases

N-grams

3.3 Tokenization, stemming, and lemmatization

Tokenization

Stemming

Lemmatization

3.4 Skip-gram and continuous bag of words (CBOW)

Where word embeddings come from

Using word associations

Linear layers

Softmax

Implementing Skip-gram on AllenNLP

Continuous bag of words (CBOW) model

3.5 GloVe

How GloVe learns word embeddings

Using pretrained GloVe vectors

3.6 fastText

Making use of subword information

Using the fastText toolkit

3.7 Document-level embeddings

3.8 Visualizing embeddings

4 Sentence classification

4.1 Recurrent neural networks (RNNs)

Handling variable-length input

RNN abstraction

Simple RNNs and nonlinearity

4.2 Long short-term memory units (LSTMs) and gated recurrent units (GRUs)

Vanishing gradients problem

Long short-term memory (LSTM)

Gated recurrent units (GRUs)

4.3 Accuracy, precision, recall, and F-measure

Accuracy

Precision and recall

F-measure

4.4 Building AllenNLP training pipelines

Instances and fields

Vocabulary and token indexers

Token embedders and RNNs

Building your own model

Putting it all together

4.5 Configuring AllenNLP training pipelines

4.6 Case study: Language detection

Using characters as input

Creating a dataset reader

Building the training pipeline

Running the detector on unseen instances

5 Sequential labeling and language modeling

5.1 Introducing sequential labeling

What is sequential labeling?

Using RNNs to encode sequences

Implementing a Seq2Seq encoder in AllenNLP

5.2 Building a part-of-speech tagger

Reading a dataset

Defining the model and the loss

Building the training pipeline

5.3 Multilayer and bidirectional RNNs

Multilayer RNNs

Bidirectional RNNs

5.4 Named entity recognition

What is named entity recognition?

Tagging spans

Implementing a named entity recognizer

5.5 Modeling a language

What is a language model?

Why are language models useful?

Training an RNN language model

5.6 Text generation using RNNs

Feeding characters to an RNN

Evaluating text using a language model

Generating text using a language model

Part 2 Advanced models

6 Sequence-to-sequence models

6.1 Introducing sequence-to-sequence models

6.2 Machine translation 101

6.3 Building your first translator

Preparing the datasets

Training the model

Running the translator

6.4 How Seq2Seq models work

Encoder

Decoder

Greedy decoding

Beam search decoding

6.5 Evaluating translation systems

Human evaluation

Automatic evaluation

6.6 Case study: Building a chatbot

Introducing dialogue systems

Preparing a dataset

Training and running a chatbot

Next steps

7 Convolutional neural networks

7.1 Introducing convolutional neural networks (CNNs)

RNNs and their shortcomings

Pattern matching for sentence classification

Convolutional neural networks (CNNs)

7.2 Convolutional layers

Pattern matching using filters

Rectified linear unit (ReLU)

Combining scores

7.3 Pooling layers

7.4 Case study: Text classification

Review: Text classification

Using CnnEncoder

Training and running the classifier

8 Attention and Transformer

8.1 What is attention?

Limitation of vanilla Seq2Seq models

Attention mechanism

8.2 Sequence-to-sequence with attention

Encoder-decoder attention

Building a Seq2Seq machine translation model with attention

8.3 Transformer and self-attention

Self-attention

Transformer

Experiments

8.4 Transformer-based language models

Transformer as a language model

Transformer-XL

GPT-2

XLM

8.5 Case study: Spell-checker

Spell correction as machine translation

Training a spell-checker

Improving a spell-checker

9 Transfer learning with pretrained language models

9.1 Transfer learning

Traditional machine learning

Word embeddings

What is transfer learning?

9.2 BERT

Limitations of word embeddings

Self-supervised learning

Pretraining BERT

Adapting BERT

9.3 Case study 1: Sentiment analysis with BERT

Tokenizing input

Building the model

Training the model

9.4 Other pretrained language models

ELMo

XLNet

RoBERTa

DistilBERT

ALBERT

9.5 Case study 2: Natural language inference with BERT

What is natural language inference?

Using BERT for sentence-pair classification

Using Transformers with AllenNLP

Part 3 Putting into production

10 Best practices in developing NLP applications

10.1 Batching instances

Padding

Sorting

Masking

10.2 Tokenization for neural models

Unknown words

Character models

Subword models

10.3 Avoiding overfitting

Regularization

Early stopping

Cross-validation

10.4 Dealing with imbalanced datasets

Using appropriate evaluation metrics

Upsampling and downsampling

Weighting losses

10.5 Hyperparameter tuning

Examples of hyperparameters

Grid search vs. random search

Hyperparameter tuning with Optuna

11 Deploying and serving NLP applications

11.1 Architecting your NLP application

Before machine learning

Choosing the right architecture

Project structure

Version control

11.2 Deploying your NLP model

Testing

Train-serve skew

Monitoring

Using GPUs

11.3 Case study: Serving and deploying NLP applications

Serving models with TorchServe

Deploying models with SageMaker

11.4 Interpreting and visualizing model predictions

11.5 Where to go from here

index