5 - Sequence-to-sequence models for building chatbots

We're learning a lot and doing some valuable work! In our hypothetical business use case, this chapter builds directly on chapter 4, where we created our Natural Language Processing pipeline. The computational linguistics skills we've learned so far should give us the confidence to move beyond the training examples in this book and tackle the next project: building a more advanced chatbot for our hypothetical restaurant chain to automate the fielding of call-in orders.

Meeting this requirement means combining a number of the technologies we've covered so far. For this project, we'll focus on building a chatbot that is more contextually aware and robust, one that we could integrate into a larger system in this hypothetical. By demonstrating mastery on this training example, we'll have the confidence to execute a similar project in a real situation.

In the previous chapters, we learned about representational learning methods such as word2vec, and we used them in combination with a type of deep learning algorithm called the convolutional neural network (CNN). But there are a few constraints when using CNNs to build language models:

  • They cannot preserve state information between inputs.
  • Sentence lengths must be fixed for both inputs and outputs.
  • CNNs are sometimes unable to adequately handle complex sequential context.
  • RNNs, by contrast, are better at modeling information in sequence.

To overcome these problems, we can turn to an alternative class of algorithm that is specially designed to handle input data that comes in the form of sequences (such as sequences of words or characters). This class of algorithm is called the recurrent neural network (RNN).
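To build some intuition before we dig in, here is a minimal sketch of a vanilla RNN step in NumPy. All of the names and dimensions (`rnn_step`, `W_xh`, `W_hh`, the toy sizes) are illustrative assumptions, not code from this book; the point is simply that, unlike a CNN, the hidden state carries context forward from one timestep to the next.

```python
# A minimal sketch of a vanilla RNN cell, assuming toy dimensions.
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrent step: the new hidden state mixes the current
    input with the previous hidden state, preserving sequence context."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 8, 5

# Hypothetical (untrained) parameters, just to run the loop.
W_xh = rng.normal(scale=0.1, size=(input_size, hidden_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

h = np.zeros(hidden_size)               # initial state
sequence = rng.normal(size=(seq_len, input_size))
for x_t in sequence:                    # state carries forward each step
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)

print(h.shape)
```

The same loop works for a sequence of any length, which is exactly the flexibility the fixed-size CNN input lacks.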

In this chapter, we will:

  • Learn about RNNs and their various forms
  • Create a language model implementation using an RNN
  • Build our intuition about the Long Short-Term Memory (LSTM) model
  • Create an LSTM language model implementation and compare it to the RNN
  • Implement an encoder-decoder RNN based on the LSTM unit for a simple sequence-to-sequence question-answering task

Define the goal:
Build a more robust chatbot with memory to provide more contextually correct responses to questions.

Let's get started!
