Next Word Prediction with Recurrent Neural Networks

So far, we've covered a number of basic neural network architectures and their learning algorithms. These are the necessary building blocks for designing networks that are capable of more advanced tasks, such as machine translation, speech recognition, time series prediction, and image segmentation. In this chapter, we'll cover a class of algorithms/architectures that excel at these and other tasks due to their ability to model sequential dependencies in the data.

These algorithms have proven to be incredibly powerful, and their variants have found wide application in industry and consumer use cases. These run the gamut from machine translation and text generation to named entity recognition and sensor data analysis. When you say "Okay, Google!" or "Hey, Siri!", behind the scenes, a trained recurrent neural network (RNN) is performing inference. The common theme of all of these applications is that the sequences involved (such as a sensor reading at time x, or the occurrence of a word in a corpus at position y) can all be modeled with time as their regulating dimension. As we will see, we can represent our data and structure our tensors accordingly.
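To make that concrete, consider a short word sequence encoded as a one-hot tensor whose leading dimension is time. The following is a minimal sketch in plain Go; the tiny vocabulary and sentence are hypothetical, chosen only to show the shape of the data:

    package main

    import "fmt"

    func main() {
        // Hypothetical three-word vocabulary and sentence.
        vocab := map[string]int{"the": 0, "cat": 1, "sat": 2}
        sentence := []string{"the", "cat", "sat"}

        // Shape: [timeSteps][vocabSize] -- one one-hot row per time step,
        // so the first dimension of the tensor is time.
        oneHot := make([][]float64, len(sentence))
        for t, word := range sentence {
            row := make([]float64, len(vocab))
            row[vocab[word]] = 1.0
            oneHot[t] = row
        }

        for t, row := range oneHot {
            fmt.Printf("t=%d %-4s -> %v\n", t, sentence[t], row)
        }
    }

Stepping through the tensor's first dimension then corresponds to stepping through the sequence in time, which is exactly the structure an RNN consumes.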

A great example of a hard problem is natural language processing and comprehension. If we have a large body of text, say the collected works of Shakespeare, what might we be able to say about this text? We could describe its statistical properties, that is, how many words there are, how many of those words are unique, the total number of characters, and so on. But we also know inherently, from our own experience of reading, that an important property of text/language is sequence, that is, the order in which words appear. That order contributes to our understanding of syntax and grammar, not to mention meaning itself. It is when analyzing this kind of data that the networks we've covered so far fall short.
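Those descriptive statistics are straightforward to compute before any modeling begins. Here is a minimal sketch in plain Go; the short quotation is a hypothetical stand-in for a real corpus:

    package main

    import (
        "fmt"
        "strings"
        "unicode/utf8"
    )

    func main() {
        // Stand-in for a large corpus such as Shakespeare's collected works.
        text := "to be or not to be that is the question"

        // Split on whitespace and tally distinct lowercase words.
        words := strings.Fields(text)
        unique := make(map[string]struct{})
        for _, w := range words {
            unique[strings.ToLower(w)] = struct{}{}
        }

        fmt.Println("total words: ", len(words))
        fmt.Println("unique words:", len(unique))
        fmt.Println("characters:  ", utf8.RuneCountInString(text))
    }

Counts like these say nothing about word order, which is precisely the information an RNN is built to exploit.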

In this chapter, we will learn about the following topics:

  • What a basic RNN is
  • How to train RNNs
  • Improvements to the RNN architecture, including Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) networks
  • How to implement an RNN with LSTM units in Gorgonia