2.1 Contrasting attributes of localist, one-hot representations of words with distributed, vector-based representations
2.2 Traditional machine learning and deep learning representations, by natural language element
8.1 Cross-entropy costs associated with selected example inputs
11.1 Comparison of word2vec architectures
11.2 The words most similar to selected test words from our Project Gutenberg vocabulary
11.3 A confusion matrix
11.4 Four hot dog / not hot dog predictions
11.5 Four hot dog / not hot dog predictions, now with intermediate ROC AUC calculations
11.6 Comparison of the performance of our sentiment classifier model architectures
14.1 Fashion-MNIST categories