Part 3 – Natural Language Processing

Part three focuses on text data and introduces state-of-the-art unsupervised learning techniques to extract high-quality signals from this key source of alternative data.

Chapter 13, Working with Text Data, demonstrates how to convert text data into a numerical format and applies the classification algorithms from part two to sentiment analysis on large datasets. Chapter 14, Topic Modeling, applies Bayesian unsupervised learning to extract latent topics that can summarize a large number of documents, offer more effective ways to explore text data, or serve as features for a classification model. It demonstrates how to apply this technique to earnings call transcripts sourced in Chapter 3, Alternative Data for Finance, and to annual reports filed with the Securities and Exchange Commission (SEC).

Chapter 15, Word Embeddings, uses neural networks to learn state-of-the-art language features in the form of word vectors that capture semantic context much better than traditional text features and represent a very promising avenue for extracting trading signals from text data.
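
To make the idea of converting text into a numerical format concrete, the following is a minimal sketch of a bag-of-words document-term matrix using only the Python standard library. It is not the implementation from the chapters: the function names `tokenize` and `document_term_matrix`, the simple regular-expression tokenizer, and the two example sentences are all assumptions made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a deliberately naive rule)."""
    return re.findall(r"[a-z']+", text.lower())


def document_term_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in document i."""
    tokenized = [tokenize(doc) for doc in docs]
    # The vocabulary is the sorted set of all tokens seen in any document.
    vocab = sorted({token for tokens in tokenized for token in tokens})
    column = {token: j for j, token in enumerate(vocab)}
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocab)
        for token, count in Counter(tokens).items():
            row[column[token]] = count
        matrix.append(row)
    return vocab, matrix


# Hypothetical toy documents, invented for this sketch.
docs = [
    "Revenue beat expectations and margins improved",
    "Revenue missed expectations and guidance was cut",
]
vocab, dtm = document_term_matrix(docs)
```

Each row of `dtm` is a numeric feature vector for one document, which is the kind of input a classification algorithm can consume. A production pipeline would typically add steps such as stop-word removal and term weighting, which this sketch omits.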
