Collecting and preparing the data

We have already addressed important aspects of sourcing market, fundamental, and alternative data, and we will continue to work with examples of these sources as we illustrate the application of the various models.

In addition to the market and fundamental data that we will access through the Quantopian platform, we will also acquire and transform text data as we explore natural language processing, and image data when we look at image processing and recognition. Such data needs to be obtained, cleaned, and validated so that it can be related to trading data, which is typically available in a time-series format. It is equally important to store it in a format that allows for fast access, so that exploration and iteration stay quick. We have recommended the HDF and Parquet formats. For data volumes too large to handle comfortably on a single machine, Apache Spark represents the best solution.
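
As a rough illustration of the storage recommendation above, the following sketch writes a small price table to both HDF and Parquet with pandas and reads it back. The tickers, the file names, and the helper functions `make_prices` and `round_trip` are hypothetical and serve only as an example; the data is randomly generated, not real market data. HDF support in pandas relies on the optional PyTables package and Parquet support on pyarrow or fastparquet, so the sketch skips any format whose dependency is not installed.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def make_prices(n_days=5, tickers=("AAA", "BBB"), seed=42):
    """Build a small synthetic price table (hypothetical tickers, random data)."""
    rng = np.random.default_rng(seed)
    dates = pd.bdate_range("2020-01-01", periods=n_days, name="date")
    # Random walk around 100 so the values loosely resemble prices.
    data = 100 + rng.standard_normal((n_days, len(tickers))).cumsum(axis=0)
    return pd.DataFrame(data, index=dates, columns=list(tickers))


def round_trip(df, directory):
    """Write df to HDF and Parquet, read each back, and return the results.

    A format maps to None when its optional dependency is missing.
    """
    directory = Path(directory)
    handlers = {
        "hdf": (
            lambda p: df.to_hdf(p, key="prices", mode="w"),
            lambda p: pd.read_hdf(p, key="prices"),
        ),
        "parquet": (
            lambda p: df.to_parquet(p),
            lambda p: pd.read_parquet(p),
        ),
    }
    results = {}
    for fmt, (write, read) in handlers.items():
        path = str(directory / f"prices.{fmt}")
        try:
            write(path)
            results[fmt] = read(path)
        except ImportError:
            # PyTables (HDF) or pyarrow/fastparquet (Parquet) is not installed.
            results[fmt] = None
    return results


if __name__ == "__main__":
    prices = make_prices()
    with tempfile.TemporaryDirectory() as tmp:
        for fmt, restored in round_trip(prices, tmp).items():
            status = "skipped (dependency missing)" if restored is None else "ok"
            print(f"{fmt}: {status}")
```

Both formats keep the date index and column labels, so the restored table can be joined to other time-series data without re-parsing. Which of the two is faster depends on the data and the access pattern, so it is worth timing both on a representative sample.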
