Data integration

The value of data for an investment strategy often depends on combining complementary sources of market, fundamental, and alternative data. We saw that the predictive power of ML algorithms like tree-based ensembles or neural networks is due in part to their ability to detect non-linear relationships, in particular interaction effects among variables.

The ability to modulate the impact of a variable as a function of other model features thrives on data inputs that capture different aspects of a target outcome. The combination of asset prices with macro fundamentals, social sentiment, credit card payment, and satellite data will likely yield significantly more reliable predictions throughout different economic and market regimes than each source on its own (provided the dataset is large enough to learn the hidden relationships).

Working with data from multiple sources increases the challenges of proper labeling. It is vital to assign accurate timestamps to avoid lookahead bias, which arises when an algorithm is tested with data before it actually became available. For example, a provider may assign timestamps that require adjustment to reflect the point in time when the data would have been available to a live algorithm.
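
To make the timestamp adjustment concrete, here is a minimal sketch of a point-in-time join using pandas. It assumes, purely for illustration, that the provider stamps each fundamental record with the end of its reporting period and that the figure becomes available a fixed number of days later. The column names (`timestamp`, `period_end`, `available_at`, `eps`), the helper `point_in_time_join`, the ticker, and the 30-day lag are all hypothetical; real publication lags vary by provider and by record.

```python
import pandas as pd


def point_in_time_join(prices, fundamentals, publication_lag):
    """Attach to each price row the latest fundamental value already available.

    Assumes (hypothetically) that `period_end` is the provider's timestamp and
    that the value only became available `publication_lag` later.
    """
    fundamentals = fundamentals.copy()
    # Shift the provider timestamp to the assumed point of availability.
    fundamentals["available_at"] = fundamentals["period_end"] + publication_lag

    # merge_asof requires both frames to be sorted by their join keys.
    prices = prices.sort_values("timestamp")
    fundamentals = fundamentals.sort_values("available_at")

    # direction="backward" picks the most recent record at or before each
    # price timestamp, so no future information leaks into a row.
    return pd.merge_asof(
        prices,
        fundamentals[["available_at", "ticker", "eps"]],
        left_on="timestamp",
        right_on="available_at",
        by="ticker",
        direction="backward",
    )


# Made-up example data for a single hypothetical ticker.
prices = pd.DataFrame({
    "timestamp": pd.to_datetime(["2020-04-15", "2020-05-15"]),
    "ticker": ["ABC", "ABC"],
    "close": [10.0, 11.0],
})
fundamentals = pd.DataFrame({
    "period_end": pd.to_datetime(["2020-03-31"]),
    "ticker": ["ABC"],
    "eps": [1.5],
})

merged = point_in_time_join(prices, fundamentals, pd.Timedelta(days=30))
print(merged[["timestamp", "ticker", "close", "eps"]])
```

With these inputs, the April 15 price row has no earnings value because the quarter-end figure is assumed to arrive only on April 30, whereas the May 15 row picks it up. Joining on `period_end` directly would have attached the value to both rows and introduced exactly the lookahead bias described above.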
