Measuring performance

After 10 epochs in a stateful configuration, our loss has stopped improving and our network is fairly well trained, as you can see from the following graph:

We have a fit network that appears to have learned something. We can now make some sort of prediction as to the price flow of bitcoin. If we're able to do it well, we will all be very rich. Before we go buy that mansion, we should probably measure our model's performance.

The ultimate test of a financial model is this question: Are you willing to put money on it? It's difficult to answer this question because measuring performance in a time series problem can be challenging.

One very simple way to measure performance would be to use a root mean squared error to evaluate the difference between y_test and a prediction on X_test. We most certainly could do that, as shown in the following code:
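A minimal sketch of that calculation follows. The placeholder arrays are only there so the snippet runs on its own; in the chapter, the predictions would come from `model.predict(X_test)`, and running the equivalent computation on the real test set produced the value reported next:

```python
import numpy as np

# Placeholder arrays so the snippet is self-contained; in practice,
# preds would be model.predict(X_test) and y_test the held-out targets.
y_test = np.array([0.10, 0.12, 0.11, 0.15])
preds = np.array([0.11, 0.10, 0.12, 0.13])

# Root mean squared error between the actuals and the predictions
rmse = np.sqrt(np.mean((y_test - preds) ** 2))
print("RMSE =", rmse)
```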

RMSE = 0.0801932157201

Is 0.08 a good score? Let's start our investigation by comparing our predictions against the actual values for bitcoin flow in June. Doing so can give us some visual intuition about the model's performance, and it's a practice I always recommend:

Our predictions, in green, leave quite a bit to be desired. Our model has learned to predict the average flow, but it's doing a very poor job of matching the full signal. It's even possible that we're just learning a trend, because of the less than rigorous detrending we did. I think we might have to put that mansion off a bit longer, but we're on the right path.

Consider our prediction as the model explaining as much of the price of bitcoin as possible, given only the previous values of bitcoin. We are probably doing a fairly good job of modeling the autoregressive parts of the time series. But there are likely many different external factors that impact the price of bitcoin. The value of the dollar, the movement of other markets, and, perhaps most importantly, the buzz or information flow around bitcoin are all likely to play an important role in its price.

And that's where the power of LSTMs for time series prediction really comes into play. By adding additional input features, all of this information can be added to the model fairly easily, hopefully explaining more and more of the entire picture.
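As a sketch of what that looks like mechanically: going multivariate only changes the last dimension of the LSTM's input tensor. The extra series and their names below are hypothetical stand-ins for the kinds of features just described:

```python
import numpy as np

# Hypothetical series: price plus two extra signals (say, a dollar
# index and a "buzz" score). Values here are random placeholders.
n_steps = 500
price = np.random.rand(n_steps)
dollar_index = np.random.rand(n_steps)
buzz = np.random.rand(n_steps)

# Stack the series into one (n_steps, n_features) matrix, then slice
# lookback windows exactly as in the univariate case - only the last
# dimension of the LSTM input grows from 1 to n_features.
features = np.stack([price, dollar_index, buzz], axis=1)

lookback = 100
X = np.array([features[i:i + lookback] for i in range(n_steps - lookback)])
print(X.shape)  # (samples, lookback, n_features)
```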

But let me dash your hopes one more time. A more thorough investigation of performance would also include consideration of the lift the model provides over some naive model. Typical choices for this simple model include a random walk model, an exponential smoothing model, or a naive approach such as using the previous time step as the prediction for the current time step. This is illustrated in the following graph:

In this graph, we're comparing our predictions, in red, to a model where we just use the previous minute as the prediction for the next minute, in green. In blue, the actual price overlays this naive model almost perfectly. Our LSTM prediction isn't nearly as good as the naive model's. We would be better off just using the last minute's price to predict the current minute's price. While I stand by the assertion that we're on the right track, we have a long way to go before that boat is ours.
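To make that comparison concrete, the naive baseline can be scored with the same RMSE metric as the LSTM; the price series below is a placeholder for the actual test-period prices:

```python
import numpy as np

# Naive "last value" baseline: predict each minute with the previous
# minute's price. y is a placeholder series standing in for the
# actual bitcoin prices over the test period.
y = np.array([0.10, 0.12, 0.11, 0.15, 0.14])

naive_preds = y[:-1]   # previous step as the prediction
naive_truth = y[1:]    # the step we're predicting
naive_rmse = np.sqrt(np.mean((naive_truth - naive_preds) ** 2))
print("Naive RMSE =", naive_rmse)
```

If the LSTM's RMSE doesn't beat this number on the same data, the model is providing no lift over simply persisting the last observed price.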

Modeling any commodity is very difficult. Using deep neural networks for this type of problem is promising, to be sure, but the problem is not an easy one. I'm including this perhaps exhaustive explanation so that, if you decide to head down this path, you understand what you're in for.

That said, when you do use an LSTM to arbitrage a financial market, please remember to tip your author.
