8

Longer-Term Risk Forecasting and Something About Dilequants

When we use standard deviation as our measure of risk, we assume that returns are normally distributed, which is not always the case, and we assume that investors follow a peculiar decision process.

—JPP

Previous research on managed volatility and covered call writing, along with Poon and Granger's review of 93 papers on risk forecasting—the mega, multiauthor horse race I mentioned in my introduction to this section on risk forecasting—focuses on short-term risk forecasts, mostly at horizons between one hour and one month. For strategic asset allocation and many tactical asset allocation processes, these forecast horizons aren't long enough.

There also appears to be an unspoken assumption among the authors of these studies that the more complex the model, the better. Meanwhile, basic questions are left unanswered. For example, it’s not always clear how to calibrate the models, let alone address questions on the lookback window (one month? one year? five years? more?) and data frequency (daily, weekly, or monthly?), irrespective of the volatility-forecasting model.

To address these issues, I asked my colleague Rob Panariello to set up our own study. Our goal was to help investors solve more mundane yet important questions on how to use the data available to them.

We wanted to directly study the persistence of other aspects of risk beyond volatility—the so-called higher moments. If we observe that losses tend to be greater than gains (“negative skewness”), should we expect the same going forward? And if we observe fat tails (“high kurtosis”) in asset returns, should we expect fat tails going forward? In other words, while we know that volatility is persistent, are higher moments persistent (and therefore predictable) too? Or do they mean-revert? These higher moments matter because they can drastically change exposure to loss.

I prepared the data for Rob, using the same universe of asset classes and the same data sources as for the other similar analyses I’ve mentioned so far in this book. The dataset covers 33 data series: 20 asset classes (10 equity and 10 fixed income asset classes) and 12 relative bets (6 equity bets and 6 fixed income bets), as well as the stocks versus bonds relative bet. The list of equity asset classes and relative bets is on page 53, and the list for fixed income is on page 75. The start date for the daily and weekly dataset is August 30, 2000, and the end date is September 5, 2018. Monthly data start on February 28, 1993, and end on August 31, 2018. These dates are based on data availability.

I asked Rob if he could calculate the correlation between past and future volatility, skewness, and kurtosis for each of these 33 return series. I also asked him to vary the trailing windows as follows: one month (21 days), six months, one year, three years, and five years. I asked him to vary the forecast horizons in the same way and to repeat the entire experiment for daily, weekly, and monthly data.

For example, I asked Rob to calculate the correlation between (a) volatility as calculated over the last year based on weekly data and (b) volatility over the next month based on daily data. We then repeated the same experiment but looked at volatility over the next six months, one year, three years, and five years, looked at another asset class or relative bet, changed the lookback window, switched to monthly lagged data, and so on. The goal was to uncover what would have been the best way to forecast risk based on our data sample. We also wanted to uncover general rules of thumb (patterns in our results) about how to use data to build a risk forecast for asset allocation.
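The mechanics of one "cell" of this experiment can be sketched in a few lines. The sketch below is illustrative only, not our actual code (Rob worked in MATLAB): it simulates a daily return series with volatility clustering, then computes the correlation between trailing volatility over a lookback window and realized volatility over a forward horizon. All parameters and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a daily return series with volatility clustering
# (a GARCH(1,1)-style process); parameters are illustrative.
n = 5000
vol = np.empty(n)
ret = np.empty(n)
vol[0] = 0.01
ret[0] = vol[0] * rng.standard_normal()
for t in range(1, n):
    vol[t] = np.sqrt(1e-6 + 0.09 * ret[t - 1] ** 2 + 0.90 * vol[t - 1] ** 2)
    ret[t] = vol[t] * rng.standard_normal()

def trailing_vol(r, window):
    """Realized volatility over the trailing `window` observations."""
    out = np.full(len(r), np.nan)
    for t in range(window - 1, len(r)):
        out[t] = r[t - window + 1:t + 1].std(ddof=1)
    return out

def lead_lag_corr(r, lookback, horizon):
    """Correlation between trailing vol and forward realized vol."""
    past = trailing_vol(r, lookback)
    future = np.full(len(r), np.nan)
    for t in range(len(r) - horizon):
        future[t] = r[t + 1:t + 1 + horizon].std(ddof=1)
    mask = ~np.isnan(past) & ~np.isnan(future)
    return np.corrcoef(past[mask], future[mask])[0, 1]

# One "cell" of the results matrix: 21-day lookback, 21-day horizon.
# Positive in this simulation, because volatility clusters.
print(lead_lag_corr(ret, lookback=21, horizon=21))
```

Repeating this calculation across lookbacks, horizons, frequencies, and series is what makes the number of experiments explode.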

The number of experiments exploded. When all was said and done, we realized we wanted to look at the persistence in risk measures in 19,305 different ways: 195 lead-lag-data frequency combinations across three moments (volatility, skewness, and kurtosis) for 33 asset classes and bets. Rob’s prodigious mathematical brain went into hyperdrive.

“I have to think about how to code this massive experiment and how to store and summarize the results,” Rob said.

“Agreed. How long do you think it will take?” I asked.

“Let me get back to you,” Rob answered.

The next day, he walked into my office. “It’s done,” he said, dropping the sample results on my desk. It had taken him only a few hours in MATLAB. It would have taken me two or three weeks full-time to complete the project. I’m a “dilequant”—Mark Kritzman’s term for those who dabble in quantitative analysis.

Results from Our Risk Forecasting Horse Race

It’s difficult to summarize results for 19,305 risk predictability tests. This analysis revealed several important takeaways, supported by persistent patterns in our results. The conclusions differed depending on whether we looked at short-to-medium time horizons (one month to three years) or the longer horizon of five years.

First, persistence in volatility was highest at short horizons. It’s easier to forecast volatility one month ahead than one year ahead, and so on. Across all combinations of data windows, forecast horizons, and data frequencies, month-to-month volatility based on daily data showed the most persistence. The average correlation between this month’s daily volatility and next month’s was +68%, with little variation across asset classes and bets. It was the highest positive correlation across all risk forecast permutations for 32 out of 33 asset classes or bets—a remarkably consistent result.

The only exception was Treasuries, for which the six-month estimation window of daily data produced slightly better (one month ahead) forecasts (+67% correlation) versus the one-month window (+65% correlation). These results seem to explain why managed volatility and actively managed covered call writing strategies rely on short-term volatility persistence.

A more surprising result was that shorter estimation windows also produced better forecasts at longer horizons. This result is important for asset allocation because it means we can improve risk forecasts, and thereby risk-adjusted returns, even without tactical volatility management. On average, across asset classes and bets, volatility measured over the most recent 21 days produced the best forecasts of medium-horizon volatilities (six months, one year, and three years forward). Statisticians often claim that more data are better than less data and that we need long time periods to estimate something with “significance.” Here the intuition is the opposite: recent data carry more information about the future than older data. It’s an oversimplification, but think of monetary policy, which often drives volatility in financial markets. Most of the time, the Fed’s actions over the last few weeks should have more impact on market volatility going forward than, say, its actions a year ago.

In that context, suppose we want to produce a forecast of weekly volatility over the next year. Based on our results, we should use the last 21 days of data (+42% correlation with one-year forward weekly volatility), which gives a better forecast than the last six months (+38%), which in turn is more accurate than the last year (+30%). This result—that shorter estimation windows work better—held for forecasts of daily, weekly, and monthly forward volatilities. It was consistent across asset classes and bets. For horizons from six months to three years, we found a similar pattern for all 33 asset classes. This result means that investors should pay significant attention to recent volatility.

On the other hand, when it came to data frequency (daily, weekly, or monthly), our experiments revealed that for the same estimation window, more data were better than less data. We found that for short- and medium-term horizons, estimation windows based on daily data worked better than ones based on weekly data, which in turn worked better than ones based on monthly data. We didn’t expect this result. We expected that it would be better to match data frequency between estimation and forecast. In other words, we expected that daily data would best forecast daily volatility, weekly data would best forecast weekly volatility, and so on. But we found that on average, daily data almost always produced the best forecasts, even for forward monthly volatility.

To summarize, the following patterns in short- and medium-term volatility held whether we looked at absolute or relative bets, and they held across equity and fixed income markets: persistence works best at short horizons, shorter estimation windows work better even for longer horizons, and higher-frequency data work better than lower-frequency data.

Mean Reversion at Longer Horizons

Another consistent pattern emerged at longer horizons: mean reversion. It started to appear when we used a three-year estimation window to forecast three-year volatility. It became strongest with the five-year estimation window for five-year forward realized volatility, across daily, weekly, and monthly frequencies.

When we look at longer time horizons, we must carefully interpret results because we automatically reduce the number of independent observations. In a 2019 article published in the Financial Analysts Journal titled “Long-Horizon Predictability: A Cautionary Tale,” Boudoukh, Israel, and Richardson explain that long-horizon forecasts aren’t always as accurate as they may seem. As we increase the time horizon, we reduce the number of non-overlapping data points, and the confidence intervals around our estimates widen.

Using overlapping data doesn’t add much explanatory power, because the variables are autocorrelated. For example, suppose we use rolling monthly data for five-year forecasts. In this case, our February 2018 forecast covers 59 of the 60 months from our January 2018 forecast. Only the first and last months differ, so we haven’t added much information to estimate the relationship. Therefore, in finance, all else being equal (particularly for the same time period), statistical estimates based on longer horizons—t-statistics, regression betas, or R-squares/correlations—are less reliable than shorter-horizon estimates and are more likely to show extreme values.
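A quick simulation makes the overlapping-data problem concrete. In the sketch below (illustrative numbers, not data from our study), monthly returns are drawn independently, so there is no true predictability, yet consecutive overlapping five-year windows are almost perfectly correlated with one another simply because adjacent 60-month windows share 59 of their months.

```python
import numpy as np

rng = np.random.default_rng(1)

# i.i.d. monthly returns: no true predictability by construction.
monthly = rng.normal(0.005, 0.04, size=600)  # 50 years of months

# Rolling 60-month (five-year) cumulative returns, stepped one month
# at a time: adjacent observations share 59 of their 60 months.
window = 60
rolling = np.array([monthly[t:t + window].sum()
                    for t in range(len(monthly) - window + 1)])

# The lag-1 autocorrelation of the overlapping series is close to
# 59/60, even though the underlying returns are independent.
lag1 = np.corrcoef(rolling[:-1], rolling[1:])[0, 1]
print(lag1)
```

This is why stepping from monthly to overlapping five-year observations multiplies the number of data points without multiplying the information they contain.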

But the negative correlations in the cells of our results matrix that paired a five-year estimation window with five-year realized volatility were remarkably strong. The averages (across asset classes and bets) in those cells ranged from –45% to –79%, depending on the data frequencies used for the estimation and realized volatilities. It was the strongest negative correlation—averaged across frequencies—for 32 out of 33 asset classes and bets.

This result has important implications for strategic asset allocation. Most institutional asset owners and individuals (or their advisors) use long windows of data to forecast risk when they establish their strategic asset mix. A five-year period of monthly data appears to be a popular estimation window. While it’s probably better to use longer estimation windows, several asset classes and funds don’t have long return histories. (The alternative space is one example.) There are ways to combine long and short histories—to “backfill” the missing data for the short series. For example, in 2013, I published an article in the Financial Analysts Journal titled “How to Combine Long and Short Return Histories Efficiently.” But investors often just cut the entire dataset at the beginning of the shortest time series.
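To make the backfilling idea concrete, here is a deliberately simplified sketch, not the method from my article: it regresses the short series on the long series over their overlapping period, then fills the missing early months with fitted values plus simulated residuals, so the backfilled returns retain realistic volatility. All data and parameter values are made up.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative data: a long series (240 months) and a short series
# that only starts at month 120.
n_long, start_short = 240, 120
long_ret = rng.normal(0.006, 0.045, n_long)
short_ret = (0.8 * long_ret[start_short:]
             + rng.normal(0.002, 0.02, n_long - start_short))

# Fit short ~ a + b * long over the overlapping period (simple OLS).
x, y = long_ret[start_short:], short_ret
b = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
a = y.mean() - b * x.mean()
resid_sd = (y - (a + b * x)).std(ddof=1)

# Backfill the missing early months: fitted values plus simulated
# residuals, so the backfilled series isn't artificially smooth.
backfilled = (a + b * long_ret[:start_short]
              + rng.normal(0.0, resid_sd, start_short))
full_short = np.concatenate([backfilled, short_ret])
print(len(full_short))
```

Even a crude backfill like this is usually preferable to cutting the entire dataset at the start of the shortest series.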

A key takeaway from our results is that while we found persistence at short horizons, especially one month ahead, the results were the opposite for long horizons. Five years of calm markets are more likely to be followed by five years of turbulence, and vice versa. Therefore, risk estimates for strategic asset allocation may be seriously flawed if they extrapolate the past. It would be better to model this mean reversion directly.1

However, there’s an important caveat to these results on long-term mean reversion. In a follow-up experiment, we analyzed volatility persistence with a longer dataset (1926–2018, from Ken French’s website2). Our results, again, revealed strong short-term volatility persistence. These findings were robust to the new dataset. But long-run mean reversion was not as strong as we observed on data from 1993 to 2018. In fact, in the long sample, mean reversion only appeared between the trailing 10-year and forward 10-year windows, and not across all asset classes. (We were able to add the 10-year windows due to the longer dataset.) Otherwise, in this longer dataset, we found mild persistence in long-run volatility.

What explains the difference in the two samples? I suspect that central banks played a role in the more recent mean reversion. The tech bubble and the 2008–2009 crisis were both followed by periods of data-dependent economic stabilization driven by central banks. Ultimately, my view is that the 1993–2018 sample may be more representative of the future. Central banks should continue to be more data-dependent than they were many decades ago. The business cycle should continue to dominate market volatility, compared with other factors that drove volatility in the longer sample, such as runaway inflation, world wars, etc. In general, I question the common assumption that the longer the dataset, the more reliable our estimate of financial markets behavior in the current environment. (Back in 1926, computers and televisions did not exist, and many people still used the horse and buggy for transportation.)

What About Higher Moments?

As mentioned earlier, we also wanted to look at features of risk beyond volatility. Our results for the “higher moments” (skewness and kurtosis) were disappointing: we didn’t find much predictability. But first, let’s define the terms “moments,” “skewness,” and “kurtosis,” because they tell us something important about exposure to loss.

In finance, we try to predict the future. As Bernd Scherer said in response to the GIGO critique, if you can’t formulate expectations about the future, “you shouldn’t be in the investment business.” But because we don’t know the future with certainty, we must consider a range of future outcomes. To do so, we use probability distributions. The best way to represent a probability distribution graphically is to use histograms. The height of the bars in these histograms represents the likelihood of various outcomes, ordered from low to high on the x axis. The bars are usually high around the average outcome (say, for stocks, an expected return of 7%), and they get smaller and smaller as we move toward more extreme events (say, a return of –30% or +30%).

Moments are used to describe the shape of these probability distributions. The first moment is the mean (the expected outcome), and the second moment is volatility, which we’ve discussed at length. The third moment, skewness, measures whether the distribution is symmetrical, as I’ve mentioned before. If large gains are more likely than large losses of the same magnitude, the distribution is positively skewed, which means that its “right tail” is longer than its “left tail.” A call option, for example, has a positively skewed payoff. Unfortunately, in financial markets, negatively skewed distributions are much more frequent than positively skewed distributions, such that losses of –10% tend to be more frequent than gains of +10%, for example.

A cheesy but surprisingly reliable joke about skewness that I like to use at conferences goes as follows:

Recently, I’ve been working on a paper on negative skewness and portfolio construction. We explain how naïve financial advisors tend to load up client portfolios with negatively skewed assets like credit and hedge funds, just because these assets have relatively low volatility and high returns. In doing so, they ignore the mismatch between volatility (which is low) and exposure to loss (which is high). I thought I had a great title idea for the paper: “Skew You! Says Your Financial Advisor.” But my marketing compliance department didn’t like it.

Like many of my conference witticisms, I may have stolen this one from Mark Kritzman.

The fourth moment, kurtosis, measures whether the tails are “fat,” i.e., whether the probabilities of large losses or gains are higher than we would expect from a normal distribution, irrespective of the symmetry or lack thereof in the distribution.
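For readers who want to see these definitions in action, here is a small sketch (all numbers invented) that computes skewness and excess kurtosis as standardized moments, then compares a normal sample with a crash-prone mixture that mimics the “small gains, occasional large losses” profile of negatively skewed assets.

```python
import numpy as np

rng = np.random.default_rng(2)

def skewness(x):
    """Third standardized moment: negative when the left tail is longer."""
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean()

def excess_kurtosis(x):
    """Fourth standardized moment minus 3: positive means fat tails."""
    z = (x - x.mean()) / x.std()
    return (z ** 4).mean() - 3.0

normal = rng.normal(0.0, 1.0, 100_000)
# A crude crash-prone series: mostly small gains, with a 2% chance
# each period of a large loss.
crash_prone = np.where(rng.random(100_000) < 0.02,
                       rng.normal(-5.0, 1.0, 100_000),
                       rng.normal(0.1, 1.0, 100_000))

print(skewness(normal), excess_kurtosis(normal))  # both near zero
print(skewness(crash_prone), excess_kurtosis(crash_prone))  # negative skew, fat tails
```

The two series can have similar volatility, yet the mixture’s negative skewness and excess kurtosis reveal a much larger exposure to loss, which is exactly why these moments matter.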

These so-called higher moments, skewness and kurtosis, matter a great deal when we try to forecast exposure to loss. In that context, our results were problematic. We found almost no predictability and no persistent pattern in either skewness or kurtosis, except for some mean reversion at the five-year horizon (similar to what we found for volatility, albeit weaker).

My intuition on this mean reversion is that following big crises, investors and monetary authorities become more careful, and the regulator puts measures in place to prevent excesses. Similarly, when markets have been quiet for a long time, investors become complacent, and asset price bubbles start to form. Nonetheless, some emerging markets (for example) have experienced crises in rapid succession, and sometimes there’s a domino effect that can continue for several years.

In the end, we estimate higher moments based on very few, extreme data points, such as the tech bubble and the 2008 financial crisis. This lack of predictability and consistency across markets shouldn’t be surprising. Yet for asset allocation decisions, we can’t ignore higher moments. As we will see in Chapter 11, scenario analysis is a useful tool to address this problem. At the portfolio level, we can improve our estimation of higher moments if we account for how correlations change in down markets.

Notes

1.   In general, our analysis raises this question: Would a model that includes both short-term persistence and long-term mean reversion work better than most other models?

2.   mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
