The other day I was asked the following question about White’s reality check.

QUESTION:

I was reading White’s (2000) paper (http://www.ssc.wisc.edu/~bhansen/718/White2000.pdf). It seems to suggest that to examine the P&L characteristics of a strategy, we should bootstrap the P&L. I am wondering why this makes sense. Why scramble the P&L time series? Why not scramble the original returns series, e.g., the returns of the S&P 500, paste the segments together, and feed the “fake” returns series into the strategy again? Then you can generate as many P&L time series as you like and use them to compute the expected P&L, Sharpe ratio, etc. Would the second approach be more realistic than the approach suggested in the paper?

ANSWER:

1) In the statistical literature, there are basically two approaches to tackling the data-snooping bias problem. The first and more popular approach focuses on the data. It tries to avoid re-using the same data set, either by testing a particular model on randomly chosen subsamples of the original time series or by bootstrapping the original time series (e.g., returns). However, one may argue that this sampling approach is somewhat arbitrary and may lack the desired objectivity. See Chan, Karceski, and Lakonishok (1998) and Brock, Lakonishok, and LeBaron (1992) for examples of this sampling approach. Your proposed method basically falls into this category.
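To make the first approach concrete, here is a minimal sketch of the procedure described in the question: cut the return series into blocks, paste randomly chosen blocks together, and re-run the strategy on each "fake" series. Everything named here is an assumption for illustration: the simulated returns stand in for real data, `momentum_pnl` is a hypothetical toy strategy, and the circular block bootstrap with a block length of 20 is just one of several reasonable resampling schemes.

```python
import numpy as np

def circular_block_bootstrap(returns, block_len, rng):
    """Build a fake return series by pasting together randomly chosen blocks."""
    n = len(returns)
    n_blocks = -(-n // block_len)  # ceiling division
    starts = rng.integers(0, n, size=n_blocks)
    # Each block runs block_len steps forward and wraps around the end of the series.
    idx = (starts[:, None] + np.arange(block_len)[None, :]) % n
    return returns[idx.ravel()[:n]]

def momentum_pnl(returns, lookback=20):
    """Hypothetical strategy: hold the asset only when the trailing return sum is positive."""
    c = np.cumsum(np.insert(returns, 0, 0.0))
    trailing = c[lookback:] - c[:-lookback]  # trailing[i] = sum of returns[i : i + lookback]
    # The position for day i + lookback uses only returns up to day i + lookback - 1.
    position = np.where(trailing[:-1] > 0, 1.0, 0.0)
    return position * returns[lookback:]

def sharpe(pnl, periods_per_year=252):
    """Annualized Sharpe ratio; returns 0.0 if the P&L has no variation."""
    sd = pnl.std(ddof=1)
    return np.sqrt(periods_per_year) * pnl.mean() / sd if sd > 0 else 0.0

rng = np.random.default_rng(0)
returns = rng.normal(0.0003, 0.01, size=1000)  # placeholder for real daily returns

# Re-run the strategy on many resampled return series.
sharpes = np.array([
    sharpe(momentum_pnl(circular_block_bootstrap(returns, 20, rng)))
    for _ in range(200)
])
print("original Sharpe:", sharpe(momentum_pnl(returns)))
print("bootstrap mean and 5%/95% quantiles:",
      sharpes.mean(), np.quantile(sharpes, [0.05, 0.95]))
```

The spread of `sharpes` shows how much the strategy's performance varies across data sets that resemble the original one. The block length controls how much serial dependence survives the resampling, which is one place where the arbitrariness mentioned above enters.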

The second and more “formal” approach focuses on the models: it considers all relevant models (trading strategies) together and constructs a test with properly controlled type I error (test size). Unfortunately, this approach is generally not feasible when the number of strategies being tested is large. White’s paper basically follows this second approach, but does so in a smart way that does not suffer from the aforementioned problem: it bootstraps the performance (P&L) series of all the strategies jointly, which is why the P&L, rather than the underlying returns, is resampled.
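A rough sketch of the reality check may help here. It assumes a matrix `perf` (a name chosen for this example) with one row per period and one column per strategy, holding each strategy's performance relative to the benchmark, e.g. its daily P&L. The null hypothesis is that no strategy has positive expected performance. The test statistic is the scaled best average performance, and its null distribution is approximated by resampling whole rows of `perf` with the stationary bootstrap, so that the dependence across strategies is preserved. The mean block length, the number of replications, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def stationary_bootstrap_indices(n, mean_block_len, rng):
    """Stationary bootstrap (Politis and Romano): blocks with geometric random lengths."""
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    restart = rng.random(n) < 1.0 / mean_block_len
    for t in range(1, n):
        # Either start a new block at a random position or continue the current one.
        idx[t] = rng.integers(n) if restart[t] else (idx[t - 1] + 1) % n
    return idx

def reality_check(perf, n_boot=500, mean_block_len=10, seed=0):
    """Return the test statistic and bootstrap p-value for a (periods, strategies) array."""
    rng = np.random.default_rng(seed)
    n = perf.shape[0]
    mean_perf = perf.mean(axis=0)
    stat = np.sqrt(n) * mean_perf.max()
    boot_stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = stationary_bootstrap_indices(n, mean_block_len, rng)
        # Resample whole rows so all strategies share the same resampled periods,
        # and recenter each strategy at its sample mean.
        boot_stats[b] = np.sqrt(n) * (perf[idx].mean(axis=0) - mean_perf).max()
    p_value = np.mean(boot_stats >= stat)
    return stat, p_value

rng = np.random.default_rng(42)
noise_only = rng.normal(0.0, 1.0, size=(500, 50))  # 50 strategies with no real edge
with_edge = noise_only.copy()
with_edge[:, 0] += 0.3                             # give one strategy a genuine edge

stat_null, p_null = reality_check(noise_only)
stat_edge, p_edge = reality_check(with_edge)
print("no real edge:  stat = %.2f, p = %.3f" % (stat_null, p_null))
print("one real edge: stat = %.2f, p = %.3f" % (stat_edge, p_edge))
```

Note that the strategies are never re-run: only their realized performance series are resampled, which is what keeps the procedure feasible for a large pool of strategies.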

2) However, White’s reality check has a problem of its own. Including poor and irrelevant alternative models (trading strategies) can reduce the rejection probabilities of the test under the null, because the test does not satisfy a relevant similarity condition that is necessary for a test to be unbiased. I did some research and found the following paper by Hansen, which attempts to fix this problem by modifying White’s reality check.

Hansen, P. R. (2005), “A Test for Superior Predictive Ability”, Journal of Business & Economic Statistics, 23, pp. 365-380.
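For orientation, the two test statistics can be written side by side. This is a summary from my reading of the two papers, and the notation is an assumption of this sketch: d_{k,t} is the performance of strategy k relative to the benchmark in period t, n is the number of periods, and \hat\omega_k^2 is an estimate of the variance of \sqrt{n}\,\bar d_k. Please check the details against the papers before relying on them.

```latex
\begin{align*}
\bar d_k &= \frac{1}{n}\sum_{t=1}^{n} d_{k,t}
  && \text{(average relative performance of strategy } k\text{)} \\
T_n^{\mathrm{RC}} &= \max_{k}\, \sqrt{n}\,\bar d_k
  && \text{(White's reality check)} \\
T_n^{\mathrm{SPA}} &= \max\Bigl(\max_{k}\, \frac{\sqrt{n}\,\bar d_k}{\hat\omega_k},\; 0\Bigr)
  && \text{(Hansen's studentized statistic)} \\
Z^{*}_{k,t} &= d^{*}_{k,t} - \bar d_k\,
  \mathbf{1}\Bigl\{\frac{\sqrt{n}\,\bar d_k}{\hat\omega_k} \ge -\sqrt{2\log\log n}\Bigr\}
  && \text{(Hansen's bootstrap recentering)}
\end{align*}
```

The reality check recenters every bootstrap series at its sample mean, so a very poor strategy is treated as if its true mean were zero. Under Hansen's recentering, a strategy whose studentized performance falls below the threshold keeps its negative mean in the bootstrap world and so has little influence on the null distribution of the maximum.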

3) Basically, the two approaches answer different questions. Your proposed method tries to answer the following question: will this particular strategy be profitable on a different data set that exhibits similar characteristics to the original one? White’s (2000) reality check, on the other hand, tries to answer the following question: is at least one trading strategy (out of a pool of many seemingly profitable strategies) really profitable on this particular data set? I think a good and conservative approach would be to (a) answer White’s question first and, if the answer is YES, (b) answer your question by bootstrapping the return time series (and other covariates).

Very helpful, I've needed this explanation for 15 years!