🤖 AI Summary
This paper addresses the pronounced decay in out-of-sample Sharpe ratio that overfitting causes in trading strategies built on linear predictive models. We derive, for the first time, closed-form approximations for both the in-sample and out-of-sample Sharpe ratios. Methodologically, we combine statistical inference and random matrix theory with an empirical study of multi-asset futures that follows the Garleanu–Pedersen methodology, and we systematically characterize how the distribution of signal strength, the number of assets, and the training sample size affect strategy replication fidelity. Theoretical analysis reveals a sharp decline in out-of-sample replication fidelity in high-dimensional, multi-asset settings. Empirical validation confirms that increasing training data significantly improves replication fidelity, with gains that are both statistically and economically significant. Our core contribution is a tractable, testable analytical framework for quantifying and attributing overfitting risk in linear trading strategies, which enables rigorous diagnosis, calibration, and robustness assessment.
📝 Abstract
We study how much the in-sample performance of trading strategies based on linear predictive models is reduced out-of-sample due to overfitting. More specifically, we compute the in- and out-of-sample means and variances of the corresponding PnLs and use these to derive a closed-form approximation for the corresponding Sharpe ratios. We find that the out-of-sample "replication ratio" (the fraction of in-sample performance that survives out-of-sample) diminishes for complex strategies that trade many assets and rely on many weak rather than a few strong trading signals, and increases when more training data is used. The substantial quantitative importance of these effects is illustrated with an empirical case study for commodity futures following the methodology of Garleanu-Pedersen.
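
The qualitative effects described in the abstract can be reproduced with a small Monte Carlo experiment. The sketch below is an illustration only, not the paper's closed-form approximation or its empirical setup. It assumes a single asset, i.i.d. standard normal signals, unit-variance noise, equally strong true coefficients, OLS estimation, and a position equal to the fitted forecast. The names `sharpe`, `replication_ratio`, and `signal_var` are hypothetical, and the "replication ratio" is taken here to be the ratio of the average out-of-sample Sharpe ratio to the average in-sample one.

```python
import numpy as np


def sharpe(pnl):
    """Per-period Sharpe ratio of a PnL series (not annualized)."""
    return pnl.mean() / pnl.std(ddof=1)


def replication_ratio(n_signals, n_train, n_test=5_000,
                      signal_var=0.01, n_trials=100, seed=0):
    """Monte Carlo estimate of (mean OOS Sharpe) / (mean IS Sharpe).

    Assumed toy model (not taken from the paper):
        r_{t+1} = beta' x_t + eps_{t+1},  x_t ~ N(0, I),  eps ~ N(0, 1)
    The total signal variance beta' beta equals `signal_var` and is
    split equally across the `n_signals` predictors.
    """
    rng = np.random.default_rng(seed)
    beta = np.full(n_signals, np.sqrt(signal_var / n_signals))
    in_sample, out_of_sample = [], []
    for _ in range(n_trials):
        # Training data and OLS fit of the linear predictive model.
        x_tr = rng.standard_normal((n_train, n_signals))
        r_tr = x_tr @ beta + rng.standard_normal(n_train)
        beta_hat, *_ = np.linalg.lstsq(x_tr, r_tr, rcond=None)

        # Fresh data from the same model for the out-of-sample test.
        x_te = rng.standard_normal((n_test, n_signals))
        r_te = x_te @ beta + rng.standard_normal(n_test)

        # Position = fitted forecast; PnL = position * realized return.
        in_sample.append(sharpe((x_tr @ beta_hat) * r_tr))
        out_of_sample.append(sharpe((x_te @ beta_hat) * r_te))
    return np.mean(out_of_sample) / np.mean(in_sample)


if __name__ == "__main__":
    for p, t in [(5, 1_000), (50, 1_000), (10, 250), (10, 2_500)]:
        ratio = replication_ratio(n_signals=p, n_train=t)
        print(f"signals={p:3d}  train={t:5d}  replication ratio={ratio:.2f}")
```

Holding the total signal variance fixed while raising `n_signals` spreads the same predictability over more, weaker signals, so this toy setting mirrors the "many weak rather than a few strong" comparison in the abstract. Raising `n_train` mirrors the effect of more training data.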