🤖 AI Summary
This study asks whether gravity models are suitable for counterfactual trade policy analysis, such as assessing regional trade agreements, and argues that their suitability depends on out-of-sample predictive power. It proposes a methodology that compares different versions of the gravity equation, including the 3-way specification, with one another and with machine learning forecasting methods such as random forests and neural networks, using out-of-sample predictive performance for bilateral trade flows as the criterion. The results indicate that the 3-way gravity model is difficult to beat on average out-of-sample predictive performance, which supports its role as the predominant tool for applied trade policy analysis. When the goal is to predict individual bilateral trade flows, however, an ensemble machine learning method can outperform it. The study thereby offers practical guidance for model selection in quantitative trade policy analysis.
📝 Abstract
Gravity equations are often used to evaluate counterfactual trade policy scenarios, such as the effect of regional trade agreements on trade flows. In this paper, we argue that the suitability of gravity equations for this purpose crucially depends on their out-of-sample predictive power. We propose a methodology that compares different versions of the gravity equation, both among themselves and with machine learning-based forecast methods such as random forests and neural networks. We find that the 3-way gravity model is difficult to beat in terms of out-of-sample average predictive performance, further justifying its place as the predominant tool for applied trade policy analysis. However, when the goal is to predict individual bilateral trade flows, the 3-way model can be outperformed by an ensemble machine learning method.
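
For reference, the 3-way gravity specification is usually written with exporter-time, importer-time, and country-pair fixed effects. The form below is the standard textbook version and not necessarily the notation used in the paper; the symbols $X_{ij,t}$, $z_{ij,t}$, and $\beta$ are assumed here for illustration.

$$
X_{ij,t} = \exp\left(\alpha_{i,t} + \gamma_{j,t} + \eta_{ij} + z_{ij,t}'\beta\right)\,\varepsilon_{ij,t}
$$

Here $X_{ij,t}$ is the trade flow from exporter $i$ to importer $j$ in period $t$, $\alpha_{i,t}$ and $\gamma_{j,t}$ are the exporter-time and importer-time fixed effects, $\eta_{ij}$ is the pair fixed effect, and $z_{ij,t}$ holds time-varying bilateral policy variables such as a regional trade agreement indicator. Out-of-sample performance would then be judged by comparing predicted flows $\hat{X}_{ij,t}$ with observed flows on data withheld from estimation.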