🤖 AI Summary
This paper addresses the optimal sequencing of theory building and empirical analysis: when theoretical frameworks are mature and data are abundant, is the conventional "theory first, then testing" approach still optimal? The author formalizes the trade-off between Darwinian learning (the benefit of theorizing first) and statistical learning (the benefit of examining the data first) within a Bayesian framework, constructing a sequential information design model. The analysis indicates that post hoc theorizing, that is, examining the data before developing theoretical explanations, yields greater explanatory power and a higher probability of discovering robust regularities when both data abundance and the strength of theoretical priors are high. This finding challenges methodological practices such as pre-registration, which emphasize ex ante theoretical specification, and offers a normative basis for choosing the starting point of empirical economic research.
📝 Abstract
For many economic questions, the empirical results are not interesting unless they are strong. For these questions, theorizing before the results are known is not always optimal. Instead, the optimal sequencing of theory and empirics trades off a "Darwinian Learning" effect from theorizing first with a "Statistical Learning" effect from examining the data first. This short paper formalizes the tradeoff in a Bayesian model. In the modern era of mature economic theory and enormous datasets, I argue that post hoc theorizing is typically optimal.