When Representative Samples Produce Worse Outcomes: Scale-up Decisions and Testing in Small-Budget RCTs

📅 2026-06-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study challenges the conventional wisdom that representative sampling is always optimal in randomized controlled trials (RCTs) with limited budgets. It analyzes pilot trials whose results pass through a statistical significance test that determines whether an intervention receives further study, and asks which sample composition maximizes the expected downstream improvement in outcomes. The authors show theoretically that under a small budget, the optimal design samples from a single homogeneous subpopulation, chosen according to sampling costs and the designer's prior beliefs about heterogeneous treatment effects. In the large-budget limit, the optimal design converges to a sample that is representative of the target population. The small-budget proof extends to any setting where an RCT and a significance test decide whether a non-adaptive downstream payoff is received, so the result may apply to other experiments with constrained budgets.

📝 Abstract

Small randomized controlled trials are often used to screen interventions before running larger follow-up studies. This is a critical phase of experimentation, as missing effective interventions or scaling up harmful ones can be very costly. A common proposal to mitigate these errors is to recruit samples that are representative of the target population, but this is often challenging in resource-constrained pilots. We challenge the narrative that representative samples are always superior by showing that when statistical significance testing determines whether interventions receive further study, the pilot trial composition that maximizes the downstream expected improvement in outcomes depends critically on its budget size. In the large-budget limit, the optimal pilot design converges to a sample that is representative of the target population. However, in the small-budget regime, the pilot designer maximizes expected impact by sampling only from a single homogeneous sub-population, chosen in a manner that depends on sampling costs and the designer's prior beliefs about heterogeneous treatment effects. Our proof of the small-budget result applies more generally when an RCT and significance test are used to decide whether to receive any non-adaptive downstream payoff, a result that may be applicable to other settings with constrained experimentation budgets.
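The trade-off described in the abstract can be explored numerically with a toy model. The sketch below is not the paper's model or proof; it is a minimal illustration under assumptions introduced here: independent Gaussian priors on each subgroup's treatment effect, equal outcome noise `sigma` in every subgroup, a pooled difference-in-means estimator, a one-sided z-test at level `alpha`, and a downstream payoff equal to the population-average effect when the pilot is significant and zero otherwise. The function names `expected_payoff`, `representative`, and `concentrated`, and all parameter values, are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def representative(budget, shares, costs):
    """Pairs per subgroup, proportional to population shares, within budget."""
    scale = budget / sum(w * c for w, c in zip(shares, costs))
    return [int(scale * w) for w in shares]


def concentrated(budget, k, costs):
    """Spend the whole budget on subgroup k (cost is per treated/control pair)."""
    return [int(budget / c) if j == k else 0 for j, c in enumerate(costs)]


def expected_payoff(alloc, shares, prior_mean, prior_sd,
                    sigma=1.0, alpha=0.05, draws=20000, seed=0):
    """Monte Carlo estimate of E[population effect * 1{pilot is significant}].

    alloc[k]  : treated/control pairs recruited from subgroup k (assumed design)
    shares[k] : subgroup k's share of the target population
    """
    n_total = sum(alloc)
    if n_total == 0:
        return 0.0  # no pilot means no significant result and no scale-up
    se = sigma * (2.0 / n_total) ** 0.5        # s.e. of pooled difference in means
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)   # one-sided critical value
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        # Draw true subgroup effects from the designer's prior.
        tau = [rng.gauss(m, s) for m, s in zip(prior_mean, prior_sd)]
        # The pooled estimator targets the sample-weighted effect ...
        pilot_effect = sum(n * t for n, t in zip(alloc, tau)) / n_total
        # ... while the payoff after scale-up is the population-weighted effect.
        population_effect = sum(w * t for w, t in zip(shares, tau))
        # Probability that the z-test rejects, given the true effects.
        power = STD_NORMAL.cdf(pilot_effect / se - z_crit)
        total += population_effect * power
    return total / draws


if __name__ == "__main__":
    shares, costs = [0.5, 0.5], [1.0, 4.0]          # illustrative values only
    prior_mean, prior_sd = [0.1, 0.1], [0.3, 0.3]   # illustrative values only
    for budget in (20, 200, 2000):
        designs = {
            "representative": representative(budget, shares, costs),
            "only group 0": concentrated(budget, 0, costs),
            "only group 1": concentrated(budget, 1, costs),
        }
        for name, alloc in designs.items():
            value = expected_payoff(alloc, shares, prior_mean, prior_sd)
            print(budget, name, alloc, round(value, 4))
```

Running the script prints the simulated expected payoff of each design at several budgets, so the designs can be compared as the budget grows. Which design comes out ahead depends on the assumed priors, costs, and noise level; the toy model is meant for building intuition about the mechanism, not for reproducing the paper's results.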

Problem

Research questions and friction points this paper is trying to address.

randomized controlled trials

sample representativeness

small-budget experiments

treatment effect heterogeneity

scale-up decisions

Innovation

Methods, ideas, or system contributions that make the work stand out.

small-budget RCTs

representative sampling

heterogeneous treatment effects