🤖 AI Summary
In finite-population cluster randomized controlled trials (CRCTs), the design-based variance of the Horvitz–Thompson (HT) estimator for the average treatment effect (ATE) is not point-identifiable, because it depends on the joint distribution of cluster-level potential outcomes that are never observed together. This paper derives the exact analytical expression of this variance under two-stage sampling with cluster-level treatment assignment, formally characterizing its non-identifiability. We propose a sharp, computable upper bound on the variance and develop a consistent estimator of that bound. Confidence intervals built from the bound are typically narrower than those based on conventional cluster-robust standard errors while remaining valid. Simulation studies and an empirical application support the consistency of the proposed estimator and the validity of the resulting intervals. This work provides a theoretical foundation and practical tools for design-based inference in CRCTs.
📝 Abstract
In cluster randomized controlled trials (CRCTs) with a finite population, the exact design-based variance of the Horvitz-Thompson (HT) estimator for the average treatment effect (ATE) depends on the joint distribution of unobserved cluster-aggregated potential outcomes and is therefore not point-identifiable. We study a common two-stage sampling design (random sampling of clusters, followed by sampling of units within the sampled clusters) with treatment assigned at the cluster level. First, we derive the exact (infeasible) design-based variance of the HT ATE estimator that accounts jointly for cluster- and unit-level sampling as well as random assignment. Second, extending Aronow et al. (2014), we provide a sharp, attainable upper bound on that variance and propose a consistent estimator of the bound using only observed outcomes and known sampling/assignment probabilities. In simulations and an empirical application, confidence intervals based on our bound are valid and typically narrower than those based on cluster standard errors.
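To make the setup concrete, here is a minimal sketch of an HT point estimate of the ATE under the two-stage design described above. It is an illustration under stated assumptions, not the paper's implementation. The names `SampledUnit` and `ht_ate` and all field names are hypothetical, and the sampling and assignment probabilities are assumed to be known by design. The sketch covers only the point estimate, not the variance bound or its estimator.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SampledUnit:
    """One observed unit (hypothetical container, not from the paper)."""
    outcome: float    # observed outcome Y_i
    treated: bool     # assignment of the unit's cluster (shared within a cluster)
    p_cluster: float  # stage 1: probability the unit's cluster was sampled
    p_unit: float     # stage 2: probability the unit was sampled given its cluster
    p_treat: float    # probability the cluster was assigned to treatment


def ht_ate(units: Iterable[SampledUnit], population_size: int) -> float:
    """Horvitz-Thompson estimate of the finite-population ATE.

    Each observed outcome is weighted by the inverse of the probability
    that it was observed under its arm: sampled at both stages and
    assigned to that arm.
    """
    total_treated = 0.0
    total_control = 0.0
    for u in units:
        inclusion = u.p_cluster * u.p_unit  # overall sampling probability
        if u.treated:
            total_treated += u.outcome / (inclusion * u.p_treat)
        else:
            total_control += u.outcome / (inclusion * (1.0 - u.p_treat))
    return (total_treated - total_control) / population_size


# Toy example with made-up numbers: a population of 8 units, one sampled
# unit from a treated cluster and one from a control cluster.
example = [
    SampledUnit(outcome=3.0, treated=True, p_cluster=0.5, p_unit=0.5, p_treat=0.5),
    SampledUnit(outcome=1.0, treated=False, p_cluster=0.5, p_unit=0.5, p_treat=0.5),
]
estimate = ht_ate(example, population_size=8)  # (24 - 8) / 8 = 2.0
```

Each arm's total is estimated separately from the units observed under that arm. The same unit is never observed under both arms, so the dependence between the two potential outcomes cannot be estimated from the data, which is why the exact variance of this estimator is not point-identified.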