🤖 AI Summary
This paper develops asymptotic inference for covariate-adjusted estimators under rerandomization and stratified rerandomization. Focusing on M-estimators, including g-computation and doubly robust methods, as well as data-adaptive machine learning estimators (e.g., LASSO, random forests), the authors establish, for the first time, that any M-estimator remains asymptotically linear under rerandomization, with its influence function unchanged, although the limiting distribution may be non-Gaussian. They propose a covariate-adjustment correction that restores asymptotic normality and prove the efficiency optimality of machine learning–assisted estimation under rerandomization. Working in a superpopulation framework with influence function theory, they validate the approach via simulations and a reanalysis of a real-world cluster-randomized experiment. Corrected estimators achieve substantially improved precision while maintaining nominal confidence interval coverage. Both theoretical analysis and empirical evidence confirm that rerandomization preserves consistency and asymptotic efficiency without compromising inferential validity.
📝 Abstract
Rerandomization is an effective treatment allocation procedure to control for baseline covariate imbalance. For estimating the average treatment effect, rerandomization has been previously shown to improve the precision of the unadjusted and the linearly adjusted estimators over simple randomization without compromising consistency. However, it remains unclear whether such results apply more generally to the class of M-estimators, including the g-computation formula with generalized linear regression and doubly robust methods, and more broadly, to efficient estimators with data-adaptive machine learners. In this paper, under a super-population framework, we develop the asymptotic theory for a more general class of covariate-adjusted estimators under rerandomization and its stratified extension. We prove that the asymptotic linearity and the influence function remain identical for any M-estimator under simple randomization and rerandomization, but rerandomization may lead to a non-Gaussian asymptotic distribution. We further explain, drawing examples from several common M-estimators, that asymptotic normality can be achieved if rerandomization variables are appropriately adjusted for in the final estimator. These results are extended to stratified rerandomization. Finally, we study the asymptotic theory for efficient estimators based on data-adaptive machine learners, and prove their efficiency optimality under rerandomization and stratified rerandomization. Our results are demonstrated via simulations and re-analyses of a cluster-randomized experiment that used stratified rerandomization.
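To make the allocation procedure concrete, below is a minimal sketch of the standard Mahalanobis-distance rerandomization scheme the abstract refers to: candidate assignments are drawn under complete randomization and accepted only when the covariate imbalance between arms falls below a pre-specified threshold. The function name `rerandomize`, the threshold value, and the retry cap are illustrative choices, not part of the paper.

```python
import numpy as np

def rerandomize(X, n_treated, threshold, rng=None, max_draws=10_000):
    """Draw complete-randomization assignments until the Mahalanobis
    imbalance between treated and control covariate means is <= threshold.

    X         : (n, p) array of baseline covariates
    n_treated : number of units assigned to treatment
    threshold : acceptance cutoff for the imbalance statistic
    Returns (assignment vector z in {0,1}^n, accepted imbalance value).
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    n_control = n - n_treated
    # Sample covariance of the covariates; pinv guards against singularity.
    S_inv = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    # Variance scale of the difference in means under complete randomization.
    scale = 1.0 / n_treated + 1.0 / n_control
    for _ in range(max_draws):
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, size=n_treated, replace=False)] = 1
        diff = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
        m = diff @ S_inv @ diff / scale  # Mahalanobis imbalance measure
        if m <= threshold:  # accept only well-balanced assignments
            return z, m
    raise RuntimeError("no acceptable assignment found; consider a larger threshold")
```

Stratified rerandomization, as studied in the paper, applies the same acceptance rule separately within pre-defined strata; the corrected estimators then adjust for the covariates used in this acceptance criterion.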