🤖 AI Summary
This study addresses the failure of conventional inference methods in small samples for switchback experiments, where temporal autocorrelation, seasonality, heavy-tailed shocks, and lagged or anticipatory effects complicate valid statistical testing. The authors propose a conditional randomization test framework that leverages the known assignment mechanism and requires no parametric assumptions on the outcome process. By introducing two key assumptions, non-anticipation and a finite carryover window, the design is partitioned into tractable "sections," enabling the construction of finite-sample valid, distribution-free p-values. The framework accommodates studentized tests for weak null hypotheses to handle within-session seasonality, and it enhances robustness through diagnostics for the carryover window and the non-anticipation assumption. Simulations demonstrate that the method controls Type I error in finite samples while delivering favorable power relative to existing alternatives, and the approach extends naturally to other time-indexed experimental designs.
📝 Abstract
Switchback experiments--alternating treatment and control over time--are widely used when unit-level randomization is infeasible, outcomes are aggregated, or user interference is unavoidable. In practice, experimentation must support fast product cycles, so teams often run studies for limited durations and make decisions with modest samples. At the same time, outcomes in these time-indexed settings exhibit serial dependence, seasonality, and occasional heavy-tailed shocks, and temporal interference (carryover or anticipation) can render standard asymptotics and naive randomization tests unreliable. In this paper, we develop a randomization-test framework that delivers finite-sample valid, distribution-free p-values for several null hypotheses of interest using only the known assignment mechanism, without parametric assumptions on the outcome process. For the causal effects of interest, we impose two primitive conditions--non-anticipation and a finite carryover horizon m--and construct conditional randomization tests (CRTs) based on an ex ante pooling of design blocks into "sections," which yields a tractable conditional assignment law and ensures imputability of focal outcomes. We provide diagnostics for learning the carryover window and assessing non-anticipation, and we introduce studentized CRTs for a session-wise weak null that accommodate within-session seasonality and are asymptotically valid. Power approximations under distributed-lag effects with AR(1) noise guide design and analysis choices, and simulations demonstrate favorable size and power relative to common alternatives. Our framework extends naturally to other time-indexed designs.