🤖 AI Summary
In unbounded A/B testing—where the stopping time is unspecified a priori—there exists a fundamental tension between enabling early stopping and ensuring late-stage detection of statistically significant effects.
Method: This paper proposes a sequential monitoring framework based on repeated significance testing. We theoretically establish that, under the unbounded setting, maintaining a strict constant significance level is infeasible but can be arbitrarily approximated. Leveraging this insight, we construct an adaptive p-value boundary that dynamically controls the family-wise Type I error rate in a data-driven manner, eliminating dependence on prespecified sample size or effect magnitude. The method integrates sequential analysis with statistical generalization bounds to ensure both high statistical power and real-time decision-making capability.
Results: Empirical evaluation demonstrates that our framework significantly outperforms classical fixed-sample and SPRT approaches in robustness, sensitivity to early signals, and power for late-stage discovery—without requiring a holdout validation set.
📝 Abstract
Requiring statistical significance at multiple interim analyses before declaring a statistically significant result for an A/B test permits less stringent significance requirements at each individual interim analysis. Requiring repeated significance competes well with methods built on assumptions about the test -- assumptions that may be impossible to evaluate a priori and may require extra data to evaluate empirically. Instead, requiring repeated significance allows the data itself to prove directly that the required results are not due to chance alone. We explain how to apply tests with repeated significance to continuously monitor unbounded tests -- tests that have no a priori bound on running time or number of observations. We show that it is impossible to maintain a constant requirement for significance for unbounded tests, but that we can come arbitrarily close to that goal.
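To make the idea concrete, here is a minimal sketch of repeated-significance monitoring for an unbounded test. It is not the paper's exact boundary: it substitutes a simple geometric alpha-spending sequence (a union bound caps the family-wise Type I error at `alpha`, and the per-look levels must shrink toward zero, illustrating why a constant requirement is impossible but can be approached by taking `gamma` close to 1) and a "significant at consecutive looks" rule for the paper's adaptive data-driven boundary. All function names and parameters (`look_alpha`, `monitor`, `gamma`, `repeats`, etc.) are illustrative assumptions.

```python
import math
import random

def look_alpha(k, alpha=0.05, gamma=0.9):
    """Per-look significance level at interim analysis k (k >= 1).

    Geometric spending: the levels sum to alpha over unboundedly many
    looks, so a union bound keeps the family-wise Type I error at most
    alpha. The levels necessarily decay toward zero -- a constant
    per-look level is impossible for unbounded tests -- but gamma near 1
    makes the decay between consecutive looks arbitrarily slow.
    """
    return alpha * (1.0 - gamma) * gamma ** (k - 1)

def two_sided_p(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (p_a - p_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

def monitor(p_a, p_b, n_per_look=200, max_looks=50, repeats=2, seed=0):
    """Continuously monitor a simulated A/B test, declaring significance
    only after the p-value clears the shrinking boundary at `repeats`
    consecutive looks -- the repeated-significance requirement.
    """
    rng = random.Random(seed)
    sa = sb = na = nb = 0
    streak = 0
    p_val = 1.0
    for k in range(1, max_looks + 1):
        for _ in range(n_per_look):
            sa += rng.random() < p_a  # arm A conversion
            sb += rng.random() < p_b  # arm B conversion
            na += 1
            nb += 1
        p_val = two_sided_p(sa, na, sb, nb)
        streak = streak + 1 if p_val < look_alpha(k) else 0
        if streak >= repeats:
            return {"significant": True, "look": k, "p": p_val}
    return {"significant": False, "look": max_looks, "p": p_val}
```

Because the spending sequence sums to `alpha` no matter how many looks occur, the monitor needs neither a prespecified sample size nor a known effect magnitude; a large true effect (say 0.3 vs. 0.7 conversion) is typically flagged within the first few looks.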