🤖 AI Summary
Existing e-value-based multiple testing procedures such as e-BH are not invariant to the order in which data are gathered for the different e-processes. As a result, a hypothesis rejected at time t may no longer be rejected at time t+1 after more data are gathered for e-processes unrelated to that hypothesis. This paper gives an example showing that e-BH, applied to suprema of e-processes, does not control the FDR at level α.
Method: We argue that multiple testing methods should always work with suprema of e-processes, and we apply adjusters to those suprema before running e-BH.
Contribution/Results: With adjusters, e-BH controls the FDR at level α when applied to suprema of e-processes (i.e., FDR-sup ≤ α) under arbitrary dependence among the hypotheses. Because a supremum never decreases as data arrive, gathering more data for one e-process cannot undo the rejection of an unrelated hypothesis, which keeps rejection sets stable and reproducible in sequential settings. The procedure is computationally tractable.
📝 Abstract
E-processes enable hypothesis testing with ongoing data collection while maintaining Type I error control. However, when testing multiple hypotheses simultaneously, current $e$-value-based multiple testing methods such as e-BH are not invariant to the order in which data are gathered for the different $e$-processes. This can lead to undesirable situations, e.g., where a hypothesis rejected at time $t$ is no longer rejected at time $t+1$ after choosing to gather more data for one or more $e$-processes unrelated to that hypothesis. We argue that multiple testing methods should always work with suprema of $e$-processes. We provide an example to illustrate that e-BH does not control the resulting FDR (FDR-sup) at level $\alpha$ when applied to suprema of $e$-processes. We show that adjusters can be used to ensure FDR-sup control with e-BH under arbitrary dependence.
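The pipeline described in the abstract (take the running supremum of each e-process, pass it through an adjuster, then run e-BH) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the toy paths, and the adjuster A(e) = κ·e^(1−κ) are all assumptions. That adjuster is one standard choice satisfying ∫₁^∞ A(e)/e² de = 1, and the paper may use a different one.

```python
def e_bh(e_values, alpha):
    """Standard e-BH: reject the k* largest e-values, where
    k* = max{k : k * e_(k) / K >= 1/alpha} and e_(k) is the k-th largest."""
    K = len(e_values)
    order = sorted(range(K), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, i in enumerate(order, start=1):
        if k * e_values[i] / K >= 1.0 / alpha:
            k_star = k
    return sorted(order[:k_star])  # indices of rejected hypotheses


def adjust(sup_e, kappa=0.5):
    """Assumed adjuster A(e) = kappa * e**(1 - kappa), 0 < kappa < 1.
    It is increasing and integrates to 1 against 1/e^2 on [1, inf)."""
    return kappa * sup_e ** (1.0 - kappa)


def running_sup(path):
    """Supremum of an observed e-process path (never decreases with more data)."""
    return max(path)


# Hypothetical e-process paths, one per hypothesis, each starting at 1.
paths = [
    [1.0, 40.0, 1600.0, 300.0],
    [1.0, 2.0, 0.5, 0.8],
    [1.0, 10.0, 2500.0, 1200.0],
]
alpha = 0.1

adjusted = [adjust(running_sup(p)) for p in paths]
rejected = e_bh(adjusted, alpha)

# Gathering more data for hypothesis 1 only cannot remove other rejections,
# because its supremum (and hence its adjusted e-value) cannot go down.
paths_more = [paths[0], paths[1] + [0.1, 0.05], paths[2]]
rejected_more = e_bh([adjust(running_sup(p)) for p in paths_more], alpha)
```

In this toy example the adjusted values are 20, about 0.71, and 25, so hypotheses 0 and 2 are rejected at α = 0.1, and the rejection set is unchanged after the extra observations for hypothesis 1. The price of the adjuster is visible as well: a raw supremum of 1600 is reduced to an e-value of 20.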