🤖 AI Summary
This work addresses a limitation of traditional conformal prediction, which relies on a fixed sample size and thus fails to provide valid coverage guarantees at arbitrary time points in streaming data settings. The authors extend conformal prediction and the PAC framework to the sequential setting, achieving, for the first time, time-uniform prediction sets that remain valid under dynamically updated models and even when evaluation occurs at data-dependent stopping times. Their approach integrates time-uniform conformal theory, sequential hypothesis testing, and probabilistic inequalities to construct prediction intervals with guaranteed coverage at any stopping time. Empirical evaluations on both synthetic and real-world datasets demonstrate the method's validity and practical utility.
📝 Abstract
Given that machine learning algorithms are increasingly being deployed to aid in high-stakes decision-making, uncertainty quantification methods that wrap around these black-box models, such as conformal prediction, have received much attention in recent years. In sequential settings, where data are observed or generated in a streaming fashion, traditional conformal methods do not provide any guarantee without fixing the sample size. More importantly, traditional conformal methods cannot cope with sequentially updated predictions. As such, we develop an extension of the conformal prediction and related probably approximately correct (PAC) prediction frameworks to sequential settings where the number of data points is not fixed in advance. The resulting prediction sets are anytime-valid in that their expected coverage is at the required level at any time chosen by the analyst, even if this choice depends on the data. We present theoretical guarantees for our proposed methods and demonstrate their validity and utility on simulated and real datasets.
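To make concrete what "fixing the sample size" means here, below is a minimal sketch of standard split conformal prediction, the fixed-sample baseline that the paper extends to the sequential, anytime-valid setting. All names and the toy linear model are illustrative assumptions, not the authors' method: the marginal coverage guarantee of this construction holds only for a calibration set of predetermined size `n`, which is exactly what breaks down when data arrive in a stream and the model is updated on the fly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated regression data: y = x + Gaussian noise.
x = rng.uniform(0, 10, size=400)
y = x + rng.normal(0, 1, size=400)

# Split into a training half and a calibration half (sizes fixed in advance).
x_train, y_train = x[:200], y[:200]
x_cal, y_cal = x[200:], y[200:]

# Fit a trivial base model: least-squares line y_hat = a*x + b.
a, b = np.polyfit(x_train, y_train, deg=1)
predict = lambda x_new: a * x_new + b

# Nonconformity scores on the calibration set: absolute residuals.
scores = np.abs(y_cal - predict(x_cal))

# Conformal quantile at miscoverage level alpha; the finite-sample
# correction (n + 1) below is what ties validity to the fixed n.
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Prediction interval for a new point: [y_hat - q, y_hat + q].
x_new = 5.0
y_hat = predict(x_new)
interval = (y_hat - q, y_hat + q)
```

Under exchangeability, this interval covers a fresh response with probability at least 1 - alpha, but only when evaluated at the predetermined n; the paper's contribution is prediction sets whose coverage holds uniformly over time, including at data-dependent stopping times.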