Beyond Euclidean Summaries: Online Change Point Detection for Distribution-Valued Data

📅 2026-02-06
🤖 AI Summary
This work addresses a limitation of existing online change-point detection methods: they rely on fixed-dimensional Euclidean summaries and therefore struggle to capture shifts in distributional shape or geometric structure. The authors model streaming batch data as a stochastic process on the 2-Wasserstein space. Each empirical distribution is mapped to the tangent space at a pre-change Fréchet barycenter, which gives a reference-centered local linearization and allows changes in the law of the process to be detected intrinsically. Classical multivariate monitoring statistics are then adapted to these tangent fields for sequential detection. In synthetic and real-world experiments, the framework reduces detection delay at a matched in-control average run length (ARL₀) relative to moment-based and model-free baselines, identifying distributional shifts that moment-based features can miss.
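To make the tangent-space mapping concrete, the following is a minimal sketch for the special case of one-dimensional batches, where the 2-Wasserstein geometry has a closed form in terms of quantile functions. It is an illustration under stated assumptions, not the authors' implementation: the function names, the quantile grid `levels`, and the restriction to 1-D data are all choices made here for illustration. In 1-D, the quantile function of the Fréchet barycenter is the average of the batch quantile functions, and the tangent vector of a batch at the barycenter can be written on the quantile grid as the difference of quantile functions.

```python
import numpy as np

def quantile_embedding(batch, levels):
    """Empirical quantile function of a 1-D batch on a fixed grid of levels."""
    return np.quantile(np.asarray(batch, dtype=float), levels)

def frechet_barycenter_1d(batches, levels):
    """1-D 2-Wasserstein barycenter: average the batch quantile functions."""
    return np.mean([quantile_embedding(b, levels) for b in batches], axis=0)

def tangent_coordinates(batch, barycenter_q, levels):
    """Tangent vector at the barycenter on the quantile grid:
    Q_batch(u) - Q_barycenter(u) for each level u."""
    return quantile_embedding(batch, levels) - barycenter_q

def tangent_norm(v):
    """Grid approximation of the L2 norm of a tangent vector, which
    approximates the 2-Wasserstein distance to the barycenter
    (assumes roughly uniform levels)."""
    return float(np.sqrt(np.mean(np.square(v))))
```

A pure location shift of a batch appears as a constant tangent vector, while a change in spread or shape appears as a tangent vector that varies across quantile levels, which is the kind of information a mean or variance summary alone may not separate.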

📝 Abstract
Existing online change-point detection (CPD) methods rely on fixed-dimensional Euclidean summaries, implicitly assuming that distributional changes are well captured by moment-based or feature-based representations. Such summaries can obscure important changes in distributional shape or geometry. We propose an intrinsic distribution-valued CPD framework that treats streaming batch data as a stochastic process on the 2-Wasserstein space. Our method detects changes in the law of this process by mapping each empirical distribution to a tangent space relative to a pre-change Fréchet barycenter, yielding a reference-centered local linearization of 2-Wasserstein space. This representation enables sequential detectors by adapting classical multivariate monitoring statistics to tangent fields. We provide theoretical guarantees and demonstrate, via synthetic and real-world experiments, that our approach detects complex distributional shifts with reduced detection delay at matched $\mathrm{ARL}_0$ compared with moment-based and model-free baselines.
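The sequential-detection step can be sketched as follows for one-dimensional batches. This is a hedged illustration, not the paper's detector: it uses a multivariate EWMA (MEWMA) chart as one example of a classical multivariate monitoring statistic, represents each batch by its quantile-function deviation from a pre-change reference, and uses a hypothetical `threshold` that is a placeholder rather than a value calibrated to a target $\mathrm{ARL}_0$. The names `fit_reference`, `mewma_detect`, `lam`, and `ridge` are assumptions introduced here.

```python
import numpy as np

def fit_reference(batches, levels, ridge=1e-6):
    """Estimate the pre-change reference from in-control 1-D batches.

    Returns the barycenter quantile function (mean of batch quantiles)
    and the inverse covariance of the tangent coordinates.
    """
    q = np.array([np.quantile(b, levels) for b in batches])
    barycenter_q = q.mean(axis=0)
    tangents = q - barycenter_q
    # A small ridge keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(tangents, rowvar=False) + ridge * np.eye(len(levels))
    return barycenter_q, np.linalg.inv(cov)

def mewma_detect(stream, barycenter_q, precision, levels, lam=0.2, threshold=40.0):
    """Run a MEWMA chart on tangent coordinates.

    Returns the index of the first batch that raises an alarm, or None.
    `threshold` is a hypothetical placeholder, not calibrated to an ARL_0.
    """
    z = np.zeros(len(levels))
    scale = (2.0 - lam) / lam  # inverse of the asymptotic MEWMA variance factor
    for t, batch in enumerate(stream):
        v = np.quantile(batch, levels) - barycenter_q  # tangent coordinates
        z = lam * v + (1.0 - lam) * z                  # exponential smoothing
        stat = scale * float(z @ precision @ z)        # Mahalanobis-type statistic
        if stat > threshold:
            return t
    return None
```

In practice the threshold would be tuned, for example by simulation on in-control data, so that the chart attains the desired in-control average run length before detection delays are compared across methods.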

Problem

Research questions and friction points this paper is trying to address.

change-point detection
distribution-valued data
Wasserstein space
distributional shift
online monitoring

Innovation

Methods, ideas, or system contributions that make the work stand out.

distribution-valued data
online change point detection
2-Wasserstein space
Fréchet barycenter
tangent space embedding