Adversarially Robust Topological Inference

2022-06-03
Citations: 1
Influential: 0
AI Summary
Outliers severely compromise the statistical consistency of persistent homology. Method: This paper introduces the Median-of-Means Distance (MoM Dist), the first distance function for topological data analysis (TDA) that incorporates the Median-of-Means (MoM) estimation paradigm from robust statistics. MoM Dist constructs a robust sublevel set filtration and a weighted filtration, enabling consistent estimation of the underlying population's true topological structure even in the presence of outliers. Contribution/Results: We establish theoretical guarantees showing that the induced filtration satisfies strong consistency and near-minimax optimality. Empirical evaluations demonstrate that MoM Dist significantly outperforms standard distance functions under both stochastic noise and adversarial perturbations. This work establishes a new paradigm for robust topological inference, bridging robust statistics and TDA to enhance reliability in real-world, outlier-contaminated settings.
Abstract
The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology -- a backbone of the topological data analysis pipeline. Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers. In this work, we develop a framework of statistical inference for persistent homology in the presence of outliers. Drawing inspiration from recent developments in robust statistics, we propose a median-of-means variant of the distance function (MoM Dist) and establish its statistical properties. In particular, we show that, even in the presence of outliers, the sublevel filtrations and weighted filtrations induced by MoM Dist are both consistent estimators of the true underlying population counterpart and exhibit near minimax-optimal performance in adversarial settings. Finally, we demonstrate the advantages of the proposed methodology through simulations and applications.
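The median-of-means idea in the abstract can be illustrated with a short sketch. It assumes one natural form of the construction, which the text above does not spell out: the sample is split into disjoint blocks, the ordinary distance-to-set function is evaluated on each block, and the median of those block-wise distances is returned. The function name mom_dist, the parameters n_blocks and rng, and the random partitioning are illustrative choices, not taken from the paper.

```python
import numpy as np


def mom_dist(query, sample, n_blocks, rng=None):
    """Median, over disjoint blocks of the sample, of the distance from
    each query point to its nearest neighbour within the block.

    Assumed form of a median-of-means distance; not the authors' code.
    """
    rng = np.random.default_rng(rng)
    query = np.atleast_2d(np.asarray(query, dtype=float))
    sample = np.asarray(sample, dtype=float)
    if not 1 <= n_blocks <= len(sample):
        raise ValueError("n_blocks must be between 1 and the sample size")

    # Randomly partition the sample indices into n_blocks disjoint blocks.
    blocks = np.array_split(rng.permutation(len(sample)), n_blocks)

    block_dists = np.empty((n_blocks, len(query)))
    for q, idx in enumerate(blocks):
        # Pairwise differences: (n_query, block_size, dim).
        diffs = query[:, None, :] - sample[idx][None, :, :]
        # Ordinary distance function to the block: nearest-point distance.
        block_dists[q] = np.linalg.norm(diffs, axis=2).min(axis=1)

    # The median across blocks limits the influence of contaminated blocks.
    return np.median(block_dists, axis=0)


if __name__ == "__main__":
    # 100 points on the unit circle plus 5 outliers at its centre.
    theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
    circle = np.column_stack([np.cos(theta), np.sin(theta)])
    points = np.vstack([circle, np.zeros((5, 2))])

    centre = [[0.0, 0.0]]
    print("plain distance:", mom_dist(centre, points, n_blocks=1, rng=0))
    print("MoM distance:  ", mom_dist(centre, points, n_blocks=11, rng=0))
```

In this toy example, using 11 blocks for 5 outliers guarantees that a majority of blocks contain no outlier, so the median at the centre stays at the distance to the circle, whereas the plain distance function (a single block) collapses to zero there. The block count is a tuning choice in this sketch; more blocks tolerate more outliers but leave fewer points per block.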
Problem

Research questions and friction points this paper is trying to address.

Develop robust statistical inference for persistent homology with outliers
Propose median-of-means distance function to handle adversarial settings
Ensure consistency and optimal performance in topological data analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Median-of-means variant for distance function
Consistent estimators for sublevel filtrations
Near minimax-optimal in adversarial settings
Siddharth Vishwanath
University of California San Diego
Statistical Learning Theory, Topological Data Analysis
Bharath K. Sriperumbudur
Department of Statistics, The Pennsylvania State University
K. Fukumizu
The Institute of Statistical Mathematics
S. Kuriki
The Institute of Statistical Mathematics