🤖 AI Summary
This work addresses the challenge of topological representation for large-scale geometric data. Existing methods have known limitations: geometric simplicial complexes scale poorly with data size, while Mapper graphs depend heavily on manual hyperparameter tuning and capture only low-dimensional topological features. We propose the first differentiable optimization framework that learns topology-preserving covers end-to-end, eliminating dependence on predefined geometric structures or hand-crafted design. Our core innovation is modeling the cover as a differentiable neural network and introducing topologically grounded loss functions (e.g., nerve consistency) so that the covering subsets can be parameterized and optimized by gradient descent. The resulting simplicial complexes are more compact than those produced by standard topological inference approaches, and they represent large-scale topology more faithfully than Mapper-type algorithms.
📝 Abstract
Classical unsupervised learning methods like clustering and linear dimensionality reduction parametrize large-scale geometry when it is discrete or linear, while more modern methods from manifold learning find low-dimensional representations or infer local geometry by constructing a graph on the input data. More recently, topological data analysis popularized the use of simplicial complexes to represent data topology, with two main methodologies: topological inference with geometric complexes and large-scale topology visualization with Mapper graphs. Central to both is the nerve construction from topology, which builds a simplicial complex given a cover of a space by subsets. While successful, these methodologies have limitations: geometric complexes scale poorly with data size, and Mapper graphs can be hard to tune and only contain low-dimensional information. In this paper, we propose to study the problem of learning covers in its own right, and from the perspective of optimization. We describe a method for learning topologically faithful covers of geometric datasets, and show that the simplicial complexes thus obtained can outperform standard topological inference approaches in terms of size, and Mapper-type algorithms in terms of representation of large-scale topology.
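
To make the nerve construction mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the paper's method: the function name `nerve`, the `max_dim` parameter, and the toy cover are assumptions made for illustration only. Each cover element becomes a vertex, and a set of vertices spans a simplex whenever the corresponding cover elements share at least one point.

```python
from itertools import combinations


def nerve(cover, max_dim=2):
    """Return the simplices of the nerve of `cover` up to dimension `max_dim`.

    `cover` is a list of collections of point indices (hypothetical input
    format). A simplex is a tuple of cover-element indices whose sets have
    a non-empty common intersection.
    """
    sets = [set(s) for s in cover]
    simplices = []
    # A k-subset of cover elements is a (k - 1)-dimensional simplex.
    for k in range(1, max_dim + 2):
        for idx in combinations(range(len(sets)), k):
            if set.intersection(*(sets[i] for i in idx)):
                simplices.append(idx)
    return simplices


# Toy example: six points sampled around a circle, covered by three
# overlapping arcs. Each pair of arcs shares one point, but no point
# lies in all three.
arcs = [{0, 1, 2}, {2, 3, 4}, {4, 5, 0}]
complex_ = nerve(arcs)
print(complex_)
```

In this toy cover the nerve has three vertices and three edges but no filled triangle, so it is a hollow triangle, which has the same topology as the circle the points were sampled from. The brute-force enumeration over all subsets is only practical for small covers; it is meant to show the definition, not an efficient algorithm.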