🤖 AI Summary
This paper addresses two difficulties in the statistical analysis of hypergraphs: hyperedges that overlap, are nested, and vary widely in size and sparsity, and the lack of theoretical guarantees for inferring population-level structure from a sample. It proposes a novel sample-to-population estimation framework that embeds the units of a hypergraph in a latent (unobserved) hyperbolic space. Parameters are learned through a probabilistic model and scalable manifold optimization algorithms. The paper shows that the positions of units are identifiable up to rotations and provides non-asymptotic theoretical guarantees based on samples from a hypergraph. The model captures core-periphery structure together with local proximity among units. An application to historical media reports recovers core-periphery structure and proximity among U.S. politicians, illustrating the framework's interpretability and scalability.
📝 Abstract
Hypergraphs are useful mathematical representations of overlapping and nested subsets of interacting units, including groups of genes or brain regions, economic cartels, political or military coalitions, and groups of products that are purchased together. Despite the vast range of applications, the statistical analysis of hypergraphs is challenging: There are many hyperedges of small and large sizes, and hyperedges can overlap or be nested. We develop a novel statistical approach to hypergraphs with overlapping and nested hyperedges of varying sizes and levels of sparsity, which is amenable to scalable sample-to-population estimation with non-asymptotic theoretical guarantees. First, we introduce a probabilistic framework that embeds the units of a hypergraph in an unobserved hyperbolic space capturing core-periphery structure along with local structure in hypergraphs. Second, we develop scalable manifold optimization algorithms for learning hyperbolic space models based on samples from a hypergraph. Third, we show that the positions of units are identifiable (up to rotations) and provide non-asymptotic theoretical guarantees based on samples from a hypergraph. We use the framework to detect core-periphery structure along with proximity among U.S. politicians based on historical media reports.
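To make the geometric idea concrete, the following minimal sketch computes hyperbolic distances in the Poincaré disk, one standard model of hyperbolic space. The abstract does not say which model of hyperbolic space or which hyperedge probability the authors use, so the choice of the Poincaré disk, the function names, and the example positions are all assumptions made for illustration. The sketch shows two properties mentioned above: a unit near the origin (the "core") is closer to peripheral units than they are to each other, and rotating all positions leaves every pairwise distance unchanged, which is why positions can only be identified up to rotations.

```python
import numpy as np


def poincare_distance(u, v):
    """Hyperbolic distance between two points of the open unit disk
    (Poincare disk model; assumed here for illustration only)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_diff / denom)


def pairwise_distances(points):
    """Matrix of hyperbolic distances between all pairs of points."""
    n = len(points)
    return np.array(
        [[poincare_distance(points[i], points[j]) for j in range(n)]
         for i in range(n)]
    )


def rotate(points, angle):
    """Rotate 2-D points about the origin by the given angle (radians)."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return points @ rotation.T


# Hypothetical positions: unit 0 sits near the origin (core),
# units 1 and 2 sit near the boundary (periphery).
positions = np.array([[0.05, 0.0], [0.9, 0.0], [0.0, 0.9]])

D = pairwise_distances(positions)
D_rot = pairwise_distances(rotate(positions, 0.7))

print("core to periphery:     ", D[0, 1], D[0, 2])
print("periphery to periphery:", D[1, 2])
print("rotation-invariant:    ", np.allclose(D, D_rot))
```

In this toy configuration the two peripheral units are far apart in hyperbolic distance even though their Euclidean separation is modest, because distances grow rapidly near the boundary of the disk. Any model whose likelihood depends on positions only through such distances cannot distinguish a configuration from its rotated copy.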