🤖 AI Summary
Hypergraphs model high-order relationships, but they can grow large enough to create bottlenecks in storage, transmission, and query efficiency. To address this, we propose HyperCSA, a novel lossless hypergraph compression framework based on suffix sorting that answers standard queries directly on the compressed representation. Its core ideas are: (i) encoding hyperedge sequences via a compressed suffix array (CSA) so that adjacency relations are represented implicitly, and (ii) lightweight metadata structures for efficient navigation. On large real-world hypergraphs, HyperCSA reduces storage to 26%–79% of the original file size and evaluates neighbor queries 6×–40× faster than both standard data structures and other hypergraph compression approaches. It also scales to larger hypergraphs than existing approaches can handle, which supports its practical viability for hypergraph analytics.
📝 Abstract
Hypergraphs model complex, non-binary relationships such as co-authorships, social group memberships, and recommendations. Like traditional graphs, hypergraphs can grow large, posing challenges for storage, transmission, and query performance. We propose HyperCSA, a novel compression method for hypergraphs that maintains support for standard queries over the succinct representation. HyperCSA compresses real-world hypergraphs to between 26% and 79% of the original file size, outperforming existing methods on all large hypergraphs in our experiments. HyperCSA also scales to larger datasets than existing approaches. Finally, for common real-world hypergraphs, HyperCSA evaluates neighbor queries 6 to 40 times faster than both standard data structures and other hypergraph compression approaches.
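To make the notion of a neighbor query over a suffix-sorted hyperedge sequence concrete, here is a minimal, uncompressed sketch. It is not the HyperCSA algorithm: the function names `build_index` and `neighbors`, the layout of the concatenated sequence, and the naive suffix sort are all assumptions made for illustration, and no compression is performed. It only shows how sorting the suffixes of a concatenated hyperedge sequence groups all occurrences of a node together, so that its neighbors can be collected without a separate adjacency list.

```python
from bisect import bisect_left, bisect_right


def build_index(hyperedges):
    """Concatenate hyperedges into one sequence and suffix-sort it (naively)."""
    seq = []     # node ids, one hyperedge after another
    starts = []  # starts[i] = offset of hyperedge i in seq
    for edge in hyperedges:
        starts.append(len(seq))
        seq.extend(sorted(edge))
    starts.append(len(seq))  # sentinel marking the end of the last hyperedge

    # Naive suffix array: positions ordered by the suffix that begins there.
    # A real compressed suffix array would avoid materializing the suffixes.
    sa = sorted(range(len(seq)), key=lambda i: seq[i:])
    first = [seq[i] for i in sa]  # first symbol of each sorted suffix
    return seq, starts, sa, first


def neighbors(index, v):
    """Return all nodes that share at least one hyperedge with node v."""
    seq, starts, sa, first = index
    # All suffixes starting with v form one contiguous range in the suffix array.
    lo, hi = bisect_left(first, v), bisect_right(first, v)
    result = set()
    for pos in sa[lo:hi]:
        e = bisect_right(starts, pos) - 1  # hyperedge containing position pos
        result.update(seq[starts[e]:starts[e + 1]])
    result.discard(v)
    return result


# Hypothetical toy hypergraph with three hyperedges.
index = build_index([{1, 2, 3}, {2, 4}, {3, 4, 5}])
print(neighbors(index, 2))
```

In this toy hypergraph, node 2 appears in the first two hyperedges, so the query returns the other members of those two hyperedges. The sketch stores the sequence, the offsets, and the suffix array in plain lists, so it illustrates the query pattern only, not the space savings reported above.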