Compressing Hypergraphs using Suffix Sorting

📅 2025-06-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Hypergraphs model high-order relationships, but large instances create bottlenecks for storage, transmission, and query efficiency. To address this, the authors propose HyperCSA, a lossless hypergraph compression method based on suffix sorting that answers standard queries directly on the compressed representation. Its core ideas are: (i) encoding hyperedge sequences in a compressed suffix array (CSA) so that adjacency relations are represented implicitly, and (ii) lightweight metadata structures for efficient navigation. On real-world hypergraphs, HyperCSA reduces storage to 26% to 79% of the original file size and, for common real-world hypergraphs, evaluates neighbor queries 6 to 40 times faster than both standard data structures and other hypergraph compression approaches. It also scales to larger datasets than existing approaches can process.
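To make the suffix-sorting idea concrete, here is a minimal Python sketch. It is not HyperCSA's actual data structure: it uses a plain, uncompressed suffix array built by naive sorting, and the layout (hyperedges concatenated in sorted order, plus a `starts` array marking hyperedge boundaries) and all function names are assumptions chosen for illustration. It only shows why suffix order helps: all occurrences of a vertex form one contiguous range in the suffix array, so the hyperedges containing that vertex can be located with a binary search.

```python
from bisect import bisect_left, bisect_right

def build_index(hyperedges):
    """Concatenate hyperedges into one sequence and suffix-sort it.

    Illustrative layout (an assumption, not the paper's encoding):
    seq    - all vertices, hyperedge by hyperedge, each hyperedge sorted
    starts - starts[e] is the position where hyperedge e begins
    sa     - suffix array of seq (naive construction, for small inputs only)
    first  - first symbol of each suffix in suffix order (sorted by construction)
    """
    seq, starts = [], []
    for edge in hyperedges:
        starts.append(len(seq))
        seq.extend(sorted(edge))
    starts.append(len(seq))  # sentinel: end of the last hyperedge
    sa = sorted(range(len(seq)), key=lambda i: seq[i:])
    first = [seq[i] for i in sa]
    return seq, starts, sa, first

def edges_containing(v, index):
    """Return the ids of all hyperedges that contain vertex v."""
    seq, starts, sa, first = index
    lo, hi = bisect_left(first, v), bisect_right(first, v)
    # Suffixes starting with v occupy sa[lo:hi]; map each position to its hyperedge.
    return sorted({bisect_right(starts, sa[k]) - 1 for k in range(lo, hi)})

def neighbors(v, index):
    """Return all vertices that share at least one hyperedge with v."""
    seq, starts, _, _ = index
    result = set()
    for e in edges_containing(v, index):
        result.update(seq[starts[e]:starts[e + 1]])
    result.discard(v)
    return sorted(result)

index = build_index([{1, 2, 3}, {2, 4}, {3, 4, 5}])
print(edges_containing(2, index))  # [0, 1]
print(neighbors(2, index))         # [1, 3, 4]
```

A compressed suffix array would replace `sa` and `seq` with a succinct structure that supports the same lookups; that part is outside the scope of this sketch.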

📝 Abstract

Hypergraphs model complex, non-binary relationships like co-authorships, social group memberships, and recommendations. Like traditional graphs, hypergraphs can grow large, posing challenges for storage, transmission, and query performance. We propose HyperCSA, a novel compression method for hypergraphs that maintains support for standard queries over the succinct representation. HyperCSA compresses real-world hypergraphs to 26% to 79% of their original file size, outperforming existing methods on all large hypergraphs in our experiments. Additionally, HyperCSA scales to larger datasets than existing approaches. Furthermore, for common real-world hypergraphs, HyperCSA evaluates neighbor queries 6 to 40 times faster than both standard data structures and other hypergraph compression approaches.
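For context, the following Python sketch shows what an uncompressed "standard data structure" baseline for neighbor queries might look like. It is an illustration only: the abstract does not specify which baselines were measured, and the definition of a neighbor query used here (all vertices sharing at least one hyperedge with the query vertex) as well as the names `build_incidence` and `baseline_neighbors` are assumptions.

```python
from collections import defaultdict

def build_incidence(hyperedges):
    """Map each vertex to the ids of the hyperedges that contain it."""
    incidence = defaultdict(list)
    for eid, edge in enumerate(hyperedges):
        for v in edge:
            incidence[v].append(eid)
    return incidence

def baseline_neighbors(v, hyperedges, incidence):
    """Return all vertices that share at least one hyperedge with v."""
    result = set()
    for eid in incidence.get(v, []):
        result.update(hyperedges[eid])
    result.discard(v)
    return sorted(result)

# Co-authorship example: each hyperedge is the author set of one paper.
papers = [["alice", "bob", "carol"], ["bob", "dave"], ["carol", "dave", "erin"]]
incidence = build_incidence(papers)
print(baseline_neighbors("bob", papers, incidence))  # ['alice', 'carol', 'dave']
```

This representation stores every vertex occurrence twice (once in the hyperedge list and once in the incidence lists), which is the kind of overhead a compressed representation aims to avoid.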

Problem

Research questions and friction points this paper is trying to address.

Compressing large hypergraphs for efficient storage and transmission

Supporting standard queries on compressed hypergraph representations

Improving query performance and scalability over existing methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses suffix sorting for hypergraph compression

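Compresses real-world hypergraphs to 26% to 79% of the original file size

Evaluates neighbor queries 6 to 40 times faster than standard data structures and other hypergraph compression approaches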

🔎 Similar Papers

Minimal Algorithmic Information Loss Methods for Dimension Reduction, Feature Selection and Network Sparsification.

2018-02-16, Citations: 15
