Efficient Top-k s-Biplexes Search over Large Bipartite Graphs

📅 2024-09-27

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses top-k s-biplex mining on large bipartite graphs, where an s-biplex is a subgraph in which each vertex is non-adjacent to at most s vertices on the opposite side. Enumerating all s-biplexes is neither necessary nor computationally affordable in practice, so the authors define the top-k s-biplex search (TBS) problem, which asks for the k maximal s-biplexes with the most vertices, and prove it NP-hard for any fixed k ≥ 1. They propose MVBP, a branching algorithm that improves on the simple 2^n enumeration, and FastMVBP, which adds three acceleration techniques: 2-hop decomposition, single-side bounds, and progressive search. FastMVBP runs in O*(γ_s^{d_2}) time, where γ_s < 2 and d_2 is much smaller than the number of vertices in sparse real-world graphs; for example, d_2 is only 67 on AmazonRatings, which has more than 3 million vertices. Experiments on eight real-world and synthetic datasets show FastMVBP outperforming the benchmark algorithms by up to three orders of magnitude in several instances.
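The s-biplex condition can be made concrete with a small checker. The following is a minimal sketch, not code from the paper: the function name `is_s_biplex`, the adjacency-dictionary representation, and the toy graph are all assumptions chosen for illustration.

```python
from typing import Dict, Hashable, Iterable, Set


def is_s_biplex(
    adj: Dict[Hashable, Set[Hashable]],
    left: Iterable[Hashable],
    right: Iterable[Hashable],
    s: int,
) -> bool:
    """Return True if (left, right) induces an s-biplex.

    `adj` is a hypothetical representation: it maps each vertex to the
    set of its neighbors on the opposite side of the bipartite graph.
    """
    left_set, right_set = set(left), set(right)
    # Every left vertex may be non-adjacent to at most s right vertices.
    for u in left_set:
        if len(right_set - adj.get(u, set())) > s:
            return False
    # Symmetrically for every right vertex.
    for v in right_set:
        if len(left_set - adj.get(v, set())) > s:
            return False
    return True


# Toy graph (illustrative only): edges a-x, a-y, b-x; the pair (b, y) is missing.
toy_adj = {"a": {"x", "y"}, "b": {"x"}, "x": {"a", "b"}, "y": {"a"}}
biclique_ok = is_s_biplex(toy_adj, {"a", "b"}, {"x", "y"}, 0)
one_biplex_ok = is_s_biplex(toy_adj, {"a", "b"}, {"x", "y"}, 1)
```

With s = 0 the condition reduces to a biclique, so the toy subgraph fails; with s = 1 the single missing edge is tolerated.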

📝 Abstract

In a bipartite graph, a subgraph is an $s$-biplex if each vertex of the subgraph is adjacent to all but at most $s$ vertices on the opposite side. The enumeration of $s$-biplexes from a given graph is a fundamental problem in bipartite graph analysis. However, in real-world data engineering, finding all $s$-biplexes is neither necessary nor computationally affordable. A more realistic problem is to identify some of the largest $s$-biplexes in a large input graph. We formulate this as the {\em top-$k$ $s$-biplex search (TBS) problem}, which aims to find the top-$k$ maximal $s$-biplexes with the most vertices, where $k$ is an input parameter. We prove that the TBS problem is NP-hard for any fixed $k \ge 1$. Then, we propose a branching algorithm, named MVBP, that improves on the simple $2^n$ enumeration algorithm. Furthermore, from a practical perspective, we investigate three techniques to improve the performance of MVBP: 2-hop decomposition, single-side bounds, and progressive search. Complexity analysis shows that the improved algorithm, named FastMVBP, has a running time of $O^*(\gamma_s^{d_2})$, where $\gamma_s < 2$ and $d_2$ is a parameter much smaller than the number of vertices in sparse real-world graphs; e.g., $d_2$ is only $67$ in the AmazonRatings dataset, which has more than $3$ million vertices. Finally, we conduct extensive experiments on eight real-world and synthetic datasets to demonstrate the empirical efficiency of the proposed algorithms. In particular, FastMVBP outperforms the benchmark algorithms by up to three orders of magnitude in several instances.
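To illustrate what the TBS problem asks for, here is a sketch of the simple exhaustive baseline that the abstract contrasts against. It is not MVBP or FastMVBP. The function names, the adjacency-dictionary input, and the toy graph are assumptions, and the sketch imposes no extra constraints (such as connectivity or minimum side sizes) that a full problem definition might include. It is only practical for tiny graphs because it visits every vertex subset.

```python
import heapq
from itertools import combinations


def misses_ok(adj, left, right, s):
    """True if every vertex misses at most s vertices on the opposite side."""
    return (all(len(right - adj[u]) <= s for u in left)
            and all(len(left - adj[v]) <= s for v in right))


def top_k_s_biplexes_bruteforce(adj, L, R, s, k):
    """Exhaustive sketch: enumerate all subsets, keep maximal s-biplexes,
    and return the k with the most vertices (ties broken arbitrarily)."""
    L, R = list(L), list(R)
    maximal = []
    for i in range(len(L) + 1):
        for ls in combinations(L, i):
            for j in range(len(R) + 1):
                for rs in combinations(R, j):
                    left, right = set(ls), set(rs)
                    if not misses_ok(adj, left, right, s):
                        continue
                    # The s-biplex property is hereditary, so a set is
                    # maximal exactly when no single vertex can be added.
                    extendable = any(
                        misses_ok(adj, left | {u}, right, s)
                        for u in L if u not in left
                    ) or any(
                        misses_ok(adj, left, right | {v}, s)
                        for v in R if v not in right
                    )
                    if not extendable:
                        maximal.append((frozenset(left), frozenset(right)))
    return heapq.nlargest(k, maximal, key=lambda p: len(p[0]) + len(p[1]))


# Toy graph (illustrative only): edges a-x, a-y, b-x.
adj = {"a": {"x", "y"}, "b": {"x"}, "x": {"a", "b"}, "y": {"a"}}
top1_s1 = top_k_s_biplexes_bruteforce(adj, ["a", "b"], ["x", "y"], s=1, k=1)
top_s0 = top_k_s_biplexes_bruteforce(adj, ["a", "b"], ["x", "y"], s=0, k=5)
```

On this toy graph, s = 1 yields the whole graph as the single maximal 1-biplex, while s = 0 yields the two maximal bicliques of three vertices each. The subset loop is the exponential cost that a branching algorithm with pruning aims to avoid.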

Problem

Research questions and friction points this paper is trying to address.

Efficiently find top-k maximal s-biplexes in large bipartite graphs

Prove that the TBS problem is NP-hard and propose the MVBP branching algorithm

Improve MVBP with 2-hop decomposition, single-side bounds, and progressive search

Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes MVBP branching algorithm for TBS problem

Uses 2-hop decomposition to enhance performance

Implements FastMVBP with progressive search optimization
