Optimal community detection in dense bipartite graphs

📅 2025-05-23
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper addresses the detection of a densely connected community in a high-dimensional bipartite graph of size $n_1 \times n_2$. Under the null hypothesis, the graph is a bipartite Erdős–Rényi random graph with edge probability $p_0$; under the alternative, an unknown $k_1 \times k_2$ subgraph has elevated edge probability $p_1 = p_0 + \delta$ with $\delta > 0$. The goal is to characterize the minimal detectable signal strength $\delta^*$, i.e., the smallest $\delta$ for which some test achieves sufficiently small type I and type II errors, through non-asymptotic upper and lower bounds. Methodologically, the paper proposes tests that combine hard-thresholded nonlinear statistics of the adjacency matrix. Its contribution is a pair of non-asymptotic upper and lower bounds on $\delta^*$ that, in contrast with previous work, match for any configuration of $n_1, n_2, k_1, k_2$, together with minimax-optimal tests that attain these limits when the graph is sufficiently dense.

📝 Abstract
We consider the problem of detecting a community of densely connected vertices in a high-dimensional bipartite graph of size $n_1 \times n_2$. Under the null hypothesis, the observed graph is drawn from a bipartite Erdős-Rényi distribution with connection probability $p_0$. Under the alternative hypothesis, there exists an unknown bipartite subgraph of size $k_1 \times k_2$ in which edges appear with probability $p_1 = p_0 + \delta$ for some $\delta > 0$, while all other edges outside the subgraph appear with probability $p_0$. Specifically, we provide non-asymptotic upper and lower bounds on the smallest signal strength $\delta^*$ that is both necessary and sufficient to ensure the existence of a test with small enough type one and type two errors. We also derive novel minimax-optimal tests achieving these fundamental limits when the underlying graph is sufficiently dense. Our proposed tests involve a combination of hard-thresholded nonlinear statistics of the adjacency matrix, the analysis of which may be of independent interest. In contrast with previous work, our non-asymptotic upper and lower bounds match for any configuration of $n_1, n_2, k_1, k_2$.
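For concreteness, the testing problem described in the abstract can be written out as below. The symbols $A$ (the $n_1 \times n_2$ biadjacency matrix) and $S_1$, $S_2$ (the unknown vertex subsets carrying the community) are notation assumed here for illustration, and independence of the edges under the alternative is likewise an assumption; none of these is stated in the abstract.

```latex
\begin{align*}
H_0 &: \quad A_{ij} \sim \mathrm{Bernoulli}(p_0) \ \text{independently for all } (i,j) \in [n_1] \times [n_2], \\[4pt]
H_1 &: \quad \exists\, S_1 \subseteq [n_1],\ S_2 \subseteq [n_2] \ \text{with } |S_1| = k_1,\ |S_2| = k_2 \ \text{such that, independently,} \\
    &  \quad A_{ij} \sim
       \begin{cases}
         \mathrm{Bernoulli}(p_0 + \delta), & (i,j) \in S_1 \times S_2, \\
         \mathrm{Bernoulli}(p_0),          & \text{otherwise.}
       \end{cases}
\end{align*}
```

In this notation, $\delta^*$ is the smallest $\delta$ for which some test of $H_0$ against $H_1$ has small enough type one and type two errors.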
Problem

Research questions and friction points this paper is trying to address.

Detecting dense communities in bipartite graphs
Determining minimal signal strength for reliable detection
Developing minimax-optimal tests for dense graphs
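The detection problem listed above can be explored numerically. The following is a minimal sketch, assuming NumPy is available: it samples the null and planted models and estimates both error rates of a simple total-edge-count test by Monte Carlo. The function names (`sample_bipartite`, `total_edge_statistic`, `error_rates`) and the total-count baseline are illustrative assumptions, not the tests proposed in the paper.

```python
import numpy as np


def sample_bipartite(n1, n2, p0, rng, k1=0, k2=0, delta=0.0):
    """Sample an n1 x n2 biadjacency matrix with edge probability p0.

    If k1 and k2 are positive, a uniformly chosen k1 x k2 block gets
    edge probability p0 + delta (the planted community).
    """
    probs = np.full((n1, n2), p0, dtype=float)
    if k1 > 0 and k2 > 0:
        rows = rng.choice(n1, size=k1, replace=False)
        cols = rng.choice(n2, size=k2, replace=False)
        probs[np.ix_(rows, cols)] = p0 + delta
    return (rng.random((n1, n2)) < probs).astype(np.int8)


def total_edge_statistic(A, p0):
    """Total edge count, centered and standardized under the null."""
    n1, n2 = A.shape
    return (A.sum() - n1 * n2 * p0) / np.sqrt(n1 * n2 * p0 * (1 - p0))


def error_rates(n1, n2, k1, k2, p0, delta, threshold, trials, seed=0):
    """Monte Carlo estimates of (type I, type II) error for the test
    that rejects the null when the statistic exceeds `threshold`."""
    rng = np.random.default_rng(seed)
    null_stats = np.array([
        total_edge_statistic(sample_bipartite(n1, n2, p0, rng), p0)
        for _ in range(trials)
    ])
    alt_stats = np.array([
        total_edge_statistic(
            sample_bipartite(n1, n2, p0, rng, k1=k1, k2=k2, delta=delta), p0
        )
        for _ in range(trials)
    ])
    type1 = float(np.mean(null_stats > threshold))   # false alarms under H0
    type2 = float(np.mean(alt_stats <= threshold))   # misses under H1
    return type1, type2


if __name__ == "__main__":
    # Strong signal versus weak signal on a 100 x 100 graph.
    print(error_rates(100, 100, 30, 30, 0.5, 0.40, threshold=3.0, trials=50))
    print(error_rates(100, 100, 30, 30, 0.5, 0.01, threshold=3.0, trials=50))
```

Sweeping `delta` in such a simulation gives an empirical feel for the signal strength at which a given test starts to separate the two hypotheses. It says nothing about optimality, which is what the paper's bounds address.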
Innovation

Methods, ideas, or system contributions that make the work stand out.

Non-asymptotic bounds for signal strength
Minimax-optimal tests for dense graphs
Hard-thresholded nonlinear adjacency statistics
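To illustrate what a hard-thresholded statistic of the adjacency matrix can look like, here is a generic sketch: it standardizes the row sums under the null and adds up only those that exceed a cutoff `t`. This is an assumed, simplified construction for illustration. The summary does not specify the paper's actual statistics, and the names `hard_threshold` and `thresholded_row_statistic` are hypothetical.

```python
import numpy as np


def hard_threshold(x, t):
    """One-sided hard thresholding: keep entries strictly above t, zero the rest."""
    x = np.asarray(x, dtype=float)
    return np.where(x > t, x, 0.0)


def thresholded_row_statistic(A, p0, t):
    """Sum of null-standardized row sums that survive hard thresholding at t.

    Rows touching a planted dense block tend to have large standardized
    sums, so they are the ones most likely to pass the cutoff.
    """
    A = np.asarray(A)
    n1, n2 = A.shape
    z = (A.sum(axis=1) - n2 * p0) / np.sqrt(n2 * p0 * (1 - p0))
    return float(hard_threshold(z, t).sum())


if __name__ == "__main__":
    # Tiny hand-made example with p0 = 0.5, so each row sum has null
    # mean 2 and standard deviation 1.
    A = np.array([
        [1, 1, 1, 1],
        [1, 0, 1, 0],
        [0, 0, 0, 0],
        [1, 1, 1, 0],
    ])
    print(thresholded_row_statistic(A, p0=0.5, t=1.5))
    print(thresholded_row_statistic(A, p0=0.5, t=0.5))
```

The thresholding step is what makes the statistic nonlinear in the entries of the matrix: rows whose deviation is within the noise level contribute nothing, so the noise from the many rows outside the community does not accumulate as it would in a plain sum.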