Detecting weighted hidden cliques

📅 2025-06-26
🤖 AI Summary
This paper studies the hypothesis testing problem of detecting a hidden clique in weighted graphs: given a complete graph on $n$ vertices, under the null hypothesis all edge weights are i.i.d. from distribution $P$; under the alternative, there exists a $k$-vertex subset whose internal edges follow distribution $Q$, while all other edges follow $P$. It is the first work to generalize hidden clique detection to general real-valued weighted graphs, unifying settings where $P$ and $Q$ are either fully known or partially unknown. Theoretically, it establishes the statistical detectability threshold as $k = \Omega(\sqrt{n})$, derives tight information-theoretic lower bounds for the indistinguishable regime, and obtains asymptotically sharp minimax risk bounds when $Q$ is not absolutely continuous with respect to $P$. Algorithmically, it proposes an efficient spectral test based on the adjacency matrix, achieving optimal statistical rate with $O(n^2)$ time complexity.
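
One way to write the two hypotheses in symbols, as a sketch: the names $W_{ij}$ (weight of the edge between vertices $i$ and $j$) and $S$ (the planted vertex set) are notation assumed here for illustration and are not taken from the paper.

$$
H_0:\; W_{ij} \overset{\text{i.i.d.}}{\sim} P \ \text{ for all } 1 \le i < j \le n,
\qquad
H_1:\; W_{ij} \sim \begin{cases} Q, & \text{if } i, j \in S, \\ P, & \text{otherwise,} \end{cases}
$$

with $|S| = k$ and the weights assumed here to be independent given $S$.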

📝 Abstract
We study a generalization of the classical hidden clique problem to graphs with real-valued edge weights. Formally, we define a hypothesis testing problem. Under the null hypothesis, edges of a complete graph on $n$ vertices are associated with independent and identically distributed edge weights from a distribution $P$. Under the alternate hypothesis, $k$ vertices are chosen at random and the edge weights between them are drawn from a distribution $Q$, while the remaining edge weights are sampled from $P$. The goal is to decide, upon observing the edge weights, which of the two hypotheses they were generated from. We investigate the problem under two different scenarios: (1) when $P$ and $Q$ are completely known, and (2) when there is only partial information about $P$ and $Q$. In the first scenario, we obtain statistical limits on $k$ for when the two hypotheses are distinguishable and when they are not. Additionally, in each of the scenarios, we provide bounds on the minimal risk of the hypothesis testing problem when $Q$ is not absolutely continuous with respect to $P$. We also provide computationally efficient spectral tests that can distinguish the two hypotheses as long as $k=\Omega(\sqrt{n})$ in both scenarios.
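
The testing setup in the abstract can be simulated in a few lines. The following is a minimal sketch, not the paper's test: it assumes $P = N(0, 1)$ and $Q = N(1.5, 1)$, centers the weights using the known mean and standard deviation of $P$, and rejects the null when an estimate of the largest absolute eigenvalue exceeds a heuristic threshold $c\sqrt{n}$. The function names, the constant `c = 2.5`, and the use of power iteration are all illustrative assumptions.

```python
import math
import random


def sample_weights(n, k, draw_p, draw_q, rng, planted):
    """Symmetric n x n weight matrix with zero diagonal.

    With planted=False every edge weight comes from draw_p (null).
    With planted=True a uniformly random k-subset gets weights from draw_q.
    """
    clique = set(rng.sample(range(n), k)) if planted else set()
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            inside = i in clique and j in clique
            w[i][j] = w[j][i] = draw_q(rng) if inside else draw_p(rng)
    return w


def spectral_norm(a, rng, iters=100):
    """Estimate the largest absolute eigenvalue of symmetric a by power iteration."""
    n = len(a)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    est = 0.0
    for _ in range(iters):
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]                      # unit vector
        v = [sum(row[j] * v[j] for j in range(n)) for row in a]
        est = math.sqrt(sum(x * x for x in v))         # ||a v|| for unit v
    return est


def spectral_test(w, mean_p, std_p, rng, c=2.5):
    """Return True (reject the null) if the centered spectrum is unusually large.

    The threshold c * sqrt(n) is a heuristic assumption: for standardized
    i.i.d. noise the spectral norm concentrates near 2 * sqrt(n).
    """
    n = len(w)
    centered = [
        [(w[i][j] - mean_p) / std_p if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return spectral_norm(centered, rng) > c * math.sqrt(n)


rng = random.Random(0)
draw_p = lambda r: r.gauss(0.0, 1.0)   # assumed null distribution P
draw_q = lambda r: r.gauss(1.5, 1.0)   # assumed planted distribution Q
n, k = 80, 30
null_w = sample_weights(n, k, draw_p, draw_q, rng, planted=False)
alt_w = sample_weights(n, k, draw_p, draw_q, rng, planted=True)
print(spectral_test(null_w, 0.0, 1.0, rng), spectral_test(alt_w, 0.0, 1.0, rng))
```

Power iteration is used only to keep the sketch dependency-free; with a numerical library, a symmetric eigenvalue solver would be the more usual choice. The parameters above put $k$ well above $\sqrt{n}$, which is the regime where the abstract says spectral tests succeed.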
Problem

Research questions and friction points this paper is trying to address.

Detect hidden cliques in weighted graphs
Distinguish null and alternate hypotheses with known/unknown distributions
Determine statistical limits and efficient tests for distinguishability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalized hidden clique problem with real weights
Statistical limits for hypothesis distinguishability
Efficient spectral tests for large cliques
Urmisha Chatterjee
Indian Statistical Institute, Kolkata, India.
Karissa Huang
University of California, Berkeley, USA.
Ritabrata Karmakar
Indian Statistical Institute, Kolkata, India.
B. R. Vinay Kumar
Eindhoven University of Technology, The Netherlands.
Gábor Lugosi
Department of Economics and Business, Pompeu Fabra University, Barcelona, Spain; ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Spain; Barcelona School of Economics.
Nandan Malhotra
University of Leiden, The Netherlands.
Anirban Mandal
Interim Director, Network Research and Infrastructure, RENCI - UNC Chapel Hill
Distributed systems, high performance computing, cloud computing, scientific workflows, scheduling
Maruf Alam Tarafdar
Indian Statistical Institute, Delhi, India.