🤖 AI Summary
This paper studies the hypothesis testing problem of detecting a hidden clique in weighted graphs: given a complete graph on $ n $ vertices, under the null hypothesis all edge weights are i.i.d. from distribution $ P $; under the alternative, there exists a $ k $-vertex subset whose internal edges follow distribution $ Q $, while all other edges follow $ P $. It is the first work to generalize hidden clique detection to general real-valued weighted graphs, unifying settings where $ P $ and $ Q $ are either fully known or partially unknown. Theoretically, it establishes the statistical detectability threshold as $ k = Omega(sqrt{n}) $, derives tight information-theoretic lower bounds for the indistinguishable regime, and obtains asymptotically sharp minimax risk bounds when $ Q $ is not absolutely continuous with respect to $ P $. Algorithmically, it proposes an efficient spectral test based on the adjacency matrix, achieving optimal statistical rate with $ O(n^2) $ time complexity.
📝 Abstract
We study a generalization of the classical hidden clique problem to graphs with real-valued edge weights. Formally, we define a hypothesis testing problem. Under the null hypothesis, edges of a complete graph on $n$ vertices are associated with independent and identically distributed edge weights from a distribution $P$. Under the alternate hypothesis, $k$ vertices are chosen at random and the edge weights between them are drawn from a distribution $Q$, while the remaining are sampled from $P$. The goal is to decide, upon observing the edge weights, which of the two hypotheses they were generated from. We investigate the problem under two different scenarios: (1) when $P$ and $Q$ are completely known, and (2) when there is only partial information of $P$ and $Q$. In the first scenario, we obtain statistical limits on $k$ when the two hypotheses are distinguishable, and when they are not. Additionally, in each of the scenarios, we provide bounds on the minimal risk of the hypothesis testing problem when $Q$ is not absolutely continuous with respect to $P$. We also provide computationally efficient spectral tests that can distinguish the two hypotheses as long as $k=Ω(sqrt{n})$ in both the scenarios.