🤖 AI Summary
This work addresses the problem of exact recovery of two communities in the stochastic block model under a query budget constraint, where information is accessible only through a noisy neighborhood oracle. The authors propose a two-stage adaptive querying strategy that integrates sub-sampled graph construction with a decaying edge probability model. This approach significantly outperforms non-adaptive uniform querying in terms of information-theoretic limits. Theoretical analysis demonstrates that the proposed adaptive method achieves exact community recovery using only $n + o(n)$ queries, whereas conventional non-adaptive methods require $mn$ queries for some constant $m > 1$. To the best of our knowledge, this is the first algorithm to attain exact recovery with sublinear query complexity in this setting.
📝 Abstract
We study exact community recovery in the two-community stochastic block model on $n$ vertices under limited and noisy access to network data. The learner may query a noisy neighborhood oracle that reveals each true neighbor of a queried vertex independently with fixed probability and never returns non-neighbors, subject to a finite query budget. We consider both oracle-only access and a combined model where the learner also observes a single subsampled copy of the underlying graph. For oracle-only access, balanced uniform querying gives a sharp non-adaptive benchmark: when each vertex is queried the same integer number of times, the observations reduce to an SBM with attenuated edge probabilities and the Abbe-Bandeira-Hall exact-recovery threshold applies. We show that this benchmark is not adaptively optimal: a two-stage adaptive strategy succeeds with $n+o(n)$ queries in a regime where balanced uniform querying requires $m n$ queries for some $m>1$. With an additional subsampled graph, we prove a sublinear-query adaptivity gap: balanced data-independent uniform querying with a sublinear budget does not improve over the subsampled graph alone, whereas adaptive querying can target a small set of uncertain vertices and achieve exact recovery. Thus adaptive data acquisition can strictly improve the information-theoretic limits of exact recovery.