Exact Recovery in the Data Block Model

📅 2026-02-05
🤖 AI Summary
This study addresses exact community recovery in networks with node-side information, such as attributes or labels, under the Data Block Model (DBM). By introducing the Chernoff–TV divergence, the work establishes a sharp information-theoretic threshold for exact recovery in the DBM and proves that recovery is impossible below this threshold. Building on this result, the authors develop an efficient community detection algorithm that achieves the threshold. Simulations show that incorporating node-side data improves community detection performance, which is consistent with the theoretical predictions and illustrates the practical benefit of using auxiliary node information in network inference tasks.

📝 Abstract
Community detection in networks is a fundamental problem in machine learning and statistical inference, with applications in social networks, biological systems, and communication networks. The stochastic block model (SBM) serves as a canonical framework for studying community structure, and exact recovery, identifying the true communities with high probability, is a central theoretical question. While classical results characterize the phase transition for exact recovery based solely on graph connectivity, many real-world networks contain additional data, such as node attributes or labels. In this work, we study exact recovery in the Data Block Model (DBM), an SBM augmented with node-associated data, as formalized by Asadi, Abbe, and Verdú (2017). We introduce the Chernoff–TV divergence and use it to characterize a sharp exact recovery threshold for the DBM. We further provide an efficient algorithm that achieves this threshold, along with a matching converse result showing impossibility below the threshold. Finally, simulations validate our findings and demonstrate the benefits of incorporating vertex data as side information in community detection.
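To make the setting in the abstract concrete, the following is a minimal sketch of sampling one graph with node-side data. It is an illustration under stated assumptions, not the paper's construction: it assumes two balanced communities, edge probabilities of the form a·log(n)/n within and b·log(n)/n across communities (the logarithmic-degree regime commonly used for exact recovery in the SBM), and node data given by a noisy copy of each true label. The function name `sample_dbm` and the parameters `a`, `b`, and `flip_prob` are hypothetical; the DBM as formalized in the cited work may allow more general node data.

```python
import math
import random


def sample_dbm(n, a, b, flip_prob, seed=0):
    """Sample a two-community graph with noisy per-node labels (illustrative only)."""
    rng = random.Random(seed)
    # Hidden community label for each node, drawn uniformly from {0, 1}.
    labels = [rng.randrange(2) for _ in range(n)]
    # Assumed logarithmic-degree regime: within- and across-community edge probabilities.
    p_in = min(1.0, a * math.log(n) / n)
    p_out = min(1.0, b * math.log(n) / n)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.add((i, j))
    # Assumed node data: the true label, flipped independently with probability flip_prob.
    data = [y ^ 1 if rng.random() < flip_prob else y for y in labels]
    return labels, edges, data


labels, edges, data = sample_dbm(n=200, a=9.0, b=2.0, flip_prob=0.2)
agreement = sum(x == y for x, y in zip(labels, data)) / len(labels)
print(f"{len(edges)} edges; node data matches the true label on {agreement:.0%} of nodes")
```

In this toy setup, a recovery algorithm would observe only `edges` and `data` and try to reconstruct `labels` exactly (up to a global relabeling); the node data acts as the side information discussed in the abstract.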
Problem

Research questions and friction points this paper is trying to address.

exact recovery
community detection
stochastic block model
node attributes
Data Block Model
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data Block Model
Exact Recovery
Chernoff–TV divergence
Community Detection
Side Information
Amir R. Asadi
Statistical Laboratory, Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
A. Davoodi
R. Javadi
Department of Mathematical Sciences, Isfahan University of Technology, Isfahan, Iran
Farzad Parvaresh
University of Isfahan
Information Theory, Coding, Machine Learning