🤖 AI Summary
Existing graph generation methods produce high-quality graphs but offer limited interpretability: they do not reveal the semantic rationale behind structural decisions. To address this, we bring topic modeling into graph generation and propose the Neural Graph Topic Model (NGTM). NGTM models a graph as a mixture of latent topics, each defining a distribution over semantically meaningful substructures (such as motifs or functional modules), and combines these topic distributions with a global structural variable, so that every generated graph can be traced semantically from local substructures to global structure. The model supports fine-grained structural control, including inducing biological properties through topic-level adjustments. While achieving competitive generation quality, NGTM improves interpretability: users can tune structural features of generated graphs by adjusting topic weights. Experiments support its generation quality, controllability, and interpretability.
📝 Abstract
Graph generation plays a pivotal role across numerous domains, including molecular design and knowledge graph construction. Although existing methods achieve considerable success in generating realistic graphs, their interpretability remains limited, often obscuring the rationale behind structural decisions. To address this challenge, we propose the Neural Graph Topic Model (NGTM), a novel generative framework inspired by topic modeling in natural language processing. NGTM represents graphs as mixtures of latent topics, each defining a distribution over semantically meaningful substructures, which facilitates explicit interpretability at both local and global scales. The generation process transparently integrates these topic distributions with a global structural variable, enabling clear semantic tracing of each generated graph. Experiments demonstrate that NGTM achieves competitive generation quality while uniquely enabling fine-grained control and interpretability, allowing users to tune structural features or induce biological properties through topic-level adjustments.
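The topic-mixture idea described above can be illustrated with a toy sampler. This is only a minimal sketch under stated assumptions, not the NGTM implementation: the topic names, substructure names, probabilities, and the functions `sample_categorical` and `generate_substructures` are all hypothetical, and the sketch omits both the neural parameterization and the global structural variable. It shows only how a graph's building blocks might be drawn from per-topic distributions while recording which topic produced each one.

```python
import random

# Hypothetical topics: each one is a distribution over named substructures.
# Names and probabilities are illustrative, not taken from the paper.
TOPICS = {
    "cyclic": {"triangle": 0.5, "ring6": 0.5},
    "tree_like": {"chain3": 0.6, "star4": 0.4},
}


def sample_categorical(dist, rng):
    """Draw one key from a {key: probability} mapping."""
    keys = list(dist)
    weights = [dist[k] for k in keys]
    return rng.choices(keys, weights=weights, k=1)[0]


def generate_substructures(topic_weights, n_slots, seed=0):
    """Sample (topic, substructure) pairs from a mixture of topics.

    Keeping the topic next to each sampled substructure gives a simple
    trace of which topic was responsible for which building block.
    """
    rng = random.Random(seed)
    trace = []
    for _ in range(n_slots):
        topic = sample_categorical(topic_weights, rng)
        substructure = sample_categorical(TOPICS[topic], rng)
        trace.append((topic, substructure))
    return trace


if __name__ == "__main__":
    # Shifting weight toward the "cyclic" topic yields more ring-like pieces.
    for topic, substructure in generate_substructures(
        {"cyclic": 0.8, "tree_like": 0.2}, n_slots=5
    ):
        print(f"{substructure:10s} <- topic {topic}")
```

In this toy setting, setting a topic's weight to zero removes its substructures from the sample entirely, which is a simplified analogue of the topic-level control the abstract describes.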