Scaling Novel Graph Generation via Lightweight Structure-Guided Autoregressive Models

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing graph generation models struggle to balance scalability and novelty: diffusion-based methods rely on costly full-adjacency operations and long denoising chains, many autoregressive and hybrid models scale at least quadratically, and these models tend to imitate training graphs rather than generalize beyond them. This work proposes a lightweight autoregressive framework that serializes graphs into regular edge sequences via a structure-guided topological ordering, enabling near log-linear generation. A two-phase training strategy combines exploration-oriented augmentation with iterative refinement to reduce overfitting and promote controlled novelty. The framework supports causal sequence backbones such as LSTM and Mamba-style models, and large-memory accelerators allow experiments on graph sequences longer than typical GPU memory permits. On both molecular and non-molecular benchmarks, the approach improves novelty while preserving high validity and uniqueness.

📝 Abstract

Generating realistic and diverse graphs is a key problem in machine learning, with applications in molecular discovery, circuit design, cybersecurity, and beyond. However, current graph generative models remain limited by scalability and novelty. Diffusion-based methods often require costly full-adjacency operations and long denoising chains, while many autoregressive and hybrid models have at least quadratic complexity. In addition, these models often imitate training graphs rather than generalize beyond them. We propose a lightweight autoregressive framework to address these issues. It uses a structure-guided topological ordering to serialize graphs into regular edge sequences, enabling near log-linear generation, and a two-phase training strategy that combines exploration-oriented augmentation with iterative refinement to reduce overfitting and promote controlled novelty. Experiments on molecular and non-molecular benchmarks show that our approach improves novelty while preserving high validity and uniqueness. The framework also supports both LSTM and Mamba-style causal sequence backbones, with large-memory accelerators enabling longer graph-sequence experiments beyond typical GPU limits.
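To make the idea of serializing a graph into an edge sequence concrete, here is a minimal sketch in Python. It is not the paper's ordering, which the abstract does not specify. As a stand-in it assumes a breadth-first ordering that starts from the highest-degree node and breaks ties by node id. The function name `serialize_graph` and the `(later_rank, earlier_rank)` edge encoding are hypothetical choices made only for this illustration.

```python
from collections import deque


def serialize_graph(num_nodes, edges):
    """Turn an undirected graph into (node_order, edge_sequence).

    ASSUMPTION: the ordering is a degree-guided BFS chosen for
    illustration; the paper's structure-guided ordering may differ.
    """
    # Build an adjacency map, ignoring self-loops and duplicate edges.
    adj = {v: set() for v in range(num_nodes)}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    def priority(v):
        # Higher degree first, then smaller node id.
        return (-len(adj[v]), v)

    order, seen = [], set()
    # Cover every connected component, starting from high-degree nodes.
    for start in sorted(adj, key=priority):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in sorted(adj[u], key=priority):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)

    # Relabel nodes by their position in the ordering.
    rank = {v: i for i, v in enumerate(order)}

    # Emit each edge once as (later_rank, earlier_rank), sorted so the
    # sequence grows the graph one ranked node at a time.
    sequence = sorted(
        (max(rank[u], rank[v]), min(rank[u], rank[v]))
        for u in adj
        for v in adj[u]
        if u < v
    )
    return order, sequence


# Usage: a path graph 0-1-2-3.
order, sequence = serialize_graph(4, [(0, 1), (1, 2), (2, 3)])
print(order)     # [1, 2, 0, 3]
print(sequence)  # [(1, 0), (2, 0), (3, 1)]
```

A causal sequence model such as an LSTM could then be trained to predict the next pair in `sequence` given the pairs before it. In this sketch the sorting steps cost O(m log m) for m edges; whether that matches the source of the near log-linear behavior reported in the abstract is not something the abstract states.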

Problem

Research questions and friction points this paper is trying to address.

graph generation

scalability

novelty

autoregressive models

generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

structure-guided autoregressive

lightweight graph generation

topological ordering