🤖 AI Summary
Existing network generation methods often overfit the observed data, neglect characteristic structural features, and incur high computational costs, which hinders efficient synthesis of high-fidelity networks. To address these limitations, this work proposes SyNGLER, a framework that learns node embeddings in a low-dimensional latent space and uses a distribution-free generator to resample embeddings and reconstruct networks from that space. The approach preserves essential topological properties, such as sparsity and degree heterogeneity, and comes with consistency results on the distance between the true and synthetic edge distributions. Empirical studies show that SyNGLER trains at lower computational cost than many existing deep architectures while better preserving network moments and degree distributions than existing approaches.
📝 Abstract
Network data are ubiquitous across the social sciences, biology, and information systems. Generating realistic synthetic network data has broad applications, from network simulation to scientific discovery. However, many existing black-box approaches to network generation tend to overfit the observed data, overlook characteristic network structure, and incur substantial computational overhead at scale. These practical challenges call for synthetic network generation methods that are both efficient and capable of capturing the structural properties of networks. In this paper, we introduce Synthetic Network Generation via Latent Embedding Reconstruction (SyNGLER), a general and efficient framework for synthetic network generation that builds on latent space network models. Given an observed network, SyNGLER first learns low-dimensional latent node embeddings via a latent space network model and then reconstructs the latent space by building a distribution-free generator over these embeddings. For generation, SyNGLER first samples (or resamples) node embeddings from the generator in the latent space and then produces synthetic networks using the latent space network model. Through the latent space framework, SyNGLER preserves distinctive network characteristics such as sparsity and node degree heterogeneity, while allowing for efficient training at lower computational cost than many existing deep architectures. We provide theoretical guarantees by developing consistency results on the distance between the true and synthetic edge distributions. Empirical studies further demonstrate the effectiveness of SyNGLER, which efficiently produces networks that better preserve key network characteristics, such as network moments and degree distributions, than existing approaches. Code is available at https://github.com/FeifanJiang/syngler.
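The embed, resample, and regenerate pipeline described in the abstract can be illustrated with a minimal sketch. This is not the SyNGLER implementation: it assumes a random dot product graph as a stand-in for the latent space network model, an adjacency spectral embedding as the fitting step, and plain bootstrap resampling of the fitted embeddings as a stand-in for the distribution-free generator. All function names (`fit_embeddings`, `resample_embeddings`, `generate_network`, `synthesize`) are hypothetical and do not come from the linked repository.

```python
import numpy as np

def fit_embeddings(adj, dim):
    """Assumed fitting step: adjacency spectral embedding with `dim` columns."""
    vals, vecs = np.linalg.eigh(adj.astype(float))
    top = np.argsort(vals)[::-1][:dim]               # indices of the largest eigenvalues
    scale = np.sqrt(np.clip(vals[top], 0.0, None))   # guard against negative eigenvalues
    return vecs[:, top] * scale

def resample_embeddings(emb, n_nodes, rng):
    """Assumed generator: draw rows from the empirical embedding distribution."""
    rows = rng.integers(0, emb.shape[0], size=n_nodes)
    return emb[rows]

def generate_network(emb, rng):
    """Sample an undirected, loop-free graph with edge probabilities emb @ emb.T."""
    probs = np.clip(emb @ emb.T, 0.0, 1.0)
    n = probs.shape[0]
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # independent edges above the diagonal
    return (upper | upper.T).astype(int)

def synthesize(adj, dim, rng, n_nodes=None):
    """Embed the observed network, resample embeddings, and draw a synthetic network."""
    emb = fit_embeddings(adj, dim)
    n_nodes = adj.shape[0] if n_nodes is None else n_nodes
    return generate_network(resample_embeddings(emb, n_nodes, rng), rng)
```

Because the generator in this sketch only resamples rows of the fitted embedding matrix, a synthetic network of any size can be drawn by passing `n_nodes`, for example `synthesize(adj, dim=2, rng=np.random.default_rng(0), n_nodes=500)`. A model with explicit node-level degree parameters, as many latent space network models have, would replace the inner product in `generate_network`.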