Theory Discovery in Social Networks: Automating ERGM Specification with Large Language Models

📅 2026-03-04
🤖 AI Summary
This study addresses a bottleneck in the empirical testing of qualitative social theories: translating theoretical constructs into estimable Exponential Random Graph Model (ERGM) specifications depends on expert knowledge of both network theory and model estimation. The authors propose Forge (Formation-Oriented Reasoning with Guarded ERGMs), a framework that pairs large language models with statistical guardrails. Given an observed network and a natural-language description of its social context, Forge proposes candidate formation mechanisms, validates them against feasibility and stability constraints, and iteratively refines the specification using goodness-of-fit diagnostics. Across 12 benchmark networks spanning schools, organizations, and online communication, Forge converged in 10 cases; among those 10, it achieved the best likelihood-based fit in 9 while meeting adequacy thresholds. The approach reduces the manual effort required for ERGM specification.

📝 Abstract
Understanding how social networks form, whether through reciprocity, shared attributes, or triadic closure, is central to computational social science. Exponential Random Graph Models (ERGMs) offer a principled framework for testing such formation theories, but translating qualitative social hypotheses into stable statistical specifications remains a significant barrier, requiring expertise in both network theory and model estimation. We present Forge (Formation-Oriented Reasoning with Guarded ERGMs), a framework that uses large language models to automate this translation. Given a network and an informal description of the social context, Forge proposes candidate formation mechanisms, validates them against feasibility and stability constraints, and iteratively refines specifications using goodness-of-fit diagnostics. Evaluation across twelve benchmark networks spanning schools, organizations, and online communication shows that Forge converges in 10 of 12 cases, and conditional on convergence it achieves the best likelihood-based fit in 9 of 10 while meeting adequacy thresholds. By combining LLM-based proposals with statistical guardrails, Forge reduces the manual effort required for ERGM specification.
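
The propose, validate, and refine cycle described in the abstract can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration rather than Forge's actual implementation: the function names (validate, refine_loop, toy_propose, toy_fit), the Fit record, the use of BIC as the likelihood-based criterion, and the two guardrail rules (dropping the directed-only term mutual on undirected networks, and swapping the degeneracy-prone triangle term for gwesp). The proposer and fitter are stubs standing in for an LLM call and a real ERGM estimator.

```python
from dataclasses import dataclass

# Hypothetical guardrail tables; term labels follow common ERGM naming.
DIRECTED_ONLY = {"mutual"}            # reciprocity is undefined on undirected graphs
STABLE_SWAP = {"triangle": "gwesp"}   # replace a degeneracy-prone term


@dataclass
class Fit:
    terms: tuple          # specification that was fitted
    converged: bool       # did estimation converge?
    bic: float            # likelihood-based criterion, lower is better
    gof_failures: tuple   # names of goodness-of-fit diagnostics that failed


def validate(terms, directed):
    """Apply feasibility and stability rules to a proposed term list."""
    cleaned = []
    for term in terms:
        if term in DIRECTED_ONLY and not directed:
            continue                          # infeasible for this network
        term = STABLE_SWAP.get(term, term)    # stability substitution
        if term not in cleaned:
            cleaned.append(term)
    if "edges" not in cleaned:
        cleaned.insert(0, "edges")            # always keep a density baseline
    return tuple(cleaned)


def refine_loop(propose, fit, directed, max_rounds=5):
    """Propose, validate, fit, and feed diagnostics back to the proposer."""
    best, feedback, seen = None, (), set()
    for _ in range(max_rounds):
        terms = validate(propose(feedback), directed)
        if terms in seen:
            break                             # proposer is repeating itself
        seen.add(terms)
        result = fit(terms)
        if not result.converged:
            feedback = ("nonconvergence",)
            continue
        if best is None or result.bic < best.bic:
            best = result
        if not result.gof_failures:
            break                             # adequate fit, stop refining
        feedback = result.gof_failures
    return best


def toy_propose(feedback):
    """Stub for the LLM: adds a degree term when the degree check fails."""
    terms = ["edges", "triangle"]
    if "degree" in feedback:
        terms.append("gwdegree")
    return terms


def toy_fit(terms):
    """Stub for the estimator: invented scores, not real ERGM output."""
    failures = () if "gwdegree" in terms else ("degree",)
    return Fit(terms, True, 100.0 - 10.0 * len(terms), failures)


best = refine_loop(toy_propose, toy_fit, directed=False)
print(best.terms)
```

In practice toy_fit would be replaced by a call into an ERGM estimator and toy_propose by a prompt to a language model; the loop structure is the only part the sketch is meant to convey.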

Problem

Research questions and friction points this paper is trying to address.

Theory Discovery
Social Networks
Exponential Random Graph Models
ERGM Specification
Statistical Modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Exponential Random Graph Models
Large Language Models
Automated Model Specification
Social Network Formation
Statistical Guardrails