🤖 AI Summary
This study addresses the limited empirical grounding and creativity of current large language model (LLM) research agents in social science, which rely heavily on literature retrieval and synthesis. To overcome these limitations, the authors propose a memory-augmented social simulation paradigm that constructs a high-fidelity, research-oriented simulation environment. The framework integrates dynamic goal-directed path planning under multilevel social norm constraints, cold-start initialization of agent memory from multidisciplinary behavioral data, and a structured forgetting mechanism informed by the Ebbinghaus forgetting curve. Experiments report a 6.81% improvement in overall generation quality over foundation LLMs and a 17.19% gain in insightfulness over strong baselines, strengthening the empirical foundation and creative capacity of LLM-generated social science research.
📝 Abstract
Deep Research agents powered by Large Language Models (LLMs) have exhibited extraordinary potential in automated paper writing tasks. However, existing systems rely heavily on literature retrieval and synthesis over the internet and local knowledge bases, which often yields social science research that lacks insight and creativity. To address this issue, we propose "Memory-Augmented Social Simulation (MASS)", an innovative paradigm that leverages highly realistic, research-oriented social simulations to enhance the creativity and empirical grounding of LLM-generated research. Specifically, MASS integrates three core components: dynamic goal-path planning with multi-level social norm constraints to guide the simulation, a multi-disciplinary behavior dataset for agent memory cold-start, and a structured forgetting mechanism inspired by the Ebbinghaus curve. Together, these ensure simulation authenticity and provide a robust empirical foundation for generating innovative scholarly papers. Experimental results demonstrate the effectiveness of our method, showing a 6.81% improvement in overall generation quality over foundation LLMs and a 17.19% gain in Insight over strong baselines.
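To make the forgetting component more concrete, the sketch below shows one common way an Ebbinghaus-style curve can be applied to an agent's memory store. It is an illustration only, not the MASS implementation: the abstract does not specify the decay formula, so the exponential form R = exp(-t / S), the `MemoryItem` class, the `stability` parameter, and the pruning `threshold` are all assumptions introduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryItem:
    """Hypothetical memory record; field names are illustrative only."""
    text: str
    created_at: float  # time the memory was formed, in simulated hours
    stability: float   # assumed per-item strength S, in hours


def retention(item: MemoryItem, now: float) -> float:
    """Ebbinghaus-style retention R = exp(-t / S) for elapsed time t."""
    elapsed = max(0.0, now - item.created_at)
    return math.exp(-elapsed / item.stability)


def prune(memories, now, threshold=0.2):
    """Keep only memories whose retention is at or above the threshold."""
    return [m for m in memories if retention(m, now) >= threshold]


memories = [
    MemoryItem("met a neighbour at the market", created_at=0.0, stability=5.0),
    MemoryItem("signed a rental contract", created_at=0.0, stability=100.0),
]

# After 24 simulated hours the low-stability memory has decayed below the
# threshold, while the high-stability one is still retained.
kept = prune(memories, now=24.0)
```

In this sketch, a larger `stability` value stands in for whatever makes a memory durable (importance, rehearsal, emotional weight); how MASS structures that choice is not stated in the abstract.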