German General Personas: A Survey-Derived Persona Prompt Collection for Population-Aligned LLM Studies

📅 2025-11-19
🤖 AI Summary
In computational social science, LLM-based role prompting typically relies on manually crafted personas. Reproducible, nationally representative persona resources grounded in survey data are scarce, which limits simulation fidelity and demographic alignment. To address this, we introduce GGP (German General Personas), a persona prompt collection derived from ALLBUS, the nationally representative German General Social Survey. GGP maps survey attributes onto persona descriptions that can be plugged into prompts for different LLMs and tasks. In the evaluation, GGP-guided LLMs reproduce observed survey response distributions more closely than state-of-the-art classifiers, particularly when data is scarce. The paper also analyzes how representativeness and attribute selection within persona prompts affect alignment with population responses.

📝 Abstract
The use of Large Language Models (LLMs) for simulating human perspectives via persona prompting is gaining traction in computational social science. However, well-curated, empirically grounded persona collections remain scarce, limiting the accuracy and representativeness of such simulations. Here we introduce the German General Personas (GGP) collection, a comprehensive and representative persona prompt collection built from the German General Social Survey (ALLBUS). The GGP and its persona prompts are designed to be easily plugged into prompts for all types of LLMs and tasks, steering models to generate responses aligned with the underlying German population. We evaluate GGP by prompting various LLMs to simulate survey response distributions across diverse topics, demonstrating that GGP-guided LLMs outperform state-of-the-art classifiers, particularly under data scarcity. Furthermore, we analyze how the representativity and attribute selection within persona prompts affect alignment with population responses. Our findings suggest that GGP provides a potentially valuable resource for research on LLM-based social simulations that enables more systematic explorations of population-aligned persona prompting in NLP and social science research.
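
To make the idea of plugging a survey-derived persona into a prompt concrete, here is a minimal Python sketch. It is not the authors' implementation: the attribute names, the sentence templates, and the helper functions `build_persona_prompt` and `build_task_prompt` are all hypothetical, and real ALLBUS variable names and codings differ.

```python
from typing import Mapping, Sequence

# Hypothetical attribute names and wording; not taken from ALLBUS or the GGP release.
TEMPLATES = {
    "age": "You are {value} years old.",
    "gender": "Your gender is {value}.",
    "education": "Your highest educational qualification is {value}.",
    "region": "You live in {value}.",
}


def build_persona_prompt(respondent: Mapping[str, object],
                         attributes: Sequence[str]) -> str:
    """Render the selected attributes of one respondent as a persona description."""
    sentences = []
    for name in attributes:
        value = respondent.get(name)
        if value is None:
            # Skip missing answers instead of inventing a value.
            continue
        sentences.append(TEMPLATES[name].format(value=value))
    return " ".join(sentences)


def build_task_prompt(persona: str, question: str,
                      options: Sequence[str]) -> str:
    """Combine a persona description with a closed-ended survey question."""
    numbered = "\n".join(f"{i}. {opt}" for i, opt in enumerate(options, start=1))
    return (
        f"{persona}\n\n"
        f"Question: {question}\n"
        f"{numbered}\n"
        "Answer with the number of one option."
    )


respondent = {"age": 42, "gender": "female", "education": None, "region": "Bavaria"}
persona = build_persona_prompt(respondent, ["age", "gender", "education", "region"])
prompt = build_task_prompt(
    persona,
    "How interested are you in politics?",
    ["Very", "Somewhat", "Not at all"],
)
```

Changing the `attributes` list per run is one simple way to explore how attribute selection affects alignment, which is the kind of analysis the abstract describes.
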
Problem

Research questions and friction points this paper is trying to address.

Addresses scarcity of empirically grounded persona collections for LLM simulations
Introduces German General Personas for population-aligned LLM studies
Evaluates persona prompts to improve representativeness in social simulations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Collection built from German General Social Survey
Persona prompts pluggable into various LLM tasks
Evaluation shows GGP-guided LLMs outperforming state-of-the-art classifiers, particularly under data scarcity
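
Comparing a simulated response distribution with the observed survey distribution can be sketched as follows. This summary does not state which alignment metric the paper uses, so the Jensen-Shannon divergence here is an assumption chosen for illustration, and the helper names and sample answers are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def distribution(answers: Sequence[str], options: Sequence[str]) -> list[float]:
    """Relative frequency of each answer option, in the order given by `options`."""
    counts = Counter(answers)
    total = sum(counts[o] for o in options)
    if total == 0:
        raise ValueError("no valid answers")
    return [counts[o] / total for o in options]


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x: Sequence[float], y: Sequence[float]) -> float:
        # Terms with x_i == 0 contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


options = ["agree", "neutral", "disagree"]
# Illustrative answers only, not data from the paper.
observed = distribution(["agree", "agree", "neutral", "disagree"], options)
simulated = distribution(["agree", "neutral", "neutral", "disagree"], options)
score = js_divergence(observed, simulated)  # lower means closer alignment
```
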