Towards "Differential AI Psychology" and in-context Value-driven Statement Alignment with Moral Foundations Theory

📅 2024-08-21
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the value alignment of generative language models (GLMs) for cross-domain social science research, assessing how accurately and consistently they represent diverse political ideologies on Moral Foundations Theory (MFT) questionnaires. Method: The authors introduce "Differential AI Psychology", a framework that adapts text-to-text models to political personas in context, administers the MFT questionnaire repeatedly, and statistically analyzes the resulting synthetic population of persona and model combinations. Contribution/Results: The evaluation shows that the adapted models deviate substantially from the ideology assessments captured in human survey data; model-persona combinations exhibit systematic bias, high intra-group variance, and low cross-alignment. The work quantifies this credibility gap in AI-based political ideology modeling and argues that measurable improvements in in-context optimization or parameter manipulation are needed before language models can generate politically nuanced content. As a step in that direction, the authors propose a testable framework for generating agents from moral value statements, offering an empirically grounded methodology for AI-augmented social science research.
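
The summary describes a survey loop: condition a text-to-text model on a political persona, ask each questionnaire item repeatedly, and collect the ratings into a synthetic population. A minimal sketch of such a loop follows, assuming the Hugging Face transformers pipeline; the model name, persona wordings, and MFQ-style items are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch (illustrative, not the authors' implementation):
# survey a persona-conditioned text-to-text model on MFQ-style items.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")  # assumed model

PERSONAS = ["a progressive voter", "a conservative voter", "a libertarian voter"]

# MFQ-style relevance items, rated 0 (not at all relevant) to 5 (extremely relevant).
ITEMS = [
    "Whether or not someone suffered emotionally.",
    "Whether or not some people were treated differently than others.",
    "Whether or not someone showed a lack of respect for authority.",
]

def ask(persona: str, item: str) -> str:
    prompt = (
        f"You are {persona}. When you decide whether something is right or wrong, "
        f"how relevant is the following consideration on a scale from 0 to 5? {item} "
        "Answer with a single number."
    )
    return generator(prompt, max_new_tokens=4)[0]["generated_text"].strip()

# Repeating the survey yields a synthetic population of persona-model responses.
synthetic_population = {
    persona: [[ask(persona, item) for item in ITEMS] for _ in range(10)]
    for persona in PERSONAS
}
```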

📝 Abstract
Contemporary research in the social sciences increasingly relies on state-of-the-art statistical language models to annotate or generate content. While these models achieve benchmark-leading performance on common language tasks and show exemplary task-independent emergent abilities, their transfer to novel out-of-domain tasks remains insufficiently explored. The implications of the statistical black-box approach (stochastic parrots) are prominently criticized in the language model research community; their significance for novel generative tasks, however, is not. This work investigates the alignment between personalized language models and survey participants on a Moral Foundations Theory questionnaire. We adapt text-to-text models to different political personas and administer the questionnaire repeatedly to generate a synthetic population of persona and model combinations. Analyzing the intra-group variance and cross-alignment shows significant differences across models and personas. Our findings indicate that the adapted models struggle to represent the survey-captured assessment of political ideologies. Thus, using language models to mimic social interactions requires measurable improvements in in-context optimization or parameter manipulation before they align with psychological and sociological stereotypes. Without quantifiable alignment, generating politically nuanced content remains infeasible. To improve these representations, we propose a testable framework for generating agents from moral value statements for future research.
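
To make the reported analysis concrete, the sketch below computes the intra-group variance of repeated runs for one persona and a cross-alignment score against a human reference profile. The scores are invented placeholders, and cosine similarity stands in for whichever alignment measure the paper actually uses; treat this as a sketch of the idea, not the authors' analysis code.

```python
# Illustrative analysis sketch: intra-group variance and cross-alignment.
import numpy as np

# Per-run foundation scores for one persona (rows: repetitions, columns: the
# five moral foundations). All values are invented placeholders.
persona_runs = np.array([
    [4.1, 3.9, 2.2, 2.5, 2.0],
    [3.8, 4.0, 2.6, 2.1, 2.3],
    [4.3, 3.7, 2.0, 2.8, 1.9],
])

# Human reference profile for the matched ideology group (also illustrative).
human_profile = np.array([4.0, 4.1, 2.3, 2.4, 2.1])

intra_group_variance = persona_runs.var(axis=0)   # response consistency across repetitions
mean_profile = persona_runs.mean(axis=0)

# Cross-alignment as cosine similarity between synthetic and human profiles.
cross_alignment = mean_profile @ human_profile / (
    np.linalg.norm(mean_profile) * np.linalg.norm(human_profile)
)
print(intra_group_variance, cross_alignment)
```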
Problem

Research questions and friction points this paper is trying to address.

Investigates alignment of LLMs with human moral values
Examines inconsistency in model responses across repetitions
Assesses weak correlation between synthetic and human data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Personalized language models for political personas
Synthetic data generation via repeated model surveys
In-context optimization for ideological alignment (see the sketch below)
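
The last item refers to in-context optimization, which the paper calls for rather than prescribes; the following is only a hypothetical illustration of what it could look like: among candidate persona phrasings, keep the one whose synthetic foundation profile aligns best with a human reference profile. The phrasings, scores, and cosine criterion are assumptions made for illustration.

```python
# Hypothetical in-context optimization sketch: pick the best-aligned persona phrasing.
import numpy as np

human_profile = np.array([3.1, 3.0, 3.8, 3.9, 3.7])   # illustrative human foundation means

# Synthetic profiles per candidate phrasing, e.g. obtained with the survey sketch above.
candidate_profiles = {
    "You are a conservative voter.":
        np.array([2.6, 2.8, 3.2, 3.4, 3.0]),
    "You strongly value loyalty, authority, and sanctity when judging right and wrong.":
        np.array([3.0, 2.9, 3.7, 3.8, 3.6]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

best_prompt = max(candidate_profiles, key=lambda p: cosine(candidate_profiles[p], human_profile))
print("Best-aligned persona phrasing:", best_prompt)
```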
🔎 Similar Papers
No similar papers found.