🤖 AI Summary
This study investigates implicit preferences and demographic biases in large language models (LLMs) when they process violent content in morally ambiguous, real-world scenarios. To this end, we adapt the Violent Behavior Vignette Questionnaire (VBVQ), a validated social science instrument, for the first time in LLM evaluation. Using standardized zero-shot and persona-based prompting, we conduct comparative experiments across six mainstream LLMs from diverse geopolitical backgrounds. Results reveal a significant discrepancy between surface-level outputs and latent violent inclinations; moreover, model-generated violent responses vary systematically with the race, age, and geographic identity specified in the prompt, in patterns that contradict empirical consensus in criminology and sociology and expose deep-seated structural biases. Our work contributes a methodologically grounded framework for the ethical assessment of LLMs and a reproducible pipeline for bias detection in generative AI systems.
📝 Abstract
Large language models (LLMs) are increasingly proposed for detecting and responding to violent content online, yet their ability to reason about morally ambiguous, real-world scenarios remains underexamined. We present the first study to evaluate LLMs using a validated social science instrument designed to measure human responses to everyday conflict, namely the Violent Behavior Vignette Questionnaire (VBVQ). To assess potential bias, we introduce persona-based prompting that varies race, age, and geographic identity within the United States. Six LLMs developed in different geopolitical and organizational contexts are evaluated under a unified zero-shot setting. Our study yields two key findings: (1) LLMs' surface-level text generation often diverges from their internal preference for violent responses; (2) their violent tendencies vary across demographics, frequently contradicting established findings in criminology, social science, and psychology.
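To make the persona-based prompting concrete, below is a minimal sketch of how such prompts could be composed. The vignette text, persona template, and demographic values are illustrative assumptions for exposition only; the paper's actual VBVQ items, wording, and demographic categories are not reproduced here.

```python
# Illustrative sketch of persona-conditioned zero-shot prompting over a
# VBVQ-style vignette. The vignette and persona wording are hypothetical
# placeholders, not the instrument's actual items.
from itertools import product

# Hypothetical stand-in for one VBVQ vignette.
VIGNETTE = (
    "You are waiting in line when another person cuts in front of you "
    "and insults you when you object. What do you do?"
)

# Example demographic attributes; the study varies race, age, and
# geographic identity within the United States.
RACES = ["White", "Black", "Hispanic", "Asian"]
AGES = ["19-year-old", "45-year-old", "70-year-old"]
REGIONS = ["the rural South", "a Midwestern suburb", "a large Northeastern city"]

def build_prompt(race: str, age: str, region: str, vignette: str) -> str:
    """Compose a single persona-conditioned zero-shot prompt."""
    persona = f"You are a {age} {race} person living in {region} in the United States."
    return f"{persona}\n\n{vignette}\n\nAnswer in one short paragraph."

if __name__ == "__main__":
    # One prompt per demographic combination; in the study's setup, each
    # would be sent to all six models under identical decoding settings.
    for race, age, region in product(RACES, AGES, REGIONS):
        print(build_prompt(race, age, region, VIGNETTE))
        print("-" * 40)
```

Holding the vignette and decoding settings fixed while varying only the persona attributes is what lets response differences be attributed to the specified identity rather than to the scenario itself.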