🤖 AI Summary
This study investigates large language models’ (LLMs) adaptability to users’ sociodemographic attributes—specifically age, occupation, and education level—and presents the first systematic comparison of value expression consistency between explicit user profile prompting and implicit multi-turn dialogue history. We propose a value-probing evaluation framework grounded in multi-agent synthetic data generation and the Schwartz Value Survey (VSM 2013), integrating cross-modal behavioral consistency analysis with reasoning capability correlation testing. Results show that while most LLMs adjust value expressions in response to age and education level, consistency between explicit and implicit modalities remains generally weak. Notably, models with stronger reasoning capabilities exhibit more robust sociodemographic adaptation. This work establishes a novel methodology and empirical benchmark for assessing LLM personalization and fairness, advancing both value-aware model evaluation and equitable human-AI interaction design.
📝 Abstract
Effective engagement by large language models (LLMs) requires adapting responses to users' sociodemographic characteristics, such as age, occupation, and education level. While many real-world applications leverage dialogue history for contextualization, existing evaluations of LLMs' behavioral adaptation often focus on single-turn prompts. In this paper, we propose a framework to evaluate LLM adaptation when attributes are introduced either (1) explicitly via user profiles in the prompt or (2) implicitly through multi-turn dialogue history. We assess the consistency of model behavior across these modalities. Using a multi-agent pipeline, we construct a synthetic dataset pairing dialogue histories with distinct user profiles and employ questions from the Value Survey Module (VSM 2013) (Hofstede and Hofstede, 2016) to probe value expression. Our findings indicate that most models adjust their expressed values in response to demographic changes, particularly in age and education level, but consistency varies. Models with stronger reasoning capabilities demonstrate greater alignment, indicating the importance of reasoning in robust sociodemographic adaptation.