🤖 AI Summary
Large language models (LLMs) exhibit cross-cultural value misalignment due to Western-centric training data, limiting their ability to capture cultural nuances and adapt dynamically to novel sociocultural contexts. To address this, we propose a culture-aligned framework that (1) models World Values Survey (WVS) data as individualized value summaries conditioned on demographic attributes for fine-grained cultural retrieval; and (2) integrates semantic re-ranking with retrieval-augmented generation (RAG) to dynamically inject culturally grounded knowledge into in-context learning. We construct a multi-regional cultural evaluation benchmark. Experiments show consistent improvements over role-assignment and few-shot baselines. In an ablation study where only the value summaries are provided, the framework still outperforms these baselines. Our work offers a scalable, interpretable, and empirically grounded approach to improving LLMs’ cultural robustness and value coherence across diverse populations.
📝 Abstract
Aligning Large Language Models (LLMs) with cultural values is a critical challenge because they tend to absorb Western-centric biases from their training data, which leads to misrepresentation and fairness issues in cross-cultural contexts. Recent approaches, such as role assignment and few-shot learning, often struggle to achieve reliable cultural alignment because they rely heavily on pre-trained knowledge, lack scalability, and fail to capture nuanced cultural values. To address these issues, we propose ValuesRAG, a novel and effective framework that applies Retrieval-Augmented Generation (RAG) with in-context learning to integrate cultural and demographic knowledge dynamically during text generation. Leveraging the World Values Survey (WVS) dataset, ValuesRAG first generates a summary of values for each individual. We then curate several representative regional datasets to serve as test sets, retrieve relevant value summaries based on demographic features, and apply a reranking step to select the top-k most relevant summaries. ValuesRAG consistently outperforms baseline methods, both in the main experiment and in the ablation study where only the values summary was provided, highlighting ValuesRAG's potential to foster culturally aligned AI systems and enhance the inclusivity of AI-driven applications.
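The retrieve-then-rerank flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names (`retrieve`, `rerank`, `build_prompt`), the record fields, and the toy corpus are hypothetical, a bag-of-words cosine similarity stands in for whatever embedding model and reranker the paper uses, and the choice to rerank against the survey question is a guess, since the abstract does not state the reranking criterion.

```python
import math
from collections import Counter

def bow(text):
    """Lowercased bag-of-words vector; a toy stand-in for a real text embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query_demographics, corpus, n):
    """Stage 1 (assumed): keep the n records whose demographics best match the query."""
    q = bow(query_demographics)
    ranked = sorted(corpus, key=lambda r: cosine(q, bow(r["demographics"])), reverse=True)
    return ranked[:n]

def rerank(question, candidates, k):
    """Stage 2 (assumed): keep the k candidates whose value summary best matches the question."""
    q = bow(question)
    ranked = sorted(candidates, key=lambda r: cosine(q, bow(r["summary"])), reverse=True)
    return ranked[:k]

def build_prompt(question, selected):
    """Place the selected value summaries in the context ahead of the question."""
    context = "\n".join(f"- {r['summary']}" for r in selected)
    return f"Value summaries of similar individuals:\n{context}\n\nQuestion: {question}\nAnswer:"

# Hypothetical toy records; these are NOT taken from the World Values Survey.
corpus = [
    {"id": 0, "demographics": "female 34 urban japan university",
     "summary": "family is very important and work is a duty"},
    {"id": 1, "demographics": "male 60 rural japan primary",
     "summary": "tradition and religion guide daily decisions"},
    {"id": 2, "demographics": "female 29 urban brazil university",
     "summary": "personal freedom matters more than family ties"},
]

question = "how important is family in your life"
candidates = retrieve("female 30 urban japan university", corpus, n=2)
selected = rerank(question, candidates, k=1)
prompt = build_prompt(question, selected)
print(prompt)
```

In a real system the prompt would be passed to an LLM, and the two similarity functions would be replaced by a dense retriever and a dedicated reranking model. The two-stage structure is the point of the sketch: a cheap, broad match on demographics followed by a narrower selection of the top-k summaries.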