🤖 AI Summary
This study examines the lack of cultural awareness that large language models (LLMs) show when responding in Danish, which surfaces as anglocentric or inappropriate answers detached from the local sociocultural context. To disentangle linguistic competence from cultural proficiency, the authors conduct the first cultural evaluation study for Danish, a mid-resource language, in which native speakers themselves prompt different models to solve tasks that require cultural awareness. The analysis covers 1,038 interactions from 63 demographically diverse Danish native speakers. Results indicate that the automatically translated data in current use are insufficient both for training and for measuring cultural adaptation; in contrast, training on native-speaker data can more than double response acceptance rates. The study data are released as DaKultur, the first native Danish cultural awareness dataset, providing an empirical basis for culture-aware LLM development beyond English.
📝 Abstract
Large Language Models (LLMs) have seen widespread societal adoption. However, while they are able to interact with users in languages beyond English, they have been shown to lack cultural awareness, providing anglocentric or inappropriate responses for underrepresented language communities. To investigate this gap and disentangle linguistic from cultural proficiency, we conduct the first cultural evaluation study for the mid-resource language of Danish, in which native speakers prompt different models to solve tasks requiring cultural awareness. Our analysis of the resulting 1,038 interactions from 63 demographically diverse participants highlights open challenges to cultural adaptation: in particular, that currently employed automatically translated data are insufficient to train or measure cultural adaptation, and that training on native-speaker data can more than double response acceptance rates. We release our study data as DaKultur, the first native Danish cultural awareness dataset.