🤖 AI Summary
Large language models (LLMs) perform poorly on culture-specific reasoning tasks in low-resource languages (e.g., Yoruba proverb interpretation), hindering their equitable global deployment. To address this, we propose Culturally-Grounded Chain-of-Thought (CG-CoT), a novel prompting strategy that combines dense vector retrieval of localized cultural context with explicit chain-of-thought reasoning to guide culturally grounded inference, validated through both automated metrics and LLM-based evaluations. Experiments on Yoruba proverb interpretation demonstrate substantial gains in culturally aligned accuracy and reasoning depth over traditional prompting methods. Crucially, we uncover a fundamental misalignment between token-level translation metrics (e.g., BLEU) and human-judged cultural relevance, advocating a rethinking of evaluation approaches for low-resource NLP. To our knowledge, this is the first work to systematically unify cultural retrieval and structured reasoning to enhance LLMs' cultural intelligence.
📝 Abstract
Large Language Models (LLMs) struggle with culture-specific reasoning tasks, particularly in low-resource languages, hindering their global applicability. Addressing this gap is crucial for equitable AI deployment. We introduce Culturally-Grounded Chain-of-Thought (CG-CoT), a novel prompting strategy that combines dense vector retrieval of cultural context with explicit reasoning sequences. Our extensive experiments on Yoruba proverb interpretation demonstrate that CG-CoT achieves significantly higher culturally aligned accuracy and reasoning depth than traditional prompting methods, validated through both automated metrics and LLM-based evaluations. Notably, we uncover stark disparities between token-level translation metrics like BLEU and human-judged cultural relevance, suggesting a rethinking of evaluation approaches for low-resource NLP.
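The core mechanism described above, retrieving culturally relevant context via dense vector similarity and prepending it to an explicit step-by-step reasoning prompt, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy encoder, the mini-corpus of cultural notes, and the prompt wording are all assumptions introduced here for clarity.

```python
# Sketch of a CG-CoT-style prompt pipeline: dense retrieval of cultural
# context followed by an explicit chain-of-thought scaffold.
# NOTE: the encoder, corpus, and prompt text below are illustrative
# stand-ins, not the authors' actual components.
import numpy as np


def embed(text: str, dim: int = 8) -> np.ndarray:
    """Toy deterministic pseudo-embedding standing in for a real dense encoder
    (e.g., a multilingual sentence encoder in the actual system)."""
    rng = np.random.default_rng(sum(text.encode()))  # stable seed from bytes
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)


# Hypothetical mini-corpus of cultural context notes.
corpus = [
    "Yoruba proverbs often use animals to comment on human character.",
    "Elders traditionally deliver proverbs to instruct the young.",
    "Many proverbs reference farming and the rhythm of the seasons.",
]
corpus_vecs = np.stack([embed(doc) for doc in corpus])


def retrieve(query: str, k: int = 2) -> list[str]:
    """Dense retrieval: rank corpus notes by cosine similarity to the query
    (vectors are unit-normalized, so the dot product is cosine similarity)."""
    sims = corpus_vecs @ embed(query)
    return [corpus[i] for i in np.argsort(sims)[::-1][:k]]


def build_cg_cot_prompt(proverb: str) -> str:
    """Combine retrieved cultural context with an explicit reasoning scaffold."""
    context = "\n".join(f"- {c}" for c in retrieve(proverb))
    return (
        f"Cultural context:\n{context}\n\n"
        f"Proverb: {proverb}\n"
        "Think step by step: (1) literal meaning, "
        "(2) cultural references, (3) intended lesson.\n"
        "Answer:"
    )


prompt = build_cg_cot_prompt("Bi a ba n gun igi, a o bojuwo eyin.")
```

The resulting `prompt` string would be sent to the LLM; the retrieved notes ground the model's reasoning steps in cultural knowledge it may lack for low-resource languages.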