🤖 AI Summary
This study addresses the limited representation of Arabic cultural knowledge in large language models (LLMs). To mitigate this, we propose a multi-source cultural data fusion strategy for dataset augmentation, constructing a high-quality, culturally grounded multiple-choice dataset comprising over 22,000 items. Leveraging the Fanar-1-9B-Instruct base model, we apply parameter-efficient fine-tuning via LoRA. Crucially, we incorporate the PalmX and Palm datasets to enhance cross-domain generalization. Experimental results demonstrate that our approach achieves 84.1% accuracy on the PalmX development set and ranks fifth on the blind test set with 70.50% accuracy, marking a substantial improvement in Arabic cultural understanding and reasoning. The work validates the efficacy and scalability of combining culturally enriched data augmentation with lightweight fine-tuning for optimizing cultural knowledge representation in LLMs.
📝 Abstract
In this paper, we report our participation in the PalmX cultural evaluation shared task. Our system, CultranAI, focused on data augmentation and LoRA fine-tuning of large language models (LLMs) for Arabic cultural knowledge representation. We benchmarked several LLMs to identify the best-performing model for the task. In addition to utilizing the PalmX dataset, we augmented it by incorporating the Palm dataset and curated a new dataset of over 22K culturally grounded multiple-choice questions (MCQs). Our experiments showed that the Fanar-1-9B-Instruct model achieved the highest performance. We fine-tuned this model on the combined augmented dataset of 22K+ MCQs. On the blind test set, our submitted system ranked 5th with an accuracy of 70.50%, while on the PalmX development set, it achieved an accuracy of 84.1%.
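The LoRA technique central to this work replaces full-weight updates with a trainable low-rank correction added to each frozen pretrained weight matrix. The sketch below is a minimal NumPy illustration of that idea, not the authors' actual Fanar-1-9B-Instruct training code; all dimensions and hyperparameters (`r`, `alpha`) are illustrative assumptions.

```python
import numpy as np

class LoRALinear:
    """Frozen pretrained weight W plus a trainable low-rank update
    scaled by alpha / r, i.e. y = x W^T + (x A^T) B^T * (alpha / r).
    Illustrative sketch only -- not the paper's fine-tuning code."""

    def __init__(self, in_dim, out_dim, r=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen pretrained weight (never updated during fine-tuning).
        self.W = rng.normal(size=(out_dim, in_dim))
        # Trainable low-rank factors: A is small-random, B is zero-initialized
        # so the adapter's contribution starts at exactly zero.
        self.A = rng.normal(scale=0.01, size=(r, in_dim))
        self.B = np.zeros((out_dim, r))
        self.scale = alpha / r

    def forward(self, x):
        return x @ self.W.T + (x @ self.A.T) @ self.B.T * self.scale

layer = LoRALinear(in_dim=4, out_dim=3)
x = np.ones((2, 4))
# With B zero-initialized, the adapted layer reproduces the frozen layer.
assert np.allclose(layer.forward(x), x @ layer.W.T)
```

During fine-tuning only `A` and `B` receive gradients, so the number of trainable parameters is `r * (in_dim + out_dim)` per adapted matrix rather than `in_dim * out_dim`, which is what makes the approach lightweight for a 9B-parameter model.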