Cross-lingual Self-Consistency for Multilingual Reasoning with Language Models

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited multilingual reasoning capabilities of large language models on low-resource and unseen languages, as well as the absence of effective methods that operate without labeled or parallel data. The authors propose an unsupervised reinforcement learning framework that improves reasoning by enforcing cross-lingual self-consistency: the model should produce the same final answer to semantically equivalent questions posed in different languages. The method relies on neither gold labels nor parallel multilingual data. It improves generalization to both unseen languages and out-of-distribution tasks, with average gains of up to 21.7% across the ten languages in the MGSM benchmark, a mean improvement of 18.2% on MGSM languages unseen during training, and gains of up to 6.2% on three out-of-distribution benchmarks.

📝 Abstract

Despite expanding their multilingual coverage, the advanced reasoning capabilities of LLMs remain largely confined to a few high-resource languages like English. To address this, we propose an unsupervised Reinforcement Learning (RL) approach to enhance multilingual reasoning by enforcing cross-lingual self-consistency: the principle that a model should produce the same final answer for equivalent problems in different languages. Existing methods are limited by the scarcity of multilingual reasoning data and show weak generalization to unseen languages. Our approach requires neither gold answers nor parallel data, and it achieves average gains of up to 21.7% on MGSM across 10 languages. In addition, our method demonstrates strong generalization, with an 18.2% mean improvement on MGSM languages unseen during training, and up to 6.2% gain on 3 out-of-distribution benchmarks. These results show the potential of consistency-based methods to improve the multilingual capabilities of LLMs without requiring supervised data.
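
As a rough illustration of the consistency principle described above, the sketch below scores one sampled answer per language by whether it agrees with the most common answer across languages. This is an assumption made for illustration only: the abstract does not specify the reward used in the paper, and the function names (normalize_answer, consistency_rewards), the majority-vote rule, and the 0/1 reward values are all hypothetical.

```python
from collections import Counter


def normalize_answer(answer: str) -> str:
    """Hypothetical normalizer: trim whitespace and drop thousands separators."""
    return answer.strip().replace(",", "")


def consistency_rewards(answers: dict[str, str]) -> dict[str, float]:
    """Give reward 1.0 to each language whose final answer matches the
    cross-lingual majority answer, and 0.0 otherwise.

    No gold label is used: the only signal is agreement between languages.
    This majority-vote rule is an illustrative assumption, not the paper's
    confirmed reward.
    """
    normalized = {lang: normalize_answer(a) for lang, a in answers.items()}
    # Most frequent normalized answer across all languages.
    majority, _ = Counter(normalized.values()).most_common(1)[0]
    return {lang: 1.0 if a == majority else 0.0 for lang, a in normalized.items()}


# Final answers to the same problem posed in four languages (made-up values).
answers = {"en": "1,200", "de": "1200", "sw": "1200", "bn": "1250"}
rewards = consistency_rewards(answers)
```

In an RL loop, per-language scores of this kind could serve as the reward for the corresponding sampled responses. A real setup would also need a policy for ties and for responses with no parseable final answer, neither of which is handled here.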

Problem

Research questions and friction points this paper is trying to address.

multilingual reasoning

cross-lingual consistency

language models

low-resource languages

reasoning generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-lingual self-consistency

multilingual reasoning

unsupervised reinforcement learning