Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

📅 2026-06-04

🤖 AI Summary
This work addresses the poor generalization and tendency to overfit of large language models in extremely low-resource machine translation. The authors propose a reinforcement learning approach to in-context language learning, applying outcome-based RL with chrF, a surface-level character n-gram metric, as the reward signal. Training pushes the model to extract and apply linguistic knowledge from the provided context, so that it acquires a transferable meta-skill rather than memorizing language-specific patterns. In the reported experiments, the RL-trained models produce better translations on entirely unseen languages than conventional in-context learning or supervised fine-tuning, which suggests that outcome-based RL can improve linguistic generalization.
📝 Abstract
Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To translate extremely low-resource languages at scale, we argue that LLMs must acquire the meta-skill of utilizing in-context linguistic knowledge rather than memorizing specific languages. In this paper, we propose a reinforcement learning (RL) approach to unseen language translation given rich linguistic context, using a surface-level translation metric (chrF) as the reward. Empirically, despite the lightweight reward, our RL-trained models effectively extract and apply relevant linguistic information from the provided context, leading to better translations on completely unseen languages than in-context learning or supervised fine-tuning. Our analyses suggest that outcome-based RL can extend beyond conventional reasoning tasks like math and coding to serve as a recipe for language learning from context.
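To make the reward concrete, here is a minimal sketch of a chrF-style score used as a scalar reward for one sampled translation. It is a simplified illustration and not the authors' implementation: the function names `char_ngrams` and `chrf_reward`, the choice to strip spaces, the n-gram range of 1 to 6, and beta = 2 are all assumptions made for this example, and the abstract does not say which chrF variant or RL algorithm was used.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing spaces (a simplification)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_reward(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: F-beta over averaged character n-gram precision/recall.

    Returns a value in [0, 1]. Hypothetical helper for illustration only.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # Skip orders for which either side is too short to have n-grams.
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    for sample in ["the cat sat on the mat", "a cat sits on a mat", "zzz"]:
        print(f"{sample!r}: reward = {chrf_reward(sample, reference):.3f}")
```

In an outcome-based setup, a score like this would be computed once per sampled translation against the reference and passed to the policy update as the only training signal, with no supervision on intermediate reasoning.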
Problem

Research questions and friction points this paper is trying to address.

unseen language translation
low-resource languages
in-context learning
zero-shot transfer
linguistic context
Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement learning
unseen language translation
in-context learning
low-resource languages
meta-skill acquisition