CHAI for LLMs: Improving Code-Mixed Translation in Large Language Models through Reinforcement Learning with AI Feedback

📅 2024-11-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multilingual large language models (LLMs) perform poorly on code-mixed language understanding and translation. To address this, we propose CHAI, a framework that uses the LLM itself as an annotator to construct high-quality preference data, coupled with an end-to-end RLAIF (Reinforcement Learning from AI Feedback) procedure for closed-loop optimization of code-mixed translation capability. Our key contributions are: (1) the first application of LLM-as-Judge to evaluate code-mixed translation quality and generate fine-grained preference labels; (2) the construction of the first large-scale, high-quality code-mixed translation preference dataset; and (3) substantial improvements in cross-lingual generalization via multi-stage instruction tuning and RLAIF-based alignment. Experiments demonstrate that CHAI-enhanced models achieve a 25.66% higher win rate, as judged by human annotators, over open-source state-of-the-art baselines on real-world code-mixed translation tasks, advancing cross-lingual inclusivity in multilingual LLMs.

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities across various NLP tasks but struggle with code-mixed (or code-switched) language understanding. For example, prior work benchmarking the performance of multilingual LLMs on code-mixed translation tasks has demonstrated that current state-of-the-art multilingual LLMs are ineffective in dealing with code-mixed languages. However, the question of how to improve the capability of multilingual LLMs to handle code-mixed language has not received any attention to date. In this paper, we tackle this research gap by proposing CHAI, a novel general-purpose framework for improving the ability of multilingual LLMs to handle code-mixed languages. CHAI relies on three novel contributions made in this paper. First, we explore the ability of LLMs to provide accurate annotations for code-mixed translation tasks. Second, we leverage this ability of LLMs as annotators to generate preference data for code-mixed translation tasks at scale, which are then used within a reinforcement learning from AI feedback (RLAIF) procedure to improve LLMs' capability on code-mixed tasks. Third, we conduct a rigorous experimental evaluation across various real-world datasets and settings. Our analysis shows that CHAI-powered LLMs outperform state-of-the-art open-source LLMs by 25.66% (in terms of win rate adjudicated by human annotators) in code-mixed translation tasks. This work represents a first step towards developing more inclusive code-mixed LLMs.
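The preference-data step the abstract describes can be sketched in a few lines: a judge LLM compares two candidate code-mixed translations of the same source sentence and the verdict is recorded as a chosen/rejected pair suitable for RLAIF-style training. This is a minimal illustration, not the paper's implementation; the `judge` heuristic below is a placeholder standing in for a real LLM call, and all names (`PreferencePair`, `build_preference_pair`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PreferencePair:
    """One training example for preference-based alignment (e.g. a reward
    model or DPO-style objective): the judged-better translation is
    `chosen`, the other is `rejected`."""
    source: str
    chosen: str
    rejected: str

def judge(source: str, cand_a: str, cand_b: str) -> str:
    """Placeholder for the LLM-as-Judge call. A real implementation would
    prompt a judge LLM with the source and both candidates and parse its
    'A'/'B' verdict; here a trivial length heuristic keeps the sketch
    runnable end to end."""
    return "A" if len(cand_a) >= len(cand_b) else "B"

def build_preference_pair(source: str, cand_a: str, cand_b: str,
                          judge_fn=judge) -> PreferencePair:
    """Turn a judge verdict into a chosen/rejected preference pair."""
    verdict = judge_fn(source, cand_a, cand_b)
    chosen, rejected = (cand_a, cand_b) if verdict == "A" else (cand_b, cand_a)
    return PreferencePair(source, chosen, rejected)

# Hypothetical Hinglish candidates for one English source sentence.
pair = build_preference_pair(
    "How are you today?",
    "Aaj tum kaise ho?",
    "Kaise ho?",
)
print(pair.chosen)
```

At scale, pairs like these (produced by the AI annotator rather than humans) form the preference dataset that drives the RLAIF optimization loop.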
Problem

Research questions and friction points this paper is trying to address.

Improving code-mixed translation in LLMs
Leveraging AI feedback for reinforcement learning
Enhancing multilingual LLMs' code-mixed language handling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning with AI Feedback
LLMs as annotators for code-mixed tasks
CHAI framework for multilingual LLMs