Predicting the Target Word of Game-playing Conversations using a Low-Rank Dialect Adapter for Decoder Models

📅 2024-08-31
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of language understanding in dialectal game-playing conversations. It proposes LoRDD, a lightweight, dialect-aware fine-tuning framework for Target Word Prediction (TWP): predicting the target word of a word-guessing game from a masked conversation. Methodologically, LoRDD integrates low-rank dialect adapters into decoder-only LLMs (Mistral and Gemma), jointly optimized with task-specific adapters, and trains the dialect adapters with contrastive learning on pseudo-parallel conversations from the MD-3 dataset. Key contributions include: (1) extending low-rank dialect adapters, previously reported only for encoder models, to decoder models; (2) a dialect-aware contrastive learning mechanism leveraging pseudo-parallel conversations; and (3) using TWP, a simplified variant of next-word prediction, as the adaptation task. On Indian English and Nigerian English conversations, LoRDD outperforms four baselines on TWP and narrows the performance gap with American English to 12.0% and 5.8% in word similarity and 25.0% and 4.5% in accuracy, respectively.
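The paper does not publish reference code here, but the two core ideas in the summary can be sketched in PyTorch: a frozen base projection augmented with two low-rank (LoRA-style) branches, one for the task and one for the dialect, and an InfoNCE-style contrastive loss that pulls embeddings of pseudo-parallel conversation pairs together. All class names, ranks, and the temperature value below are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRABranch(nn.Module):
    """Low-rank branch: x -> (alpha/r) * B A x, with B zero-initialized
    so the branch is a no-op before training (standard LoRA practice)."""
    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (x @ self.A.T) @ self.B.T * self.scale

class AdaptedLinear(nn.Module):
    """Frozen base linear layer plus separate task and dialect LoRA
    branches, summed as in a jointly optimized adapter stack."""
    def __init__(self, d_in: int, d_out: int, r: int = 8):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)
        self.base.bias.requires_grad_(False)
        self.task = LoRABranch(d_in, d_out, r)
        self.dialect = LoRABranch(d_in, d_out, r)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.task(x) + self.dialect(x)

def pseudo_parallel_contrastive_loss(h_dialect: torch.Tensor,
                                     h_mainstream: torch.Tensor,
                                     temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE over a batch of pseudo-parallel pairs: row i of each
    tensor embeds the same conversation in two varieties; matching
    rows are positives, all other rows in the batch are negatives."""
    z1 = F.normalize(h_dialect, dim=-1)
    z2 = F.normalize(h_mainstream, dim=-1)
    logits = z1 @ z2.T / temperature
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)
```

Because the `B` matrices start at zero, the adapted layer initially reproduces the frozen base exactly; the task and dialect branches only diverge from it as their respective losses are optimized.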

📝 Abstract
Dialect adapters that improve the performance of LLMs for NLU tasks on certain sociolects/dialects/national varieties ('dialects' for the sake of brevity) have been reported for encoder models. In this paper, we extend the idea of dialect adapters to decoder models in our architecture called LoRDD. Using MD-3, a publicly available dataset of word game-playing conversations between dialectal speakers, our task is Target Word Prediction (TWP) from a masked conversation. LoRDD combines task adapters and dialect adapters where the latter employ contrastive learning on pseudo-parallel conversations from MD-3. Our experiments on Indian English and Nigerian English conversations with two models (Mistral and Gemma) demonstrate that LoRDD outperforms four baselines on TWP. Additionally, it significantly reduces the performance gap with American English, narrowing it to 12% and 5.8% for word similarity, and 25% and 4.5% for accuracy, respectively. The focused contribution of LoRDD is in its promise for dialect adaptation of decoder models using TWP, a simplified version of the commonly used next-word prediction task.
Problem

Research questions and friction points this paper is trying to address.

Dialect Understanding
Gaming Communication
Innovation

Methods, ideas, or system contributions that make the work stand out.

LoRDD
Dialect Adaptation
Word Prediction