Can Uniform Meaning Representation Help GPT-4 Translate from Indigenous Languages?

📅 2025-02-13
🤖 AI Summary
GPT-4 exhibits limited translation performance on extremely low-resource Indigenous languages (e.g., Navajo, Arapaho, Kukama). To address this, we propose Uniform Meaning Representation (UMR) as a semantic bridge that is integrated directly into GPT-4’s prompts, without fine-tuning. Our method leverages UMR’s formal semantic annotations and designs multilingual contrastive prompts to enhance cross-lingual semantic alignment. Experiments across multiple Indigenous language pairs demonstrate statistically significant improvements in BLEU and chrF scores for most test cases. This work points to the value of structured semantic representations in adapting large language models to low-resource settings, offering a lightweight, training-free approach to augmenting language technologies.

📝 Abstract
While ChatGPT and GPT-based models are able to effectively perform many tasks without additional fine-tuning, they struggle with tasks related to extremely low-resource languages and indigenous languages. Uniform Meaning Representation (UMR), a semantic representation designed to capture the meaning of texts in many languages, is well-poised to be leveraged in the development of low-resource language technologies. In this work, we explore the downstream technical utility of UMR for low-resource languages by incorporating it into GPT-4 prompts. Specifically, we examine the ability of GPT-4 to perform translation from three indigenous languages (Navajo, Arapaho, and Kukama), with and without demonstrations, as well as with and without UMR annotations. Ultimately, we find that in the majority of our test cases, integrating UMR into the prompt results in a statistically significant increase in performance, which is a promising indication of future applications of the UMR formalism.
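The four prompt conditions described in the abstract (with and without demonstrations, crossed with and without UMR annotations) can be sketched roughly as follows. This is a minimal illustration, not the authors' prompt template: the instruction wording, the `Example` type, the condition names, and the `build_prompt` and `all_conditions` helpers are all assumptions introduced here, and the strings passed in are expected to be supplied by the reader.

```python
from itertools import product
from typing import Dict, NamedTuple, Optional, Sequence


class Example(NamedTuple):
    """One demonstration: source sentence, its UMR graph, and a reference translation."""
    source: str
    umr: str
    translation: str


def build_prompt(
    language: str,
    source: str,
    umr: Optional[str] = None,
    demos: Sequence[Example] = (),
) -> str:
    """Assemble a translation prompt (hypothetical wording, not from the paper).

    If `umr` is None, no UMR lines are included, for the query or the demos.
    If `demos` is empty, the prompt is zero-shot.
    """
    lines = [f"Translate the following {language} sentence into English."]
    for demo in demos:
        lines.append("")
        lines.append(f"{language}: {demo.source}")
        if umr is not None:
            lines.append(f"UMR: {demo.umr}")
        lines.append(f"English: {demo.translation}")
    lines.append("")
    lines.append(f"{language}: {source}")
    if umr is not None:
        lines.append(f"UMR: {umr}")
    lines.append("English:")
    return "\n".join(lines)


def all_conditions(
    language: str, source: str, umr: str, demos: Sequence[Example]
) -> Dict[str, str]:
    """Build one prompt per condition in the 2x2 design (demos x UMR)."""
    prompts = {}
    for use_demos, use_umr in product((False, True), repeat=2):
        name = ("few-shot" if use_demos else "zero-shot") + ("+umr" if use_umr else "")
        prompts[name] = build_prompt(
            language,
            source,
            umr=umr if use_umr else None,
            demos=demos if use_demos else (),
        )
    return prompts
```

Each resulting prompt would then be sent to the model and the outputs scored against reference translations, so that the UMR and non-UMR conditions can be compared on the same test sentences.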
Problem

Research questions and friction points this paper is trying to address.

Improving GPT-4 translation of indigenous languages
Utilizing Uniform Meaning Representation for low-resource languages
Enhancing language technology with UMR annotations
Innovation

Methods, ideas, or system contributions that make the work stand out.

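Incorporates Uniform Meaning Representation annotations directly into GPT-4 prompts, without fine-tuning
Yields a statistically significant increase in translation performance in the majority of test cases
Evaluates translation from three indigenous languages: Navajo, Arapaho, and Kukama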