Gender Bias in MT for a Genderless Language: New Benchmarks for Basque

📅 2026-03-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the scarcity of gender bias evaluation resources for genderless, low-resource languages by introducing the first targeted benchmarks for Basque. The authors propose two novel datasets—WinoMTeus and FLORES+Gender—extending the WinoMT and FLORES+ benchmarks to capture gender-related translation phenomena involving Basque. Through systematic evaluation of multilingual large language models and both open-source and commercial machine translation systems, the work reveals a consistent male bias in translations, with higher translation quality observed for masculine referents. These findings demonstrate that gender bias persists in cross-lingual transfer, even for languages without grammatical gender. By accounting for Basque's linguistic characteristics and sociocultural context, this research fills a critical gap in the assessment of gender bias for genderless languages.

📝 Abstract
Large language models (LLMs) and machine translation (MT) systems are increasingly used in our daily lives, but their outputs can reproduce gender bias present in the training data. Most resources for evaluating such biases are designed for English and reflect its sociocultural context, which limits their applicability to other languages. This work addresses this gap by introducing two new datasets to evaluate gender bias in translations involving Basque, a low-resource and genderless language. WinoMTeus adapts the WinoMT benchmark to examine how gender-neutral Basque occupations are translated into gendered languages such as Spanish and French. FLORES+Gender, in turn, extends the FLORES+ benchmark to assess whether translation quality varies when translating from gendered languages (Spanish and English) into Basque depending on the gender of the referent. We evaluate several general-purpose LLMs and open and proprietary MT systems. The results reveal a systematic preference for masculine forms and, in some models, a slightly higher quality for masculine referents. Overall, these findings show that gender bias is still deeply rooted in these models, and highlight the need to develop evaluation methods that consider both linguistic features and cultural context.
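To make the WinoMT-style evaluation concrete, the sketch below shows how a per-gender accuracy check might be computed: each test item pairs the gold gender of the referent with the gender actually realized in the MT output, and a gap between masculine and feminine accuracy signals male bias. This is a hypothetical illustration on toy data, not the authors' evaluation code; the field names and values are assumptions.

```python
# Hypothetical sketch of a WinoMT-style gender-accuracy check.
# The (gold, realized) pairs below are toy data, not from WinoMTeus.
from collections import Counter

toy_results = [
    ("female", "male"),    # feminine referent translated as masculine
    ("female", "female"),
    ("male", "male"),
    ("male", "male"),
]

def gender_accuracy(results):
    """Per-gender accuracy: the share of items whose translation
    realizes the gold gender of the referent."""
    correct, total = Counter(), Counter()
    for gold, realized in results:
        total[gold] += 1
        if realized == gold:
            correct[gold] += 1
    return {g: correct[g] / total[g] for g in total}

acc = gender_accuracy(toy_results)
# A positive gap acc["male"] - acc["female"] indicates male bias.
```

On this toy data the check yields perfect accuracy for masculine referents and 50% for feminine ones, the kind of asymmetry the paper reports for real systems.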
Problem

Research questions and friction points this paper is trying to address.

gender bias
machine translation
Basque
evaluation benchmark
genderless language
Amaia Murillo
HiTZ Center - Ixa, University of the Basque Country UPV/EHU
Olatz Perez de Viñaspre
HiTZ Center - Ixa, University of the Basque Country UPV/EHU
Naiara Perez
NLP Researcher at IXA, HiTZ Center, University of the Basque Country
Natural Language Processing