🤖 AI Summary
Large language models (LLMs) frequently exhibit gender misattribution, i.e., assigning incorrect gendered pronouns or titles, across multilingual and multicultural contexts, reflecting and amplifying sociolinguistic biases.
Method: We introduce the first culture-first, multilingual framework for evaluating and mitigating gender misattribution, covering 42 languages and dialects. Through participatory design and close collaboration with local communities, we combine human-in-the-loop annotation, cross-linguistic grammatical analysis, and sociolinguistic expertise to develop a language- and culture-sensitive referential protection mechanism, deployed within meeting transcript summarization.
Contribution/Results: Our approach significantly reduces gender misattribution rates across all 42 languages without compromising summary quality (measured by ROUGE-L and human evaluation). It moves beyond Anglo-centric paradigms, demonstrating that inclusive, cross-lingual AI solutions grounded in cultural specificity and linguistic diversity can scale in practice.
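The summary quality metric named above, ROUGE-L, scores a candidate summary against a reference by the length of their longest common subsequence (LCS) of tokens. As a minimal sketch (assuming simple whitespace tokenization; published evaluations typically use an established ROUGE implementation), the F1 variant can be computed like this:

```python
def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 between two texts, via LCS over whitespace tokens."""
    ref, cand = reference.split(), candidate.split()
    m, n = len(ref), len(cand)
    # Standard dynamic-programming table for LCS length.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if ref[i - 1] == cand[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    precision = lcs / n   # fraction of candidate tokens in the LCS
    recall = lcs / m      # fraction of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

For example, a candidate that reproduces the reference exactly scores 1.0, while one sharing no tokens scores 0.0.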
📝 Abstract
Misgendering is the act of referring to someone by a gender that does not match their chosen identity. It marginalizes and undermines a person's sense of self, causing significant harm. English offers clear-cut strategies for avoiding misgendering, such as the singular pronoun "they"; other languages, however, pose unique challenges due to both grammatical and cultural constructs. In this work we develop methodologies to assess and mitigate misgendering across 42 languages and dialects, using a participatory-design approach to build effective and appropriate guardrails for each language. We test these guardrails in a standard large language model application, meeting transcript summarization, where both the data generation and the annotation steps followed a human-in-the-loop approach. We find that the proposed guardrails substantially reduce misgendering rates in the generated summaries across all languages, without loss of summary quality. Our human-in-the-loop approach demonstrates a feasible way to scale inclusive and responsible AI-based solutions across multiple languages and cultures.
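To make the "misgendering rate" measurement concrete, here is a deliberately simplistic, English-only sketch, the easy case the abstract contrasts with other languages. It counts, among gendered pronouns in sentences mentioning a speaker, those outside the speaker's chosen pronoun set. All names here are hypothetical; the paper's actual annotation is human-in-the-loop and language-specific, not a pronoun-list heuristic.

```python
import re

# Toy English pronoun sets; real guardrails must handle grammatical
# gender, titles, and culture-specific referential forms per language.
PRONOUNS = {
    "she": {"she", "her", "hers"},
    "he": {"he", "him", "his"},
    "they": {"they", "them", "their", "theirs"},
}

def misgendering_rate(summary: str, speaker: str, pronoun_set: str) -> float:
    """Fraction of gendered pronouns, in sentences mentioning the speaker,
    that fall outside the speaker's chosen pronoun set."""
    allowed = PRONOUNS[pronoun_set]
    gendered = set().union(*PRONOUNS.values())
    # Naive sentence split and substring match on the speaker's name.
    sentences = re.split(r"(?<=[.!?])\s+", summary)
    hits = wrong = 0
    for sentence in sentences:
        if speaker.lower() not in sentence.lower():
            continue
        for token in re.findall(r"[a-z']+", sentence.lower()):
            if token in gendered:
                hits += 1
                if token not in allowed:
                    wrong += 1
    return wrong / hits if hits else 0.0
```

For instance, if a speaker uses "they" and a generated summary refers to them once with "he" and once with "their", the rate is 0.5.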