Relating Word Embedding Gender Biases to Gender Gaps: A Cross-Cultural Analysis

📅 2026-01-23
🏛️ Proceedings of the First Workshop on Gender Bias in Natural Language Processing
📈 Citations: 23
Influential: 2
🤖 AI Summary
This study quantifies gender bias in word embeddings and investigates its association with real-world gender disparities across societies. Leveraging 2018 Twitter data from 51 U.S. regions and 99 countries, the authors construct a metric of gender bias in word embeddings and systematically correlate it with 18 international and 5 U.S.-specific gender gap indicators spanning education, politics, economics, and health. The analysis reveals significant cross-cultural correlations and notable predictive strength, providing evidence that gender bias embedded in word embeddings can serve as a proxy for societal gender inequality. These findings suggest a paradigm for monitoring social biases at a global scale through computational linguistic methods.

📝 Abstract
Modern models for common NLP tasks often employ machine learning techniques and train on journalistic, social media, or other culturally-derived text. These models have recently been scrutinized for racial and gender biases rooted in inherent bias in their training text. These biases are often undesirable, and recent work proposes methods to rectify them; however, they may also shed light on actual racial or gender gaps in the culture(s) that produced the training text, thereby helping us understand cultural context through big data. This paper presents an approach for quantifying gender bias in word embeddings and then using these bias measures to characterize statistical gender gaps in education, politics, economics, and health. We validate these metrics on 2018 Twitter data spanning 51 U.S. regions and 99 countries. We correlate state and country word embedding biases with 18 international and 5 U.S.-based statistical gender gaps, characterizing regularities and predictive strength.
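The pipeline the abstract describes has two stages: score each region's embedding for gender bias, then correlate those scores with statistical gender gap indicators. A minimal sketch of that idea, using a common cosine-similarity bias metric (the paper's exact formulation may differ) and toy random vectors in place of real per-region embeddings:

```python
import numpy as np


def gender_bias(embeddings, male_words, female_words, target_words):
    """Mean difference in cosine similarity of target words to the
    centroids of male- vs. female-anchored word sets. Positive values
    indicate targets lean toward the male anchor set. This is one
    common embedding-bias metric, used here for illustration."""
    def centroid(words):
        return np.mean([embeddings[w] for w in words], axis=0)

    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    m, f = centroid(male_words), centroid(female_words)
    return float(np.mean([cos(embeddings[t], m) - cos(embeddings[t], f)
                          for t in target_words]))


# Toy illustration: random 50-d vectors stand in for embeddings trained
# on one region's tweets (word lists here are hypothetical examples).
rng = np.random.default_rng(0)
vocab = ["engineer", "nurse", "he", "she", "man", "woman"]
emb = {w: rng.normal(size=50) for w in vocab}
bias = gender_bias(emb, ["he", "man"], ["she", "woman"], ["engineer", "nurse"])

# Stage two: correlate per-region bias scores with a gender gap
# indicator (synthetic data; real use would plug in the 18 international
# or 5 U.S. indicators the paper references).
region_biases = rng.normal(size=10)
gap_indicator = 0.8 * region_biases + rng.normal(scale=0.5, size=10)
r = float(np.corrcoef(region_biases, gap_indicator)[0, 1])
```

In practice each region contributes one bias score from an embedding trained on that region's text, and the Pearson correlation is computed across regions, one indicator at a time.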
Problem

Research questions and friction points this paper is trying to address.

gender bias
word embeddings
gender gaps
cross-cultural analysis
social inequality
Innovation

Methods, ideas, or system contributions that make the work stand out.

word embeddings
gender bias
cross-cultural analysis
gender gap prediction
computational social science
Scott E. Friedman
SIFT, Minneapolis, MN USA
Sonja Schmer-Galunder
University of Florida
AI Ethics · Social Science
Anthony Chen
SIFT, Minneapolis, MN USA
J. Rye
SIFT, Minneapolis, MN USA