Social Biases in Knowledge Representations of Wikidata separates Global North from Global South

📅 2025-05-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work uncovers implicit social biases, particularly along gender and age dimensions, in link prediction over a knowledge graph (Wikidata), with systematic disparities in how occupations are classified. The bias has a strong geographic structure: its intensity correlates significantly with the Human Development Index (HDI), separating the “Global North” from the “Global South” across 21 regions. To study this, the authors introduce AuditLP, presented as the first fairness auditing framework tailored for link prediction, which combines fairness metrics, triple-level statistical analysis, and cross-regional model evaluation. Empirically, the study shows that bias in knowledge graphs is not uniformly distributed but follows global socio-economic and cultural fault lines. The proposed framework offers a geographically aware, interpretable, and reproducible way to evaluate fairness in knowledge graphs.
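The summary does not say how the correlation between bias intensity and HDI is computed. As a minimal sketch, assuming one scalar bias score per region and a rank-based (Spearman) correlation, the check could look like the following. The function names, the region values, and the choice of Spearman are all assumptions made for illustration; the numbers are placeholders, not results from the paper, and the sign of the toy correlation carries no meaning.

```python
from statistics import mean


def ranks(values):
    """Return 1-based average ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Placeholder values for four hypothetical regions (NOT data from the paper).
hdi_by_region = [0.95, 0.90, 0.70, 0.55]
bias_by_region = [0.10, 0.15, 0.30, 0.45]

rho = spearman(hdi_by_region, bias_by_region)
print(f"Spearman rho between HDI and bias score: {rho:.2f}")
```

A rank-based measure is used here only because it does not assume a linear relationship between HDI and the bias score; the paper may rely on a different statistic.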

📝 Abstract
Knowledge graphs have become increasingly popular due to their wide use in downstream applications, including information retrieval, chatbot development, and language model construction. Link prediction (LP) is a crucial downstream task for knowledge graphs because it helps address their incompleteness. However, previous research has shown that knowledge graphs, often created in a (semi-)automatic manner, are not free from social biases. These biases can harm downstream applications, especially by leading to unfair behavior toward minority groups. To understand this issue in detail, we develop a framework, AuditLP, that deploys fairness metrics to identify biased outcomes in LP, specifically how occupations are classified as male-dominated or female-dominated when gender is the sensitive attribute. We also experiment with age as the sensitive attribute and observe that occupations are categorized as young-biased, old-biased, or age-neutral. We conduct our experiments on a large number of knowledge triples from 21 different geographies, extracted from the open-source knowledge graph Wikidata. Our study shows that the variance in biased outcomes across geographies mirrors the socio-economic and cultural division of the world, resulting in a clear partition of the Global North from the Global South.

Problem

Research questions and friction points this paper is trying to address.

Detects social biases in Wikidata knowledge graph representations
Analyzes gender and age biases in occupation classifications
Reveals Global North-South divisions in biased knowledge outcomes

Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed AuditLP framework for bias detection
Used fairness metrics in link prediction tasks
Analyzed Wikidata triples across 21 geographies
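
The fairness-metric step listed above can be illustrated with a small sketch. The abstract does not name the metrics AuditLP uses, so this example assumes a demographic-parity-style gap: for each occupation, the share of one group's predicted links that point to it minus the same share for the other group. The function names, the `tolerance` threshold, and the toy predictions are hypothetical and only show the shape of such an audit.

```python
from collections import Counter


def parity_gaps(predictions, group_a, group_b):
    """Compute P(occupation | group_a) - P(occupation | group_b) per occupation.

    predictions: iterable of (group, predicted_occupation) pairs, e.g. the
    top-ranked occupation a link prediction model assigns to each person.
    """
    counts = {group_a: Counter(), group_b: Counter()}
    for group, occupation in predictions:
        if group in counts:
            counts[group][occupation] += 1
    totals = {g: sum(c.values()) for g, c in counts.items()}
    if min(totals.values()) == 0:
        raise ValueError("each group needs at least one prediction")
    occupations = set(counts[group_a]) | set(counts[group_b])
    return {
        occ: counts[group_a][occ] / totals[group_a]
        - counts[group_b][occ] / totals[group_b]
        for occ in occupations
    }


def label(gap, group_a, group_b, tolerance=0.1):
    """Map a gap to a category; `tolerance` is an assumed neutrality band."""
    if gap > tolerance:
        return f"{group_a}-dominated"
    if gap < -tolerance:
        return f"{group_b}-dominated"
    return "neutral"


# Toy predictions with placeholder occupation names (NOT data from the paper).
toy_predictions = (
    [("male", "occupation_x")] * 3
    + [("male", "occupation_y")]
    + [("male", "occupation_z")]
    + [("female", "occupation_x")]
    + [("female", "occupation_y")] * 3
    + [("female", "occupation_z")]
)

gaps = parity_gaps(toy_predictions, "male", "female")
for occupation in sorted(gaps):
    print(occupation, label(gaps[occupation], "male", "female"))
```

The same two functions apply to age by passing group labels such as "young" and "old", which would yield the young-biased, old-biased, and age-neutral categories the abstract mentions. Running the audit separately on the triples of each geography gives one set of labels per region to compare.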