Fairness Definitions in Language Models Explained

📅 2024-07-26

🏛️ arXiv.org

📈 Citations: 11

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) often inherit and amplify societal biases—such as gender and racial biases—while existing fairness definitions suffer from conceptual ambiguity, ill-defined boundaries, and unclear applicability, hindering rigorous fairness evaluation and governance. Method: We propose the first taxonomy of fairness concepts specifically designed for LLMs, systematically distinguishing over 12 mainstream fairness definitions based on their theoretical foundations and operational mechanisms. Through empirical experiments across model scales—including bias measurement and cross-definition comparative analysis—we evaluate context-dependent applicability. Contribution/Results: Our work clarifies logical boundaries and practical efficacy of fairness definitions, establishes a unified terminology framework, and releases open-source, reproducible code and pedagogical resources. This advances standardization, comparability, and methodological rigor in LLM fairness research.

📝 Abstract

Language Models (LMs) have demonstrated exceptional performance across various Natural Language Processing (NLP) tasks. Despite these advancements, LMs can inherit and amplify societal biases related to sensitive attributes such as gender and race, limiting their adoption in real-world applications. Therefore, fairness has been extensively explored in LMs, leading to the proposal of various fairness notions. However, the lack of clear agreement on which fairness definition to apply in specific contexts (e.g., medium-sized LMs versus large-sized LMs) and the complexity of understanding the distinctions between these definitions can create confusion and impede further progress. To this end, this paper proposes a systematic survey that clarifies the definitions of fairness as they apply to LMs. Specifically, we begin with a brief introduction to LMs and fairness in LMs, followed by a comprehensive, up-to-date overview of existing fairness notions in LMs and the introduction of a novel taxonomy that categorizes these concepts based on their foundational principles and operational distinctions. We further illustrate each definition through experiments, showcasing their practical implications and outcomes. Finally, we discuss current research challenges and open questions, aiming to foster innovative ideas and advance the field. The implementation and additional resources are publicly available at https://github.com/LavinWong/Fairness-in-Large-Language-Models/tree/main/definitions.

Problem

Research questions and friction points this paper is trying to address.

Clarifying fairness definitions in Language Models (LMs)

Addressing societal biases in LMs related to gender and race

Proposing a taxonomy for fairness notions in transformer-based LMs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic survey clarifying LM fairness definitions

Novel taxonomy categorizing fairness by transformer architecture

Experimental illustration of fairness definitions' practical outcomes

🔎 Similar Papers

Collapsed Language Models Promote Fairness