🤖 AI Summary
Large language models (LLMs) often inherit and amplify societal biases, such as gender and racial biases. Existing fairness definitions, meanwhile, suffer from conceptual ambiguity, ill-defined boundaries, and unclear applicability, which hinders rigorous fairness evaluation and governance.
Method: We propose the first taxonomy of fairness concepts specifically designed for LLMs, systematically distinguishing over 12 mainstream fairness definitions based on their theoretical foundations and operational mechanisms. Through empirical experiments across model scales, including bias measurement and cross-definition comparative analysis, we evaluate the context-dependent applicability of each definition.
Contribution/Results: Our work clarifies logical boundaries and practical efficacy of fairness definitions, establishes a unified terminology framework, and releases open-source, reproducible code and pedagogical resources. This advances standardization, comparability, and methodological rigor in LLM fairness research.
📝 Abstract
Language Models (LMs) have demonstrated exceptional performance across various Natural Language Processing (NLP) tasks. Despite these advancements, LMs can inherit and amplify societal biases related to sensitive attributes such as gender and race, limiting their adoption in real-world applications. Therefore, fairness has been extensively explored in LMs, leading to the proposal of various fairness notions. However, the lack of clear agreement on which fairness definition to apply in specific contexts (e.g., medium-sized LMs versus large-sized LMs) and the complexity of understanding the distinctions between these definitions can create confusion and impede further progress. To this end, this paper proposes a systematic survey that clarifies the definitions of fairness as they apply to LMs. Specifically, we begin with a brief introduction to LMs and fairness in LMs, followed by a comprehensive, up-to-date overview of existing fairness notions in LMs and the introduction of a novel taxonomy that categorizes these concepts based on their foundational principles and operational distinctions. We further illustrate each definition through experiments, showcasing their practical implications and outcomes. Finally, we discuss current research challenges and open questions, aiming to foster innovative ideas and advance the field. The implementation and additional resources are publicly available at https://github.com/LavinWong/Fairness-in-Large-Language-Models/tree/main/definitions.