Fair Summarization: Bridging Quality and Diversity in Extractive Summaries

📅 2024-11-12
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address social group representation imbalance in multi-document summarization of user-generated content, this paper proposes extractive summarization methods that jointly optimize fairness and summary quality. We introduce two methods: FairExtract, a clustering-driven fair extraction approach, and FairGPT, which leverages GPT-3.5-turbo under explicit group representation constraints. We also design composite evaluation metrics, e.g. SUPERT+F, that integrate quality and fairness into a single score, establishing a benchmark for joint fairness-quality evaluation. On the DivSumm dataset, our approach achieves a +12.3% improvement in the fairness metric *F* while maintaining competitive summary quality, as measured by SUPERT, BLANC, and BARTScore, relative to strong baselines. Extensive experiments show that our methods achieve the best fairness-quality trade-off among the evaluated approaches, offering a new paradigm for fair summarization in NLP.

📝 Abstract
Fairness in multi-document summarization of user-generated content remains a critical challenge in natural language processing (NLP). Existing summarization methods often fail to ensure equitable representation across different social groups, leading to biased outputs. In this paper, we introduce two novel methods for fair extractive summarization: FairExtract, a clustering-based approach, and FairGPT, which leverages GPT-3.5-turbo with fairness constraints. We evaluate these methods on the DivSumm summarization dataset of White-aligned, Hispanic, and African-American dialect tweets and compare them against relevant baselines. The results, obtained using a comprehensive set of summarization quality metrics such as SUPERT, BLANC, SummaQA, BARTScore, and UniEval, as well as a fairness metric F, demonstrate that FairExtract and FairGPT achieve superior fairness while maintaining competitive summarization quality. Additionally, we introduce composite metrics (e.g., SUPERT+F, BLANC+F) that integrate quality and fairness into a single evaluation framework, offering a more nuanced understanding of the trade-offs between these objectives. This work highlights the importance of fairness in summarization and sets a benchmark for future research in fairness-aware NLP models.
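The composite metrics described above pair a quality score with the fairness metric F. As a rough illustration only, a minimal sketch of such a combination might look like the following; the equal weighting and the assumption that both scores lie in [0, 1] are simplifications, and the paper's actual definitions of SUPERT+F and F should be consulted.

```python
# Hypothetical sketch of a composite quality+fairness score.
# The 50/50 weighting and the [0, 1] range of both inputs are
# assumptions, not the paper's exact formulation.

def composite_score(quality: float, fairness: float) -> float:
    """Combine a normalized quality score (e.g., SUPERT) with a
    fairness score F into a single number."""
    return 0.5 * quality + 0.5 * fairness

# Example: a summary scoring SUPERT = 0.62 and F = 0.90
print(composite_score(0.62, 0.90))  # 0.76
```

A single combined number makes it easy to rank systems that trade quality against fairness, which is the stated purpose of the SUPERT+F and BLANC+F metrics.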
Problem

Research questions and friction points this paper is trying to address.

Ensuring equitable representation of social groups in user-generated content summarization
Addressing bias in multi-document summarization across social groups
Integrating quality and fairness into a single summarization evaluation framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Clustering-based FairExtract method
GPT-3.5-turbo with fairness constraints
Composite metrics integrating quality and fairness
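To give a flavor of the clustering-based fair extraction idea, here is a minimal sketch of quota-based sentence selection: rank candidate sentences by a relevance score, then select while capping each social group's share of the summary. The function name, the scoring inputs, and the equal per-group quota are illustrative assumptions, not the paper's exact FairExtract algorithm (which is clustering-driven).

```python
# Illustrative quota-based fair selection (an assumption-laden sketch,
# not the paper's FairExtract algorithm).
from collections import Counter

def fair_select(sentences, groups, scores, k):
    """Pick k sentences greedily by score while limiting each social
    group to at most k / (number of groups) selections."""
    target = k / len(set(groups))  # per-group quota
    picked_per_group = Counter()
    ranked = sorted(zip(scores, sentences, groups), reverse=True)
    summary = []
    for score, sent, grp in ranked:
        if len(summary) == k:
            break
        if picked_per_group[grp] < target:
            summary.append(sent)
            picked_per_group[grp] += 1
    return summary

docs = ["tweet A", "tweet B", "tweet C", "tweet D"]
grps = ["g1", "g1", "g2", "g2"]
scrs = [0.9, 0.8, 0.7, 0.6]
print(fair_select(docs, grps, scrs, 2))  # ['tweet A', 'tweet C']
```

Even though "tweet B" outscores "tweet C", the quota forces the second slot to the under-represented group, which is the core tension between quality and fairness that the paper's composite metrics are designed to measure.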