Towards Fairness for the Right Reasons: Using Saliency Maps to Evaluate Bias Removal in Neural Networks

📅 2025-02-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the persistent challenge of sensitive-attribute bias in computer vision models, a bias that existing debiasing methods often mask rather than eliminate. The authors propose a novel XAI-based evaluation framework grounded in saliency maps, which systematically quantifies the spatial and semantic distance between model decision regions and protected attributes. Crucially, it distinguishes "superficial fairness" (bias concealment) from "causal fairness" (genuine independence from sensitive attributes). Empirical evaluation across multiple benchmark datasets and state-of-the-art debiased models demonstrates that effective debiasing consistently redirects saliency away from sensitive features; moreover, artifact removal techniques exhibit transferable fairness improvements. The proposed metrics provide interpretable, verifiable, quantitative diagnostics for fairness assessment, thereby enhancing the trustworthiness and ethical robustness of AI systems.

📝 Abstract
The widespread adoption of machine learning systems has raised critical concerns about fairness and bias, making mitigating harmful biases essential for AI development. In this paper, we investigate the relationship between fairness improvement and the removal of harmful biases in neural networks applied to computer vision tasks. First, we introduce a set of novel XAI-based metrics that analyze saliency maps to assess shifts in a model's decision-making process. Then, we demonstrate that successful debiasing methods systematically redirect model focus away from protected attributes. Additionally, we show that techniques originally developed for artifact removal can be effectively repurposed for fairness. These findings underscore the importance of ensuring that models are fair for the right reasons, contributing to the development of more ethical and trustworthy AI systems.
Problem

Research questions and friction points this paper is trying to address.

Investigates the relationship between fairness improvement and bias removal in neural networks.
Introduces XAI-based metrics that analyze saliency maps to assess shifts in model decision-making.
Demonstrates that successful debiasing methods redirect model focus away from protected attributes.
Innovation

Methods, ideas, or system contributions that make the work stand out.

XAI-based metrics analyze saliency maps
Debiasing redirects focus from protected attributes
Artifact removal techniques repurposed for fairness
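The metrics above quantify how much of a model's saliency falls on protected-attribute regions. As an illustration only (the paper's exact metric definitions are not reproduced on this page), a minimal sketch of one such overlap score, with a hypothetical function name and toy inputs, might look like:

```python
import numpy as np

def saliency_attribute_overlap(saliency, attribute_mask):
    """Hypothetical metric sketch: fraction of total saliency mass
    that falls inside the protected-attribute region of the image."""
    saliency = np.abs(saliency)
    total = saliency.sum()
    if total == 0:
        return 0.0  # no saliency at all -> no overlap
    return float(saliency[attribute_mask.astype(bool)].sum() / total)

# Toy example: 4x4 saliency map, protected attribute in the top-left 2x2 block.
sal = np.zeros((4, 4))
sal[:2, :2] = 1.0                      # all saliency on the attribute region
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
print(saliency_attribute_overlap(sal, mask))  # 1.0 -> focus fully on attribute
```

Under this reading, an effective debiasing method should drive the score toward zero, since the model's decision evidence moves away from the sensitive region.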
Łukasz Sztukiewicz
Institute of Computing Science, Poznan University of Technology
Ignacy Stępka
Institute of Computing Science, Poznan University of Technology
Michał Wiliński
Institute of Computing Science, Poznan University of Technology
Jerzy Stefanowski
Poznan University of Technology
machine learning · data streams · Explainable AI · rule learning · imbalanced classification