🤖 AI Summary
In ethically sensitive domains such as healthcare, machine learning models deployed in real-world settings are prone to bias, undermining fairness and out-of-distribution (OOD) generalization. Existing debiasing methods rely heavily on access to the original training data and require computationally expensive full-model retraining, and they typically trade fairness against discriminative performance. To address this, we propose an efficient two-stage soft-mask fine-tuning framework. First, parameter sensitivity analysis quantifies each parameter's relative contribution to bias and to the predictive signal; second, gradient flow modulation, guided by a small external bias-annotated dataset, enables dynamic, lightweight weight adaptation. Crucially, our method operates without requiring the original training data. Evaluated across six benchmark datasets, it significantly mitigates bias along gender, skin tone, and age dimensions, simultaneously improving fairness metrics and OOD generalization while preserving diagnostic accuracy.
📝 Abstract
Recent studies have shown that Machine Learning (ML) models can exhibit bias in real-world scenarios, posing significant challenges in ethically sensitive domains such as healthcare. Such bias can degrade model fairness and generalization ability, and further risks amplifying social discrimination, so there is a need to remove biases from trained models. Existing debiasing approaches often require access to the original training data and extensive model retraining; they also typically exhibit trade-offs between model fairness and discriminative performance. To address these challenges, we propose Soft-Mask Weight Fine-Tuning (SWiFT), a debiasing framework that efficiently improves fairness while preserving discriminative performance at a much lower debiasing cost. Notably, SWiFT requires only a small external dataset and a few epochs of model fine-tuning. The idea behind SWiFT is to first determine the relative, yet distinct, contributions of model parameters to bias and to predictive performance. A two-step fine-tuning process then updates each parameter with a different gradient flow defined by its contribution. Extensive experiments with three bias-sensitive attributes (gender, skin tone, and age) across four dermatological and two chest X-ray datasets demonstrate that SWiFT consistently reduces model bias while achieving competitive or even superior diagnostic accuracy under common fairness and accuracy metrics, compared to the state of the art. In particular, we demonstrate improved model generalization ability, as evidenced by superior performance on several out-of-distribution (OOD) datasets.
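The two stages described above (parameter sensitivity analysis, then per-parameter gradient flow modulation via a soft mask) can be sketched on a toy linear model. This is a minimal illustration, not the paper's actual method: the model, the losses, and the use of gradient magnitudes as sensitivity scores are all assumptions made here for concreteness, and `y_bias` stands in for labels from the small external bias-annotated dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model y = X @ w. "y_task" plays the role of diagnostic
# labels; "y_bias" is a hypothetical bias proxy derived from one feature,
# standing in for the external bias-annotated dataset.
X = rng.normal(size=(64, 8))
w = rng.normal(size=8)
y_task = X @ rng.normal(size=8)
y_bias = X[:, 0] * 3.0  # bias signal correlated with feature 0

def grad_mse(w, X, y):
    """Gradient of 0.5 * mean squared error with respect to w."""
    return X.T @ (X @ w - y) / len(y)

# Stage 1: parameter sensitivity analysis. Here we use gradient
# magnitudes as sensitivity scores (an assumed, simplified measure).
s_task = np.abs(grad_mse(w, X, y_task))
s_bias = np.abs(grad_mse(w, X, y_bias))

# Soft mask in [0, 1]: parameters dominated by the bias signal receive
# strong updates, task-critical parameters are dampened.
mask = s_bias / (s_bias + s_task + 1e-12)

# Stage 2: fine-tune with per-parameter gradient flow modulation.
lr = 0.1
for _ in range(50):
    g = grad_mse(w, X, y_task)  # a debiasing objective would go here
    w -= lr * mask * g          # soft mask scales each parameter's update
```

The soft mask lets bias-heavy parameters move freely during the few fine-tuning epochs while largely freezing parameters that carry the predictive signal, which is one plausible way to reconcile fairness with preserved accuracy.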