Fairness-Aware Low-Rank Adaptation Under Demographic Privacy Constraints

📅 2025-03-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the inherent tension between fairness and privacy in LoRA-based fine-tuning. The authors propose a distributed fair fine-tuning framework that requires no access to sensitive attributes or their predictors. Methodologically, it integrates low-rank adaptation, sensitive unlearning, adversarial training, and an orthogonality loss within a collaborative training paradigm between model developers and fairness auditors, enforcing strict demographic privacy constraints. The key contribution is removing the conventional dependence on sensitive attributes in fair learning: fairness-aware fine-tuning is achieved without exposing sensitive information to the model developer. Experiments on CelebA and UTK-Face demonstrate that the orthogonality loss consistently reduces bias while preserving model utility, that adversarial training improves False Positive Rate Parity and Demographic Parity in some cases, and that the overall framework achieves measurable fairness gains under strict privacy constraints.
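The summary names an orthogonality loss as the main driver of bias reduction but does not give its form. One plausible instantiation (an assumption for illustration, not the paper's definition) penalizes the squared cosine similarity between each example's task feature vector and a sensitive-attribute feature vector, driving the two representations toward orthogonality:

```python
import numpy as np

def orthogonality_loss(task_feats, sens_feats, eps=1e-8):
    """Mean squared cosine similarity between paired task and sensitive
    feature vectors; zero when every pair is orthogonal.
    Hypothetical form -- the paper's exact loss is not given in this summary."""
    t = task_feats / (np.linalg.norm(task_feats, axis=1, keepdims=True) + eps)
    s = sens_feats / (np.linalg.norm(sens_feats, axis=1, keepdims=True) + eps)
    cos = np.sum(t * s, axis=1)          # per-example cosine similarity
    return float(np.mean(cos ** 2))      # 0 = orthogonal, 1 = parallel
```

Squaring the cosine makes the penalty sign-agnostic, so anti-aligned features are punished as much as aligned ones.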

📝 Abstract
Pre-trained foundation models can be adapted for specific tasks using Low-Rank Adaptation (LoRA). However, the fairness properties of these adapted classifiers remain underexplored. Existing fairness-aware fine-tuning methods rely on direct access to sensitive attributes or their predictors, but in practice, these sensitive attributes are often held under strict consumer privacy controls, and neither the attributes nor their predictors are available to model developers, hampering the development of fair models. To address this issue, we introduce a set of LoRA-based fine-tuning methods that can be trained in a distributed fashion, where model developers and fairness auditors collaborate without sharing sensitive attributes or predictors. In this paper, we evaluate three such methods - sensitive unlearning, adversarial training, and orthogonality loss - against a fairness-unaware baseline, using experiments on the CelebA and UTK-Face datasets with an ImageNet pre-trained ViT-Base model. We find that orthogonality loss consistently reduces bias while maintaining or improving utility, whereas adversarial training improves False Positive Rate Parity and Demographic Parity in some cases, and sensitive unlearning provides no clear benefit. In tasks where significant biases are present, distributed fairness-aware fine-tuning methods can effectively eliminate bias without compromising consumer privacy and, in most cases, improve model utility.
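For readers unfamiliar with LoRA itself, a minimal NumPy sketch of a LoRA-adapted linear layer: the pre-trained weight stays frozen while only the low-rank factors are trained, and the up-projection is zero-initialized so the adapted layer starts identical to the frozen one. Names and hyperparameters here are illustrative, not taken from the paper.

```python
import numpy as np

class LoRALinear:
    """Frozen dense layer plus a trainable low-rank update: y = Wx + (alpha/r) * B A x."""
    def __init__(self, weight, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = weight                                # frozen weight, shape (d_out, d_in)
        d_out, d_in = weight.shape
        self.A = rng.normal(0, 0.02, (rank, d_in))     # trainable down-projection
        self.B = np.zeros((d_out, rank))               # trainable up-projection, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        # B is zero at init, so the output initially equals the frozen layer's output
        return x @ self.W.T + self.scale * (x @ self.A.T @ self.B.T)
```

Only `A` and `B` receive gradients during fine-tuning, which is what makes LoRA cheap enough for the per-task adaptation the abstract describes.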
Problem

Research questions and friction points this paper is trying to address.

Fairness in adapted classifiers under privacy constraints.
Distributed fine-tuning without sharing sensitive attributes.
Reducing bias while maintaining or improving model utility.
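The bias the paper measures is reported via Demographic Parity and False Positive Rate Parity. Both are standard group-gap statistics and can be sketched directly (binary predictions and a binary group attribute assumed):

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """|P(yhat=1 | g=0) - P(yhat=1 | g=1)| for binary predictions."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def fpr_gap(y_true, y_pred, group):
    """|FPR_g0 - FPR_g1|, each FPR computed over that group's true negatives."""
    def fpr(g):
        neg = (y_true == 0) & (group == g)
        return y_pred[neg].mean()
    return abs(fpr(0) - fpr(1))
```

A perfectly fair classifier under either criterion has a gap of 0; the abstract's claim is that the proposed methods shrink these gaps without the developer ever seeing `group`.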
Innovation

Methods, ideas, or system contributions that make the work stand out.

LoRA-based fine-tuning without sensitive data sharing
Distributed collaboration between developers and auditors
Orthogonality loss reduces bias, maintains utility
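The developer-auditor collaboration listed above could, for illustration, take the form of a gradient-exchange round: the auditor, who alone holds the sensitive labels, fits an adversary that predicts the attribute from shared representations and returns only its gradient; the developer applies that gradient with reversed sign, making the representations less predictive of the attribute. This is a hypothetical sketch of one such protocol, not the paper's exact algorithm.

```python
import numpy as np

def auditor_adversary_grad(reps, sens_labels, w):
    """Auditor side: gradient of a logistic sensitive-attribute probe's BCE
    loss w.r.t. the representations. Only this gradient leaves the auditor;
    the sensitive labels never do. (Hypothetical protocol sketch.)"""
    p = 1.0 / (1.0 + np.exp(-(reps @ w)))
    return np.outer(p - sens_labels, w)    # d(BCE)/d(reps), one row per example

def developer_update(reps, grad_from_auditor, lam=1.0, lr=0.1):
    """Developer side: gradient *reversal* -- ascend the adversary's loss so
    the representations carry less sensitive information."""
    return reps + lr * lam * grad_from_auditor
```

Iterating this round while also descending the task loss is the usual adversarial-debiasing recipe; here the split merely relocates the adversary to the auditor's side of the privacy boundary.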