Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models

📅 2025-11-22

📈 Citations: 0

✨ Influential: 0

career value

172K/year

🤖 AI Summary

Vision-language models (VLMs) exhibit pervasive demographic bias, and conventional coordinate-wise debiasing methods suffer from incomplete mitigation due to feature entanglement. To address this, we propose Subspace Projection Debiasing (SPD), the first geometric framework that models bias as a low-dimensional, linearly decodable subspace in the embedding space. SPD removes bias via orthogonal projection onto the null space of this subspace while preserving the semantic mean component—critical for maintaining downstream task performance. Our method integrates linear subspace analysis, geometric projection, and mean compensation, and is applicable to alignment-sensitive tasks including zero-shot classification, cross-modal retrieval, and generation. Evaluated across four fairness metrics, SPD achieves an average improvement of 18.5%, significantly outperforming state-of-the-art debiasing approaches. Moreover, it preserves robust downstream accuracy and demonstrates strong generalization across diverse datasets.

Technology Category

Application Category

📝 Abstract

Vision-Language Models (VLMs) have become indispensable for multimodal reasoning, yet their representations often encode and amplify demographic biases, resulting in biased associations and misaligned predictions in downstream tasks. Such behavior undermines fairness and distorts the intended alignment between vision and language. Recent post-hoc approaches attempt to mitigate bias by replacing the most attribute-correlated embedding coordinates with neutral values. However, our systematic analysis reveals three critical failures of this coordinate-wise approach: feature entanglement, poor cross-dataset generalization, and incomplete bias removal. We find that bias is not localized to a few coordinates but is instead distributed across a few linear subspaces. To address these limitations, we propose $ extbf{S}$ubspace $ extbf{P}$rojection $ extbf{D}$ebiasing ($ extbf{SPD}$), a geometrically principled framework that identifies and removes the entire subspace of linearly decodable bias while reinserting a neutral mean component to preserve semantic fidelity. Extensive experiments across zero-shot classification, text-to-image retrieval, and image generation validate the effectiveness of SPD: our method achieves more robust debiasing with an average improvement of $18.5%$ across four fairness metrics, while maintaining minimal loss in task performance compared to the best debiasing baseline.

Problem

Research questions and friction points this paper is trying to address.

VLMs encode demographic biases causing unfair vision-language misalignment

Coordinate-wise debiasing fails due to feature entanglement and poor generalization

Bias distributes across linear subspaces rather than isolated coordinates

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bias removal via linear subspace projection

Neutral mean reinsertion for semantic preservation

Geometric framework improving debiasing robustness

🔎 Similar Papers

Can We Talk Models Into Seeing the World Differently?