PRISM: Reducing Spurious Implicit Biases in Vision-Language Models with LLM-Guided Embedding Projection

📅 2025-07-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Vision-language models (VLMs) such as CLIP inherit implicit biases from training data, leading to spurious correlations that harm fairness and generalization. Method: We propose PRISM—a task-agnostic, data-free debiasing framework that requires no external data or predefined bias categories. It operates in two stages: (1) leveraging large language models to automatically generate text descriptions containing spurious associations, thereby constructing bias-aware contextual prompts; and (2) learning an embedding-space projection function via contrastive loss to disentangle bias dimensions while preserving image-text alignment. Contribution/Results: PRISM is the first method to enable fully automated, unsupervised bias-context construction and bias disentanglement. On benchmarks including Waterbirds and CelebA, it significantly reduces spurious correlations—measured via bias amplification and worst-group accuracy—while maintaining near-original performance on primary vision-language tasks. This establishes a scalable, low-dependency paradigm for VLM debiasing.

📝 Abstract
We introduce Projection-based Reduction of Implicit Spurious bias in vision-language Models (PRISM), a new data-free and task-agnostic solution for bias mitigation in VLMs such as CLIP. VLMs often inherit and amplify biases in their training data, leading to skewed predictions. PRISM is designed to debias VLMs without relying on predefined bias categories or additional external data. It operates in two stages: first, an LLM is prompted with simple class prompts to generate scene descriptions that contain spurious correlations. Next, PRISM uses our novel contrastive-style debiasing loss to learn a projection that maps the embeddings onto a latent space that minimizes spurious correlations while preserving the alignment between image and text embeddings. Extensive experiments demonstrate that PRISM outperforms current debiasing methods on the commonly used Waterbirds and CelebA datasets. We make our code public at: https://github.com/MahdiyarMM/PRISM.
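The first stage, prompting an LLM with simple class prompts to elicit spurious scene descriptions, can be sketched as a prompt-construction helper. This is an illustrative sketch only; the function name and prompt wording are hypothetical and may differ from the paper's actual prompts.

```python
def build_bias_probe_prompt(class_name: str) -> str:
    """Construct an LLM prompt (hypothetical wording) asking for scene
    descriptions that carry spurious context for a given class."""
    return (
        f"List short scene descriptions in which a '{class_name}' "
        "typically appears. Include background and context details that "
        "often co-occur with the class but are not part of the class itself."
    )

# Example: probe prompts for the Waterbirds classes
prompts = [build_bias_probe_prompt(c) for c in ("landbird", "waterbird")]
```

The LLM's responses to such prompts would then supply the bias-aware contextual text used in the second stage.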
Problem

Research questions and friction points this paper is trying to address.

Mitigates implicit biases in vision-language models
Debiases without predefined categories or extra data
Reduces spurious correlations while preserving alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-generated scene descriptions for bias identification
Contrastive-style debiasing loss for embedding projection
Data-free task-agnostic VLM bias mitigation
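The second innovation, a contrastive-style loss that learns an embedding projection, can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy version, not the paper's exact objective: it pulls projected image embeddings toward their matching class text while pushing them away from the LLM-generated spurious-context text, and all function names and the identity-initialized projection are hypothetical.

```python
import numpy as np

def project(emb: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Apply a learned linear projection, then re-normalize to the unit
    sphere (CLIP-style embeddings are typically L2-normalized)."""
    z = emb @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def debias_loss(img, txt_pos, txt_spurious, W, tau=0.07):
    """Contrastive-style debiasing objective (sketch): attract matching
    image/text pairs, repel spurious-context text, with temperature tau."""
    zi = project(img, W)
    zp = project(txt_pos, W)
    zs = project(txt_spurious, W)
    pos = np.exp(np.sum(zi * zp, axis=-1) / tau)   # similarity to true class text
    neg = np.exp(np.sum(zi * zs, axis=-1) / tau)   # similarity to spurious text
    return float(np.mean(-np.log(pos / (pos + neg))))

rng = np.random.default_rng(0)
d = 8
W = np.eye(d)  # identity projection as a starting point before optimization
img = rng.normal(size=(4, d))
txt_pos = img + 0.1 * rng.normal(size=(4, d))      # near-aligned class text
txt_spurious = rng.normal(size=(4, d))             # unrelated spurious context
loss = debias_loss(img, txt_pos, txt_spurious, W)
```

In the full method, `W` would be optimized to minimize such a loss so that bias dimensions are disentangled while image-text alignment is preserved.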