🤖 AI Summary
Image generation models often encode societal biases, such as gender, race, and occupational stereotypes, yet existing bias-analysis methods rely on predefined categories or on manual interpretation of latent directions, which limits scalability and hinders the discovery of unknown biases. To address this, the authors propose SCALEX, a zero-shot, annotation-free, fine-tuning-free interpretable framework that extracts semantic directions in H-space directly from natural language prompts. The method automatically maps textual prompts to latent directions and supports large-scale concept-alignment evaluation, enabling systematic cross-concept comparison and unsupervised discovery of concept clustering and implicit organizational structure within diffusion models. Experiments quantitatively measure gender bias in occupational prompts and assess semantic alignment among identity descriptors, showing substantial gains in both the scalability and interpretability of bias detection.
📝 Abstract
Image generation models frequently encode social biases, including stereotypes tied to gender, race, and profession. Existing methods for analyzing these biases in diffusion models either focus narrowly on predefined categories or depend on manual interpretation of latent directions. These constraints limit scalability and hinder the discovery of subtle or unanticipated patterns.
We introduce SCALEX, a framework for scalable and automated exploration of diffusion model latent spaces. SCALEX extracts semantically meaningful directions from H-space using only natural language prompts, enabling zero-shot interpretation without retraining or labelling. This allows systematic comparison across arbitrary concepts and large-scale discovery of internal model associations. We show that SCALEX detects gender bias in profession prompts, ranks semantic alignment across identity descriptors, and reveals clustered conceptual structure without supervision. By linking prompts to latent directions directly, SCALEX makes bias analysis in diffusion models more scalable, interpretable, and extensible than prior approaches.
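The abstract does not specify how prompt-derived directions are compared, so the sketch below is purely illustrative: it assumes each prompt yields a set of H-space activation vectors (random stand-ins here, not real diffusion-model activations), defines a gender direction as the normalized difference of mean activations, and scores a profession prompt by projecting its mean activation onto that direction. The function names and the projection-based bias score are assumptions, not SCALEX's actual API.

```python
import numpy as np

def concept_direction(h_a, h_b):
    """Hypothetical latent direction from concept A to concept B:
    normalized difference of the mean H-space activations for each prompt."""
    d = h_b.mean(axis=0) - h_a.mean(axis=0)
    return d / np.linalg.norm(d)

def bias_score(h_concept, direction):
    """Signed alignment of a concept's mean activation with a direction.
    Positive values lean toward concept B, negative toward concept A."""
    return float(h_concept.mean(axis=0) @ direction)

# Stand-in activations: 16 samples per prompt in a 64-dim H-space.
# The +/- 0.5 offsets simulate two separable concept clusters.
rng = np.random.default_rng(0)
dim = 64
h_woman = rng.normal(0.0, 1.0, (16, dim)) - 0.5
h_man = rng.normal(0.0, 1.0, (16, dim)) + 0.5

gender_dir = concept_direction(h_woman, h_man)  # "woman" -> "man" axis

# Score a hypothetical profession prompt against the gender axis.
h_nurse = rng.normal(0.0, 1.0, (16, dim))
score = bias_score(h_nurse, gender_dir)
print(f"gender-bias score for profession prompt: {score:.3f}")
```

With real activations in place of the random stand-ins, ranking many profession prompts by this score would give the kind of systematic cross-concept comparison the abstract describes; clustering the per-prompt directions would expose the conceptual structure it mentions.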