CoCo-Bot: Energy-based Composable Concept Bottlenecks for Interpretable Generative Models

📅 2025-07-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing generative concept bottleneck models (CBMs) rely on auxiliary visual cues to compensate for information the concepts fail to capture, which undermines interpretability and compositional reasoning. This paper introduces CoCo-Bot, a post-hoc, composable CBM for generative modeling that operates without auxiliary visual inputs: all generative information flows exclusively through explicit, human-understandable concepts. Its core component is a diffusion-guided, energy-based conceptual space that steers a StyleGAN2 backbone pre-trained on CelebA-HQ, enabling purely concept-driven image synthesis. CoCo-Bot supports robust, human-interpretable post-hoc interventions, including cross-concept composition and logical negation, across arbitrary concepts. Evaluated on CelebA-HQ, CoCo-Bot improves concept-level controllability, interpretability, and editing flexibility while maintaining competitive visual quality.

📝 Abstract
Concept Bottleneck Models (CBMs) provide interpretable and controllable generative modeling by routing generation through explicit, human-understandable concepts. However, previous generative CBMs often rely on auxiliary visual cues at the bottleneck to compensate for information not captured by the concepts, which undermines interpretability and compositionality. We propose CoCo-Bot, a post-hoc, composable concept bottleneck generative model that eliminates the need for auxiliary cues by transmitting all information solely through explicit concepts. Guided by diffusion-based energy functions, CoCo-Bot supports robust post-hoc interventions, such as concept composition and negation, across arbitrary concepts. Experiments using StyleGAN2 pre-trained on CelebA-HQ show that CoCo-Bot improves concept-level controllability and interpretability, while maintaining competitive visual quality.
Problem

Research questions and friction points this paper is trying to address.

Generative CBMs depend on auxiliary visual cues at the bottleneck, leaking information around the concepts
Auxiliary cues undermine the interpretability and compositionality of concept bottlenecks
Improving concept-level controllability typically comes at the cost of visual quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Replaces auxiliary visual cues with explicit, human-understandable concepts as the sole information channel
Uses diffusion-based energy functions
Enables post-hoc concept composition and negation
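The composition and negation interventions above follow the standard algebra of energy-based models: a conjunction of concepts is the sum of their energies, and a negation flips an energy's sign. The toy sketch below illustrates this on a 2-D "concept space" with gradient-based sampling; the concept energies, names, and the omission of Langevin noise are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Toy quadratic energies: low energy where a concept is "on".
# These are hypothetical stand-ins for learned, diffusion-based energies.
def energy_concept_a(z):
    return (z[0] - 1.0) ** 2  # minimized when first coordinate is near +1

def energy_concept_b(z):
    return (z[1] - 1.0) ** 2  # minimized when second coordinate is near +1

def compose(*energies):
    # Conjunction (A AND B): sum the component energies
    return lambda z: sum(e(z) for e in energies)

def negate(e, scale=1.0):
    # Negation (NOT A): sign-flipped energy penalizes A's low-energy region
    return lambda z: -scale * e(z)

def gradient_sample(e, z0, steps=200, lr=0.1, eps=1e-4):
    # Plain gradient descent on the energy via central finite differences
    # (real EBM samplers add Langevin noise; omitted here for clarity)
    z = z0.astype(float).copy()
    for _ in range(steps):
        grad = np.zeros_like(z)
        for i in range(len(z)):
            dz = np.zeros_like(z)
            dz[i] = eps
            grad[i] = (e(z + dz) - e(z - dz)) / (2 * eps)
        z -= lr * grad
    return z

# Sampling under the composed energy drives z toward the region where
# both concepts hold, i.e. near [1, 1].
z = gradient_sample(compose(energy_concept_a, energy_concept_b), np.zeros(2))
```

In the full model, the decoder would then render `z` into an image, so every edit is expressible as an algebraic operation on concept energies rather than an opaque latent manipulation.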