🤖 AI Summary
Diffusion models often generate structurally inconsistent, hallucinated samples that fall outside the support of the true data distribution, a failure attributable to excessive smoothing between modes. To address this, we propose a dynamic guidance method that operates during the generative sampling phase: it identifies artifact-sensitive directions via directional analysis of the score function and selectively sharpens the score along those directions to suppress hallucinations while preserving valid semantic interpolation and diversity. Unlike prior approaches, our method regulates hallucination directly *during* sampling, without requiring post-hoc filtering, and introduces a direction-selective mechanism for fine-grained control over inter-modal smoothing. Evaluated on both controlled and natural image datasets, our approach substantially reduces structural hallucinations, improves the structural consistency and visual fidelity of generated samples, and outperforms mainstream baselines.
📝 Abstract
Diffusion models, despite their impressive demos, often produce hallucinated samples with structural inconsistencies that lie outside the support of the true data distribution. Such hallucinations can be attributed to excessive smoothing between modes of the data distribution. However, semantic interpolations are often desirable and contribute to generation diversity, so a more nuanced solution is required. In this work, we introduce Dynamic Guidance, which mitigates hallucinations by selectively sharpening the score function only along pre-determined directions known to cause artifacts, while preserving valid semantic variations. To our knowledge, this is the first approach that addresses hallucinations at generation time rather than through post-hoc filtering. Dynamic Guidance substantially reduces hallucinations on both controlled and natural image datasets, significantly outperforming baseline methods.
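The core idea of direction-selective score sharpening can be illustrated with a minimal sketch. The paper itself does not specify this interface; the function name `dynamic_guidance`, the orthonormal `directions` matrix, and the sharpening factor `gamma` below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def dynamic_guidance(score, directions, gamma=2.0):
    """Sharpen a score estimate along artifact-prone directions (illustrative sketch).

    score:      (d,) score estimate (gradient of log-density) at the current sample
    directions: (k, d) orthonormal unit vectors along which hallucinations are
                assumed to arise (the "pre-determined" directions in the abstract)
    gamma:      > 1 amplifies the score component within that subspace, steepening
                the log-density there to discourage inter-modal smoothing, while
                components orthogonal to it are untouched, preserving diversity.
    """
    # Project the score onto the artifact-sensitive subspace.
    coeffs = directions @ score            # (k,) coordinates in the subspace
    parallel = directions.T @ coeffs       # (d,) component lying in the subspace
    # Amplify only the parallel component; the orthogonal part passes through.
    return score + (gamma - 1.0) * parallel

# Example: sharpen only along the first axis of a 2-D score.
score = np.array([1.0, 1.0])
directions = np.array([[1.0, 0.0]])
print(dynamic_guidance(score, directions, gamma=3.0))  # → [3. 1.]
```

In an actual sampler, this adjustment would be applied to the model's score estimate at each denoising step before the update rule, so that hallucination is regulated at generation time rather than filtered afterwards.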