Filtered-Guided Diffusion: Fast Filter Guidance for Black-Box Diffusion Models

📅 2023-06-29

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Recent diffusion-based approaches to zero-shot image-to-image (I2I) translation typically depend on additional training or architecture-specific adjustments. This paper proposes Filtered-Guided Diffusion (FGD), a training-free, black-box guidance method: a lightweight, adaptive filter is applied to the input of each diffusion step, based on the output of the previous step, so the method needs no access to the model's internal features and works across architectures and samplers. Key contributions include: (i) architecture- and sampler-agnostic filter guidance; (ii) more continuous adjustment of guidance strength than other approaches; and (iii) an architecture-independent perspective on the role of self-attention in recent I2I methods. The method requires no architectural modification or additional training. The authors report that it is a fast, strong baseline that is competitive with recent architecture-dependent approaches at negligible added cost, and that it can be added to other state-of-the-art I2I methods to strengthen their structural guidance.

📝 Abstract

Recent advances in diffusion-based generative models have shown incredible promise for image-to-image (I2I) translation and editing. Most recent work in this space relies on additional training or architecture-specific adjustments to the diffusion process. In this work, we show that much of this low-level control can be achieved without additional training or any access to features of the diffusion model. Our method simply applies a filter to the input of each diffusion step based on the output of the previous step in an adaptive manner. Notably, this approach does not depend on any specific architecture or sampler and can be done without access to internal features of the network, making it easy to combine with other techniques, samplers, and diffusion architectures. Furthermore, it has negligible cost to performance, and allows for more continuous adjustment of guidance strength than other approaches. We show that Filtered-Guided Diffusion (FGD) offers a fast and strong baseline that is competitive with recent architecture-dependent approaches. Furthermore, FGD can also be used as a simple add-on to enhance the structural guidance of other state-of-the-art I2I methods. Finally, our derivation of this method helps to understand the impact of self-attention, a key component of other recent architecture-specific I2I approaches, in a more architecture-independent way. Project page: https://github.com/jaclyngu/FilteredGuidedDiffusion
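
The abstract's core idea (filter the input of each diffusion step based on the previous step's output, treating the model as a black box) can be illustrated with a minimal sketch. This is not the authors' implementation: the box blur is a stand-in for whatever filter the paper uses, and `box_blur`, `filter_guided_sampling`, `step_fn`, `strength`, and `radius` are hypothetical names chosen for this example. The loop assumes only that the sampler exposes a callable taking the current state and step index.

```python
import numpy as np

def box_blur(img, radius):
    """Separable box low-pass filter with edge padding (a stand-in filter)."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    # Filter along rows, then along columns; "valid" undoes the padding.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def filter_guided_sampling(step_fn, x_init, guide, num_steps,
                           strength=0.5, radius=2):
    """Run a black-box sampler, nudging each step's input toward the guide.

    step_fn(x, t) is any opaque diffusion step (hypothetical signature).
    The correction is recomputed from the previous output at every step,
    which is the adaptive part; strength scales it continuously.
    """
    guide_low = box_blur(guide, radius)
    x = x_init
    for t in range(num_steps):
        # Difference between filtered guide and filtered previous output.
        correction = guide_low - box_blur(x, radius)
        x = step_fn(x + strength * correction, t)
    return x
```

With `strength=0.0` the loop reduces to the unmodified sampler, and larger values pull the filtered structure of the result closer to the guide, which is one way to picture the continuous guidance strength the abstract mentions. Because only the step's input is touched, the same wrapper applies to any architecture or sampler that fits the assumed `step_fn` interface.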

Problem

Research questions and friction points this paper is trying to address.

Overcoming high computational costs in diffusion-based image generation methods

Addressing limited output diversity from deterministic sampling approaches

Enhancing control over guidance strength and frequency in image translation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses fast filtering operations for diffusion guidance

Works with non-deterministic samplers for variety

Allows continuous adjustment of guidance strength at negligible added cost

🔎 Similar Papers

Learning Diffusion Priors from Observations by Expectation Maximization