AttnMod: Attention-Based New Art Styles

📅 2024-09-16
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion models struggle to generate artistic styles not explicitly described in text prompts. To address this, we propose a cross-attention intervention method that requires neither prompt modification nor model fine-tuning. By treating the text-guided cross-attention layers in the UNet as editable interfaces, our approach introduces dynamic attention masking and weight remapping to enable fine-grained modulation of attention maps during denoising. This is the first method to achieve intent-driven zero-shot artistic style synthesis—generating novel, previously unparameterized styles such as contour distortion, color diffusion, and material concretization—while preserving semantic fidelity. Unlike prompt engineering or model adaptation paradigms, our technique transcends inherent constraints on stylistic expressivity, offering a lightweight, efficient, and interpretable framework for controllable image generation.
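The core idea, scaling the text-conditioned cross-attention map during denoising, can be illustrated with a minimal NumPy sketch. The function name and the scalar `scale` are illustrative assumptions, not the paper's actual parameterization of attention masking and weight remapping:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def modulated_cross_attention(q, k, v, scale=1.0):
    """Cross-attention with a hypothetical modulation factor `scale`
    applied to the text-conditioned logits before softmax.
    scale > 1 sharpens the text conditioning; scale < 1 flattens it."""
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)          # (n_image_tokens, n_text_tokens)
    weights = softmax(scale * logits)      # modulated attention map
    return weights @ v                     # (n_image_tokens, d_v)

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))   # image-latent queries
k = rng.normal(size=(6, 8))   # text-token keys
v = rng.normal(size=(6, 8))   # text-token values

baseline = modulated_cross_attention(q, k, v, scale=1.0)
boosted = modulated_cross_attention(q, k, v, scale=2.0)
```

In a real diffusion pipeline this modulation would be applied inside the UNet's cross-attention layers at selected denoising steps, leaving the model weights untouched.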

📝 Abstract
Imagine a human artist looking at the generated photo of a diffusion model, and hoping to create a painting out of it. There could be some feature of the object in the photo that the artist wants to emphasize, some color to disperse, some silhouette to twist, or some part of the scene to be materialized. These intentions can be viewed as the modification of the cross attention from the text prompt onto UNet, during the denoising diffusion. This work presents AttnMod, to modify attention for creating new unpromptable art styles out of existing diffusion models. The style-creating behavior is studied across different setups.
Problem

Research questions and friction points this paper is trying to address.

How can diffusion models produce art styles that cannot be expressed in a text prompt?
How can text-prompt conditioning be altered during denoising without retraining?
How can stylistic control be gained without prompt engineering or fine-tuning?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free modulation of the UNet's text-guided cross-attention (AttnMod)
Alters prompt conditioning directly in the attention maps during denoising
Yields diverse, previously unpromptable styles without model retraining
Shih-Chieh Su