MultLFG: Training-free Multi-LoRA composition using Frequency-domain Guidance

📅 2025-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods struggle to fuse multiple LoRA adapters without training, particularly in compositions with many visual elements, where spatial inconsistency and concept distortion are common. MultLFG addresses this with a frequency-domain-guided adaptive fusion framework. Its selection mechanism is aware of both the diffusion timestep and the frequency band, which enables content-aware weighted aggregation of the adapters. The method requires no additional training and no learnable parameters. On the ComposLoRA benchmark, it is reported to outperform state-of-the-art approaches in compositional fidelity, spatial coherence, and image quality across diverse styles and multi-concept generation tasks.

📝 Abstract

Low-Rank Adaptation (LoRA) has gained prominence as a computationally efficient method for fine-tuning generative models, enabling distinct visual concept synthesis with minimal overhead. However, current methods struggle to effectively merge multiple LoRA adapters without training, particularly in complex compositions involving diverse visual elements. We introduce MultLFG, a novel framework for training-free multi-LoRA composition that utilizes frequency-domain guidance to achieve adaptive fusion of multiple LoRAs. Unlike existing methods that uniformly aggregate concept-specific LoRAs, MultLFG employs a timestep and frequency subband adaptive fusion strategy, selectively activating relevant LoRAs based on content relevance at specific timesteps and frequency bands. This frequency-sensitive guidance not only improves spatial coherence but also provides finer control over multi-LoRA composition, leading to more accurate and consistent results. Experimental evaluations on the ComposLoRA benchmark reveal that MultLFG substantially enhances compositional fidelity and image quality across various styles and concept sets, outperforming state-of-the-art baselines in multi-concept generation tasks. Code will be released.
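
To make the idea of frequency-subband fusion concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes each LoRA yields a 2D prediction at the current timestep, splits the Fourier plane into concentric rings, and mixes the predictions with a separate weight per LoRA and per ring. The function names `radial_band_masks` and `fuse_by_band`, the ring-shaped band layout, and the shape of `band_weights` are all illustrative assumptions; the abstract does not specify how the subbands are defined.

```python
import numpy as np

def radial_band_masks(h, w, n_bands):
    """Partition the 2D FFT plane into n_bands concentric rings (low to high)."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy**2 + fx**2)
    edges = np.linspace(0.0, radius.max(), n_bands + 1)
    masks = []
    for b in range(n_bands):
        # The last ring is closed on the right so the highest frequency is kept.
        if b == n_bands - 1:
            upper = radius <= edges[b + 1]
        else:
            upper = radius < edges[b + 1]
        masks.append((radius >= edges[b]) & upper)
    return np.stack(masks).astype(float)  # shape: (n_bands, h, w)

def fuse_by_band(preds, band_weights):
    """Fuse per-LoRA predictions with one weight per (LoRA, band).

    preds:        (n_loras, h, w) real-valued predictions (assumed input).
    band_weights: (n_loras, n_bands); each column should sum to 1.
    """
    _, h, w = preds.shape
    masks = radial_band_masks(h, w, band_weights.shape[1])
    spectra = np.fft.fft2(preds, axes=(-2, -1))
    # Per-LoRA weight map over the frequency plane.
    weight_maps = np.einsum("ib,bhw->ihw", band_weights, masks)
    fused_spectrum = (weight_maps * spectra).sum(axis=0)
    return np.fft.ifft2(fused_spectrum).real

# Usage: three hypothetical LoRA predictions, four bands, uniform weights.
rng = np.random.default_rng(0)
preds = rng.normal(size=(3, 8, 8))
uniform = np.full((3, 4), 1.0 / 3.0)
fused = fuse_by_band(preds, uniform)
```

With uniform weights the sketch reduces to plain averaging of the predictions, which corresponds to the uniform aggregation the abstract contrasts against; non-uniform weights let one LoRA dominate a given band.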

Problem

Research questions and friction points this paper is trying to address.

Effective merging of multiple LoRA adapters without training

Achieving adaptive fusion of LoRAs in frequency domain

Improving multi-LoRA composition fidelity and image quality

Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free multi-LoRA composition framework

Frequency-domain guidance for adaptive fusion

Timestep and frequency subband selective activation
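
Selective activation can be illustrated with a small sketch, again under stated assumptions rather than as the paper's procedure. It assumes a relevance score is available for every LoRA and frequency band at the current timestep, keeps only the `k` highest-scoring LoRAs in each band, and normalizes the surviving scores with a softmax. The function `topk_band_weights`, the `temperature` parameter, and the example scores are hypothetical; how relevance is actually measured is not stated in this summary.

```python
import numpy as np

def topk_band_weights(scores, k, temperature=1.0):
    """Turn relevance scores into sparse per-band fusion weights.

    scores: (n_loras, n_bands) relevance at the current timestep (assumed input).
    Returns weights of the same shape. In each band only the k highest-scoring
    LoRAs are active, and the active weights sum to 1.
    """
    scores = np.asarray(scores, dtype=float)
    n_loras, n_bands = scores.shape
    k = min(k, n_loras)
    # Row indices of the top-k LoRAs for every band (column).
    top = np.argsort(-scores, axis=0)[:k]
    active = np.zeros(scores.shape, dtype=bool)
    active[top, np.arange(n_bands)] = True
    # Masked softmax: inactive entries get -inf and therefore zero weight.
    logits = np.where(active, scores / temperature, -np.inf)
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=0, keepdims=True)

# Usage: 3 LoRAs, 2 bands, made-up scores for a single timestep.
scores_t = np.array([[3.0, 0.1],
                     [1.0, 2.0],
                     [2.0, 0.5]])
weights_t = topk_band_weights(scores_t, k=2)
```

In a sampling loop the scores would be recomputed at every denoising step, so the set of active LoRAs can differ by both timestep and band.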
