Training-Free Multi-Concept LoRA Composition with Prompt-Aware Weighting

📅 2026-06-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the degraded image quality and weak identity fidelity that multi-concept LoRA composition commonly suffers from because of interference among concepts. To mitigate this, the authors propose a training-free, prompt-aware weighting mechanism that fuses the outputs of multiple LoRA modules at inference time, and introduce two strategies built on it: W-Switch and W-Composite. They also establish an evaluation framework based on automatic image segmentation that combines semantic analysis, similarity metrics, and LLM-assisted assessment to measure compositional performance. Experiments on the ComposLoRA benchmark show consistent improvements over existing approaches in visual quality, identity preservation, and compositional capability.
📝 Abstract
Low-Rank Adaptation (LoRA) successfully enables personalization in text-to-image generation by adapting pre-trained diffusion models to specific visual concepts and styles. However, extending such models to multi-concept customization remains challenging. Naively combining multiple LoRA weights or their outputs often leads to interference among concepts, resulting in degraded visual quality and reduced fidelity to the reference images of individual concepts. This paper proposes a simple yet effective approach for multi-concept customization by optimally combining the outputs of multiple LoRA modules. We leverage the relative importance of each concept during generation, as inferred from its corresponding prompt tokens, and introduce two methods, W-Switch and W-Composite, that employ a prompt-aware importance weighting strategy in which each LoRA is weighted according to the semantic influence of its trigger words in the target prompt. In addition, we extend existing quantitative evaluation metrics by proposing a new image-based similarity evaluation framework that assesses image fidelity and identity preservation by comparing real-world reference images with automatically segmented concept regions from generated images. We evaluate our approach on the ComposLoRA testbed and demonstrate consistent improvements over existing state-of-the-art methods in terms of visual quality, identity preservation, and compositionality. Qualitative evaluations, including a Large Language Model (LLM)-based assessment and a user study, further validate the effectiveness of the proposed methods and align with the newly introduced quantitative image-based metrics. Our code is available at https://github.com/GeorgeTsoumplekas/Prompt-Aware-Multi-LoRA-Composition.
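The abstract describes weighting each LoRA by the influence of its trigger words, but it does not spell out the formulas. The following is only a rough illustrative sketch of the general idea, not the authors' implementation: it assumes a per-LoRA importance score is already available (how the paper derives it from prompt tokens is not stated here), and the function names `normalize_weights`, `weighted_composite`, and `weighted_switch_schedule` are hypothetical. One function mixes per-LoRA outputs with a weighted sum, loosely in the spirit of a composite strategy; the other assigns denoising steps to LoRAs in proportion to their weights, loosely in the spirit of a switch strategy.

```python
from typing import Dict, List, Sequence


def normalize_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn non-negative importance scores into weights that sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        # Assumption: fall back to uniform weights when no score is informative.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: score / total for name, score in scores.items()}


def weighted_composite(
    outputs: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> List[float]:
    """Weighted sum of per-LoRA outputs (flattened to plain vectors here)."""
    length = len(next(iter(outputs.values())))
    combined = [0.0] * length
    for name, vector in outputs.items():
        for i, value in enumerate(vector):
            combined[i] += weights[name] * value
    return combined


def weighted_switch_schedule(weights: Dict[str, float], num_steps: int) -> List[str]:
    """Pick one LoRA per step so that step counts track the weights.

    Uses a smooth weighted round-robin: each LoRA accrues credit equal to its
    weight every step, and the LoRA with the most credit is chosen and charged 1.
    """
    credit = {name: 0.0 for name in weights}
    schedule: List[str] = []
    for _ in range(num_steps):
        for name in credit:
            credit[name] += weights[name]
        chosen = max(credit, key=credit.get)
        credit[chosen] -= 1.0
        schedule.append(chosen)
    return schedule


if __name__ == "__main__":
    # Hypothetical scores: the character concept matters more than the style.
    weights = normalize_weights({"character": 3.0, "style": 1.0})
    print(weights)
    print(weighted_composite({"character": [1.0, 0.0], "style": [0.0, 1.0]}, weights))
    print(weighted_switch_schedule(weights, num_steps=8))
```

In a real diffusion pipeline the vectors would be per-step noise predictions (or guidance terms) from the base model with each LoRA active, and the schedule would decide which LoRA is loaded at each denoising step; both details are assumptions here and should be checked against the linked repository.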
Problem

Research questions and friction points this paper is trying to address.

multi-concept customization
LoRA composition
concept interference
text-to-image generation
visual fidelity
Innovation

Methods, ideas, or system contributions that make the work stand out.

LoRA composition
prompt-aware weighting
multi-concept customization
training-free adaptation
image-based evaluation