CharCom: Composable Identity Control for Multi-Character Story Illustration

📅 2025-10-11
🤖 AI Summary
Multi-character identity inconsistency remains a critical challenge in text-to-image generation, particularly in complex narrative scenes. Method: the paper proposes a composable, LoRA-based character control framework that freezes the base diffusion model and attaches a modular, parameter-efficient LoRA adapter per character. Prompt-aware dynamic scheduling and feature fusion allow character controllers to be composed on the fly at inference time, without fine-tuning the foundation model. Contribution/Results: to the authors' knowledge, this is the first approach to jointly achieve semantic alignment, temporal coherence, and scalable multi-character personalization. Experiments show significant improvements in character consistency, semantic fidelity, and cross-image coherence, especially in crowded or intricate scenes, while maintaining high visual quality and computational efficiency, indicating strong potential for practical deployment.

📝 Abstract
Ensuring character identity consistency across varying prompts remains a fundamental limitation in diffusion-based text-to-image generation. We propose CharCom, a modular and parameter-efficient framework that achieves character-consistent story illustration through composable LoRA adapters, enabling efficient per-character customization without retraining the base model. Built on a frozen diffusion backbone, CharCom dynamically composes adapters at inference using prompt-aware control. Experiments on multi-scene narratives demonstrate that CharCom significantly enhances character fidelity, semantic alignment, and temporal coherence. It remains robust in crowded scenes and enables scalable multi-character generation with minimal overhead, making it well-suited for real-world applications such as story illustration and animation.
Problem

Research questions and friction points this paper is trying to address.

Ensuring character identity consistency in diffusion-based text-to-image generation
Achieving character-consistent story illustration without retraining base models
Enabling scalable multi-character generation with minimal computational overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular framework using composable LoRA adapters
Dynamic adapter composition with prompt-aware control
Enables scalable multi-character generation without retraining
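The composable-adapter idea above can be illustrated with a toy sketch: a frozen base weight, one low-rank LoRA delta per character, and a scheduler that activates adapters whose character names appear in the prompt. This is an illustrative reconstruction under assumed simplifications (plain numpy linear layers, substring matching for scheduling), not the authors' implementation; all class and function names here are hypothetical.

```python
import numpy as np

class LoRAAdapter:
    """One character's low-rank update: delta_W = alpha * (B @ A)."""
    def __init__(self, d_out, d_in, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Standard LoRA init: A small random, B zero, so delta starts at 0.
        self.A = rng.normal(scale=0.01, size=(rank, d_in))
        self.B = np.zeros((d_out, rank))
        self.alpha = alpha

    def delta(self):
        return self.alpha * (self.B @ self.A)

class ComposableLinear:
    """A frozen base weight plus adapters composed at inference time."""
    def __init__(self, W_base):
        self.W = W_base        # frozen; never updated
        self.adapters = {}     # character name -> LoRAAdapter

    def register(self, name, adapter):
        self.adapters[name] = adapter

    def forward(self, x, active, weights=None):
        # Sum the deltas of only the active adapters, scaled per character.
        weights = weights or {name: 1.0 for name in active}
        W_eff = self.W.copy()
        for name in active:
            W_eff += weights[name] * self.adapters[name].delta()
        return W_eff @ x

def schedule_from_prompt(prompt, known_characters):
    """Toy prompt-aware scheduling: activate an adapter iff its
    character's name occurs in the prompt."""
    return [c for c in known_characters if c.lower() in prompt.lower()]

# Usage: two of three registered characters appear in the prompt,
# so only their adapters are composed into this layer.
layer = ComposableLinear(np.eye(4))
for name in ["Alice", "Bob", "Carol"]:
    layer.register(name, LoRAAdapter(4, 4, rank=2))
active = schedule_from_prompt("Alice and Bob walk home", ["Alice", "Bob", "Carol"])
y = layer.forward(np.ones(4), active)
```

Because the base model stays frozen and each delta is rank-limited, adding a new character only means training and registering one small adapter, which is what makes the composition scalable.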
Authors
Zhongsheng Wang, University of Auckland (Large Language Models, AI Agents)
Ming Lin, University of Auckland, Auckland, New Zealand
Zhedong Lin, University of Auckland, Auckland, New Zealand
Yaser Shakib, Bedaia.ai, Auckland, New Zealand
Qian Liu, University of Auckland, Auckland, New Zealand
Jiamou Liu, The University of Auckland (Social Networks, Artificial Intelligence, Machine Learning)