FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization

📅 2025-07-17

🤖 AI Summary

To address the limited pose and illumination controllability of natural-language-driven personalized virtual try-on in fashion e-commerce, this paper proposes FashionPose, which the authors describe as the first unified text-to-pose-to-relighting generation framework. The method replaces explicit pose annotations with text-driven conditioning: it predicts a 2D human pose from the description, uses a diffusion model to synthesize a high-fidelity image of the dressed person, and applies a lightweight relighting module, with all three stages guided by the same text. The reported experiments show fine-grained pose synthesis, faithful garment rendering, and efficient, consistent relighting, which the authors present as a practical solution for personalized virtual fashion display.

📝 Abstract

Realistic and controllable garment visualization is critical for fashion e-commerce, where users expect personalized previews under diverse poses and lighting conditions. Existing methods often rely on predefined poses, limiting semantic flexibility and illumination adaptability. To address this, we introduce FashionPose, the first unified text-to-pose-to-relighting generation framework. Given a natural language description, our method first predicts a 2D human pose, then employs a diffusion model to generate high-fidelity person images, and finally applies a lightweight relighting module, all guided by the same textual input. By replacing explicit pose annotations with text-driven conditioning, FashionPose enables accurate pose alignment, faithful garment rendering, and flexible lighting control. Experiments demonstrate fine-grained pose synthesis and efficient, consistent relighting, providing a practical solution for personalized virtual fashion display.
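The three-stage flow described in the abstract can be sketched as a simple function composition in which every stage receives the same text prompt. This is only an illustrative outline, not the authors' implementation: the names `run_pipeline`, `Pose2D`, `Image`, and the `dummy_*` stand-ins are hypothetical, and the 17-keypoint pose is an assumed convention that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Keypoint = Tuple[float, float]  # (x, y) in normalized image coordinates (assumed)


@dataclass
class Pose2D:
    keypoints: List[Keypoint]


@dataclass
class Image:
    # Placeholder for pixel data; here it only records how the image was produced.
    description: str


def run_pipeline(
    prompt: str,
    text_to_pose: Callable[[str], Pose2D],
    pose_to_image: Callable[[str, Pose2D], Image],
    relight: Callable[[str, Image], Image],
) -> Image:
    """Chain the three stages, passing the same prompt to each one."""
    pose = text_to_pose(prompt)           # stage 1: text -> 2D pose
    person = pose_to_image(prompt, pose)  # stage 2: pose + text -> person image
    return relight(prompt, person)        # stage 3: image + text -> relit image


# Hypothetical stand-ins for the learned models, used only to show the data flow.
def dummy_text_to_pose(prompt: str) -> Pose2D:
    # Assumed skeleton size of 17 keypoints, all placed at the image center.
    return Pose2D(keypoints=[(0.5, 0.5)] * 17)


def dummy_pose_to_image(prompt: str, pose: Pose2D) -> Image:
    return Image(description=f"person with {len(pose.keypoints)} keypoints for '{prompt}'")


def dummy_relight(prompt: str, image: Image) -> Image:
    return Image(description=image.description + " [relit]")


result = run_pipeline(
    "a woman walking in a red dress at sunset",
    dummy_text_to_pose,
    dummy_pose_to_image,
    dummy_relight,
)
print(result.description)
```

Passing the stages in as callables keeps the sketch independent of any particular pose predictor, diffusion backbone, or relighting network; real models would replace the `dummy_*` functions.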

Problem

Research questions and friction points this paper is trying to address.

Generates personalized fashion images from text descriptions

Overcomes limitations of predefined poses and lighting conditions

Unifies pose prediction, image generation, and relighting control

Innovation

Methods, ideas, or system contributions that make the work stand out.

Text-driven 2D human pose prediction

Diffusion model for high-fidelity image generation

Lightweight relighting module for flexible control
