RHanDS: Refining Malformed Hands for Generated Images with Decoupled Structure and Style Guidance

📅 2024-04-22
🏛️ AAAI Conference on Artificial Intelligence
📈 Citations: 4
Influential: 0
🤖 AI Summary
Diffusion models often generate hands with malformed structure, which limits their use for human image generation. RHanDS is a conditional diffusion framework that refines such hands with decoupled guidance: a hand mesh reconstructed from the malformed hand supplies structure guidance, while the malformed hand itself supplies style guidance. To reduce interference between the two signals, the authors use a two-stage training strategy, first training for stylistic consistency on paired hand images and then training for structural control on hand images generated from human meshes, and they build a series of multi-style hand datasets. Experiments indicate that RHanDS corrects hand structure while keeping hand style consistent.

📝 Abstract
Although diffusion models can generate high-quality human images, their applications are limited by the instability in generating hands with correct structures. In this paper, we introduce RHanDS, a conditional diffusion-based framework designed to refine malformed hands by utilizing decoupled structure and style guidance. The hand mesh reconstructed from the malformed hand offers structure guidance for correcting the structure of the hand, while the malformed hand itself provides style guidance for preserving the style of the hand. To alleviate the mutual interference between style and structure guidance, we introduce a two-stage training strategy and build a series of multi-style hand datasets. In the first stage, we use paired hand images for training to ensure stylistic consistency in hand refining. In the second stage, various hand images generated based on human meshes are used for training, enabling the model to gain control over the hand structure. Experimental results demonstrate that RHanDS can effectively refine hand structure while preserving consistency in hand style.
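
The two-stage data routing described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `TrainingSample` class, the `make_sample` function, and the record keys (`hand_a`, `hand_b`, `generated_hand`, `mesh`) are all hypothetical names. The abstract also does not say whether the other guidance signal is fed in during each stage, so the sketch simply leaves it empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingSample:
    """One hypothetical training example; field names are illustrative only."""
    target_hand: str                 # image the denoiser learns to reconstruct
    style_reference: Optional[str]   # hand image supplying style guidance
    structure_mesh: Optional[str]    # hand mesh supplying structure guidance


def make_sample(stage: int, record: dict) -> TrainingSample:
    """Route a dataset record to the guidance signal trained in the given stage."""
    if stage == 1:
        # Stage 1 (per the abstract): paired hand images, trains style consistency.
        return TrainingSample(
            target_hand=record["hand_b"],
            style_reference=record["hand_a"],
            structure_mesh=None,  # assumption: structure guidance unused here
        )
    if stage == 2:
        # Stage 2 (per the abstract): images generated from human meshes,
        # trains control over hand structure.
        return TrainingSample(
            target_hand=record["generated_hand"],
            style_reference=None,  # assumption: style guidance unused here
            structure_mesh=record["mesh"],
        )
    raise ValueError(f"unknown stage: {stage}")


stage1 = make_sample(1, {"hand_a": "a.png", "hand_b": "b.png"})
stage2 = make_sample(2, {"generated_hand": "g.png", "mesh": "m.obj"})
```

At refinement time the abstract has both signals coming from the same malformed hand: the reconstructed mesh as structure guidance and the malformed hand image as style guidance.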
Problem

Research questions and friction points this paper is trying to address.

Correct malformed hand structures in generated images
Preserve hand style during structural refinement
Decouple structure and style guidance to avoid interference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoupled structure and style guidance
Two-stage training strategy
Multi-style hand datasets
Authors

Chengrui Wang, Alibaba Group (Computer Vision)
Pengfei Liu, Xiamen University, Xiamen, China
Min Zhou, Alibaba Group, Beijing, China
Ming Zeng, Xiamen University, Xiamen, China
Xubin Li, Alibaba Group, Beijing, China
Tiezheng Ge, Senior staff algorithm engineer, Alimama, Alibaba Group (Computer Vision, AIGC, Recommender Systems)
Bo Zheng, Alibaba Group, Beijing, China