Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation

📅 2026-02-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing dance generation methods train only at the skeletal level and neglect the physical constraints imposed by the human body mesh, which often produces self-penetration and implausible foot-ground contact that compromise realism. To mitigate these issues, the authors propose a reinforcement learning fine-tuning framework that derives physics-based reward signals from the human mesh. The framework combines an imitation reward with foot-ground deviation (FGD) and anti-freezing rewards to fine-tune a diffusion model, and applies FGD guidance at inference time. Evaluated on multiple dance datasets, the approach improves the physical plausibility of generated motions, reducing self-penetration and foot skating while preserving motion dynamics, and yields more realistic and visually convincing dance sequences.

📝 Abstract
Despite advances in dance generation, most methods are trained in the skeletal domain and ignore mesh-level physical constraints. As a result, motions that look plausible as joint trajectories often exhibit body self-penetration and Foot-Ground Contact (FGC) anomalies when visualized with a human body mesh, reducing the aesthetic appeal of generated dances and limiting their real-world applications. We address this skeleton-to-mesh gap by deriving physics-based rewards from the body mesh and applying Reinforcement Learning Fine-Tuning (RLFT) to steer the diffusion model toward physically plausible motion synthesis under mesh visualization. Our reward design combines (i) an imitation reward that measures a motion's general plausibility by its imitability in a physical simulator (penalizing penetration and foot skating), and (ii) a Foot-Ground Deviation (FGD) reward with test-time FGD guidance to better capture the dynamic foot-ground interaction in dance. However, we find that the physics-based rewards tend to push the model toward freezing motions, since such motions have fewer physical anomalies and better imitability. To mitigate this, we propose an anti-freezing reward that preserves motion dynamics while maintaining physical plausibility. Experiments on multiple dance datasets consistently demonstrate that our method can significantly improve the physical plausibility of generated motions, yielding more realistic and aesthetically pleasing dances. The project page is available at: https://jjd1123.github.io/Skeleton2Stage/
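To make the reward composition described in the abstract more concrete, here is a minimal Python sketch of how three such terms might be combined into one scalar reward. Everything in it is an illustrative assumption rather than the paper's formulation: the names `fgd_proxy`, `mean_speed`, and `combined_reward`, the weights, `sigma`, and `target_speed` are hypothetical, the imitation reward is taken as a precomputed number (the paper obtains it from a physical simulator), and the foot-ground term is a simple height-based stand-in for the actual FGD definition.

```python
import math
from typing import Sequence

# One frame is a list of (x, y, z) points; y is assumed to be the up axis.
Frame = Sequence[Sequence[float]]


def fgd_proxy(foot_frames: Sequence[Frame], ground_y: float = 0.0) -> float:
    """Hypothetical stand-in for foot-ground deviation: the mean distance
    between the lowest foot point in each frame and the ground plane."""
    devs = [abs(min(p[1] for p in frame) - ground_y) for frame in foot_frames]
    return sum(devs) / len(devs)


def mean_speed(joint_frames: Sequence[Frame], fps: float = 30.0) -> float:
    """Average joint speed over the clip, in units per second."""
    total, count = 0.0, 0
    for prev, cur in zip(joint_frames, joint_frames[1:]):
        for a, b in zip(prev, cur):
            total += math.dist(a, b) * fps
            count += 1
    return total / count if count else 0.0


def combined_reward(
    r_imit: float,
    foot_frames: Sequence[Frame],
    joint_frames: Sequence[Frame],
    w_imit: float = 1.0,
    w_fgd: float = 1.0,
    w_af: float = 1.0,
    sigma: float = 0.05,
    target_speed: float = 0.5,
) -> float:
    """Weighted sum of an imitation term, a foot-ground term, and an
    anti-freezing term (all weights and scales are assumptions)."""
    r_fgd = math.exp(-fgd_proxy(foot_frames) / sigma)          # 1.0 when feet touch the ground
    r_af = min(mean_speed(joint_frames) / target_speed, 1.0)   # 0.0 for a frozen clip
    return w_imit * r_imit + w_fgd * r_fgd + w_af * r_af


# Feet stay on the ground in both clips; only the joint motion differs.
feet = [[(0.0, 0.0, 0.0)]] * 3
frozen = [[(0.0, 1.0, 0.0)]] * 3
moving = [[(0.0, 1.0, 0.0)], [(0.1, 1.0, 0.0)], [(0.2, 1.0, 0.0)]]
print(combined_reward(1.0, feet, frozen), combined_reward(1.0, feet, moving))
```

In this toy setup a frozen clip with perfect foot contact scores lower than a moving clip with the same contact, which is the kind of pressure an anti-freezing term is meant to apply against the degenerate static solutions the abstract describes.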
Problem

Research questions and friction points this paper is trying to address.

physically plausible dance generation
skeleton-to-mesh gap
body self-penetration
Foot-Ground Contact (FGC) anomalies
motion realism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning Fine-Tuning
Physically Plausible Motion
Foot-Ground Contact
Anti-Freezing Reward
Mesh-Level Constraints
Authors
Jidong Jia
School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China
Youjian Zhang
The University of Sydney
computer vision, image processing
Huan Fu
Youku, Alibaba, Beijing 100124, China
Dacheng Tao
Nanyang Technological University
artificial intelligence, machine learning, computer vision, image processing, data mining