๐ค AI Summary
This work addresses two key challenges in music-driven dance generation: inaccurate rhythmic alignment and unnatural motion dynamics. To this end, we propose PhaseFormer, a novel framework featuring three core technical contributions. First, we introduce a Phase-Driven Rhythmic Extraction (PRE) moduleโthe first of its kindโto achieve high-precision beat perception via phase-aware temporal modeling. Second, we design a Temporal Gated Causal Attention (TGCA) mechanism to strengthen long-range temporal modeling while explicitly decoupling upper- and lower-body motion patterns. Third, we propose Parallel Mamba-based Motion Modeling (PMMM), which improves efficiency in capturing long-range dependencies and enhances motion diversity. Extensive evaluations on multiple benchmarks demonstrate significant improvements over state-of-the-art methods: a 32% reduction in rhythmic alignment error, a 27% increase in motion diversity, and superior overall performance in synchronization accuracy, motion naturalness, and stylistic expressiveness.
๐ Abstract
Automatically generating natural, diverse and rhythmic human dance movements driven by music is vital for virtual reality and film industries. However, generating dance that naturally follows music remains a challenge, as existing methods lack proper beat alignment and exhibit unnatural motion dynamics. In this paper, we propose Danceba, a novel framework that leverages gating mechanism to enhance rhythm-aware feature representation for music-driven dance generation, which achieves highly aligned dance poses with enhanced rhythmic sensitivity. Specifically, we introduce Phase-Based Rhythm Extraction (PRE) to precisely extract rhythmic information from musical phase data, capitalizing on the intrinsic periodicity and temporal structures of music. Additionally, we propose Temporal-Gated Causal Attention (TGCA) to focus on global rhythmic features, ensuring that dance movements closely follow the musical rhythm. We also introduce Parallel Mamba Motion Modeling (PMMM) architecture to separately model upper and lower body motions along with musical features, thereby improving the naturalness and diversity of generated dance movements. Extensive experiments confirm that Danceba outperforms state-of-the-art methods, achieving significantly better rhythmic alignment and motion diversity. Project page: https://danceba.github.io/ .