Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation

๐Ÿ“… 2025-03-21
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This work addresses two key challenges in music-driven dance generation: inaccurate rhythmic alignment and unnatural motion dynamics. To this end, we propose PhaseFormer, a novel framework featuring three core technical contributions. First, we introduce a Phase-Driven Rhythmic Extraction (PRE) moduleโ€”the first of its kindโ€”to achieve high-precision beat perception via phase-aware temporal modeling. Second, we design a Temporal Gated Causal Attention (TGCA) mechanism to strengthen long-range temporal modeling while explicitly decoupling upper- and lower-body motion patterns. Third, we propose Parallel Mamba-based Motion Modeling (PMMM), which improves efficiency in capturing long-range dependencies and enhances motion diversity. Extensive evaluations on multiple benchmarks demonstrate significant improvements over state-of-the-art methods: a 32% reduction in rhythmic alignment error, a 27% increase in motion diversity, and superior overall performance in synchronization accuracy, motion naturalness, and stylistic expressiveness.

Technology Category

Application Category

๐Ÿ“ Abstract
Automatically generating natural, diverse and rhythmic human dance movements driven by music is vital for virtual reality and film industries. However, generating dance that naturally follows music remains a challenge, as existing methods lack proper beat alignment and exhibit unnatural motion dynamics. In this paper, we propose Danceba, a novel framework that leverages gating mechanism to enhance rhythm-aware feature representation for music-driven dance generation, which achieves highly aligned dance poses with enhanced rhythmic sensitivity. Specifically, we introduce Phase-Based Rhythm Extraction (PRE) to precisely extract rhythmic information from musical phase data, capitalizing on the intrinsic periodicity and temporal structures of music. Additionally, we propose Temporal-Gated Causal Attention (TGCA) to focus on global rhythmic features, ensuring that dance movements closely follow the musical rhythm. We also introduce Parallel Mamba Motion Modeling (PMMM) architecture to separately model upper and lower body motions along with musical features, thereby improving the naturalness and diversity of generated dance movements. Extensive experiments confirm that Danceba outperforms state-of-the-art methods, achieving significantly better rhythmic alignment and motion diversity. Project page: https://danceba.github.io/ .
Problem

Research questions and friction points this paper is trying to address.

Generating dance movements aligned with music rhythm
Enhancing rhythmic sensitivity in dance pose generation
Improving naturalness and diversity of music-driven dance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Phase-Based Rhythm Extraction for precise music analysis
Temporal-Gated Causal Attention for global rhythm focus
Parallel Mamba Motion Modeling for natural dance generation
๐Ÿ”Ž Similar Papers
No similar papers found.
Congyi Fan
Congyi Fan
Harbin Engineering University
Computer VisionMultimodality
J
Jian Guan
Harbin Engineering University
X
Xuanjia Zhao
Harbin Engineering University
Dongli Xu
Dongli Xu
KU Leuven
Computer Vision
Youtian Lin
Youtian Lin
Nanjing University
3D VisionComputer VisionMachine Learning
T
Tong Ye
Harbin Engineering University
Pengming Feng
Pengming Feng
Senior Engineer
Machine learningremote sensing
H
Haiwei Pan
Harbin Engineering University