SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion

📅 2024-05-05
🏛️ arXiv.org
📈 Citations: 6
✨ Influential: 0
🤖 AI Summary
Existing motion style transfer methods predominantly adopt dual-stream architectures, which often neglect the intrinsic correlations between content and style motions, leading to information loss, misalignment, and insufficient modeling of long-range temporal dependencies, thereby producing distorted and incoherent motions. To address these limitations, we propose SMCD, the first diffusion-based framework conditioned on style motion. We introduce the novel Motion Style Mamba (MSM) module to efficiently capture long-term sequential motion dependencies, and design a dual-content consistency loss to enhance generation stability. Our approach enables disentangled motion feature representation, conditional reconstruction, and high-fidelity style transfer. Extensive qualitative and quantitative evaluations demonstrate that SMCD consistently outperforms state-of-the-art methods, achieving significant improvements in motion realism, temporal coherence, and style fidelity. The method effectively enriches motion diversity and naturalness for virtual human avatars.

📝 Abstract
Motion style transfer is a significant research direction in multimedia applications. It enables rapid switching between different styles of the same motion for virtual digital humans, vastly increasing the diversity and realism of movements, and is widely applied in multimedia scenarios such as movies, games, and the Metaverse. However, most current work in this field adopts GANs, which can suffer from instability and convergence issues, making the generated motion sequences somewhat chaotic and unable to reflect a highly realistic and natural style. To address these problems, we treat style motion as a condition and propose the Style Motion Conditioned Diffusion (SMCD) framework for the first time, which can more comprehensively learn the style features of motion. Moreover, we apply the Mamba model for the first time in the motion style transfer field, introducing the Motion Style Mamba (MSM) module to handle longer motion sequences. Third, tailored to the SMCD framework, we propose a Diffusion-based Content Consistency Loss and a Content Consistency Loss to assist the overall framework's training. Finally, we conduct extensive experiments. The results show that our method surpasses state-of-the-art methods in both qualitative and quantitative comparisons and generates more realistic motion sequences.
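To make the conditioning idea concrete, here is a minimal NumPy sketch of a reverse-diffusion (DDPM-style) sampling loop in which every denoising step receives the style motion as a condition. The `denoiser` function is a toy stand-in for the paper's MSM denoiser, and all shapes, schedule values, and names are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: T frames, D pose features per frame.
T, D = 60, 32
style_motion = rng.standard_normal((T, D))

# Linear DDPM-style noise schedule (illustrative values only).
steps = 100
betas = np.linspace(1e-4, 2e-2, steps)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def denoiser(x_t, t, style):
    """Toy stand-in for the paper's MSM denoiser: a fixed linear
    blend of the noisy motion and the style condition."""
    return 0.9 * x_t + 0.1 * style

def ddpm_step(x_t, t, style):
    """One reverse-diffusion step, conditioned on the style motion."""
    eps_hat = denoiser(x_t, t, style)               # predicted noise
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (x_t - coef * eps_hat) / np.sqrt(alphas[t])
    if t > 0:                                       # add noise except at t = 0
        mean = mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)
    return mean

# Reverse process: start from Gaussian noise, denoise step by step.
x = rng.standard_normal((T, D))
for t in reversed(range(steps)):
    x = ddpm_step(x, t, style_motion)

print(x.shape)  # (60, 32)
```

The point of the sketch is only the control flow: the style motion enters every denoising step as a condition, rather than being fused once in a separate stream.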
Problem

Research questions and friction points this paper is trying to address.

Capturing the intrinsic correlations between content and style motions
Improving temporal dependency learning in long motion sequences
Enhancing realism and coherence in motion style transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

Style Motion Conditioned Diffusion (SMCD) framework for style transfer
Motion Style Mamba denoiser for sequence modeling
Diffusion-based consistency losses for realistic transfer
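The Mamba motivation above rests on linear state-space recurrences scaling linearly with sequence length, unlike quadratic self-attention. The sketch below runs a plain (non-selective) SSM scan over a toy motion sequence; every name, matrix, and size here is a hypothetical illustration, not the actual MSM design.

```python
import numpy as np

# Toy linear state-space scan over a motion sequence, showing the kind
# of O(T) recurrence that Mamba-style models build on.
T, D, N = 200, 16, 8  # frames, feature dim, hidden state dim (assumed)

rng = np.random.default_rng(1)
x = rng.standard_normal((T, D))        # motion feature sequence

A = 0.95 * np.eye(N)                   # state transition (stable: |eig| < 1)
B = rng.standard_normal((N, D)) * 0.1  # input projection
C = rng.standard_normal((D, N)) * 0.1  # output projection

def ssm_scan(x, A, B, C):
    """h_t = A h_{t-1} + B x_t ;  y_t = C h_t  -- one pass, linear in T."""
    h = np.zeros(A.shape[0])
    ys = np.empty_like(x)
    for t in range(x.shape[0]):
        h = A @ h + B @ x[t]           # carry long-range context in h
        ys[t] = C @ h
    return ys

y = ssm_scan(x, A, B, C)
print(y.shape)  # (200, 16)
```

Because the hidden state `h` summarizes all past frames, cost grows linearly with sequence length, which is why such recurrences are attractive for the long motion sequences the paper targets.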
Ziyun Qian
Academy for Engineering and Technology, Fudan University, Shanghai, China
Zeyu Xiao
Academy for Engineering and Technology, Fudan University, Shanghai, China
Zhenyi Wu
Academy for Engineering and Technology, Fudan University, Shanghai, China
Dingkang Yang
ByteDance
Multimodal Learning, Generative AI, Embodied AI
Mingcheng Li
Fudan University
Shunli Wang
Academy for Engineering and Technology, Fudan University, Shanghai, China
Shuai Wang
Academy for Engineering and Technology, Fudan University, Shanghai, China
Dongliang Kou
Academy for Engineering and Technology, Fudan University, Shanghai, China
Lihua Zhang
Wuhan University
computational biology, bioinformatics, data mining