🤖 AI Summary
End-to-end regression models for 3D human motion prediction often suffer from weak dynamic modeling, motion distortion, or near-static predictions, primarily because they rely on superficial appearance features rather than underlying motion structure. To address this, we propose a self-supervised two-stage framework: (1) pretraining via self-reconstruction of past sequences and guided reconstruction of future sequences conditioned on past motion, and (2) fine-tuning on the downstream motion forecasting task. Our approach introduces a velocity-aware masking strategy that prioritizes highly dynamic joints and explicitly models the temporal guidance relationship between historical motion and future predictions. Evaluated on Human3.6M, 3DPW, and AMASS, our method reduces average prediction error by 8.8% relative to state-of-the-art methods, demonstrating superior accuracy and dynamic fidelity.
📝 Abstract
3D skeleton-based human motion prediction is a significant challenge in computer vision, hinging on the effective representation of motion. In this paper, we propose a self-supervised learning framework designed to enhance motion representation. The framework consists of two stages: first, the network is pretrained through self-reconstruction of past sequences and guided reconstruction of future sequences based on past motion, using a velocity-based masking strategy we design to focus on joints with large-scale movement; the pretrained network is then fine-tuned for the specific task. Self-reconstruction guided by patterns of past motion not only substantially improves the model's ability to represent the spatiotemporal relationships among joints but also captures the latent relationships between past and future sequences. This capability is crucial for motion prediction tasks that rely solely on historical motion data. With this straightforward yet effective training paradigm, our method outperforms existing state-of-the-art methods, reducing average prediction errors by 8.8% across the Human3.6M, 3DPW, and AMASS datasets. The code is available at https://github.com/JunyuShi02/PMG-MRL.
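To make the velocity-based masking idea concrete, the following is a minimal NumPy sketch (not the authors' implementation; the function name, mask ratio, and sampling scheme are assumptions). It estimates each joint's average speed from frame-to-frame displacements and samples joints to mask with probability proportional to that speed, so highly dynamic joints are masked more often during pretraining.

```python
import numpy as np

def velocity_aware_mask(seq, mask_ratio=0.3, rng=None):
    """Select joints to mask, biased toward high-velocity joints (illustrative sketch).

    seq: array of shape (T, J, 3) -- a 3D skeleton sequence of T frames, J joints.
    Returns a boolean mask of shape (J,), where True marks a masked joint.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, J, _ = seq.shape
    # Per-joint average speed: mean L2 norm of frame-to-frame displacement.
    speed = np.linalg.norm(np.diff(seq, axis=0), axis=-1).mean(axis=0)  # shape (J,)
    # Convert speeds to sampling probabilities (uniform fallback for a static sequence).
    probs = speed / speed.sum() if speed.sum() > 0 else np.full(J, 1.0 / J)
    n_mask = max(1, int(round(mask_ratio * J)))
    masked_idx = rng.choice(J, size=n_mask, replace=False, p=probs)
    mask = np.zeros(J, dtype=bool)
    mask[masked_idx] = True
    return mask
```

A softened variant (e.g. mixing the velocity distribution with a uniform one) would keep some probability mass on static joints; the proportional form above is only meant to show the bias toward large-scale movement.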