🤖 AI Summary
End-to-end regression models for 3D human motion prediction often suffer from weak dynamic modeling, motion distortion, or near-static predictions, primarily because they rely on superficial appearance features rather than underlying motion structure. To address this, we propose a self-supervised two-stage framework: (1) pretraining via self-reconstruction of past sequences and guided reconstruction of future sequences conditioned on past motion, and (2) fine-tuning on the downstream motion forecasting task. Our approach introduces a velocity-aware masking strategy that prioritizes highly dynamic joints and explicitly models the temporal guidance relationship between historical motion and future predictions. Evaluated on Human3.6M, 3DPW, and AMASS, our method reduces average prediction error by 8.8% relative to state-of-the-art methods, demonstrating superior accuracy and dynamic fidelity.
📝 Abstract
3D skeleton-based human motion prediction is a significant challenge in computer vision, hinging on the effective representation of motion. In this paper, we propose a self-supervised learning framework designed to enhance motion representation. The framework consists of two stages: first, the network is pretrained through self-reconstruction of past sequences and guided reconstruction of future sequences based on past motion, using a velocity-based masking strategy we design to focus on joints with large-scale movement; the pretrained network is then fine-tuned for the specific task. Self-reconstruction guided by patterns of past motion not only substantially improves the model's ability to represent the spatiotemporal relationships among joints but also captures the latent relationships between past and future sequences. This capability is crucial for motion prediction tasks that rely solely on historical motion data. With this straightforward yet effective training paradigm, our method outperforms existing state-of-the-art methods, reducing average prediction errors by 8.8% across the Human3.6M, 3DPW, and AMASS datasets. The code is available at https://github.com/JunyuShi02/PMG-MRL.
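To make the velocity-based masking idea concrete, the following is a minimal NumPy sketch (not the authors' implementation; the function name, mask ratio, and sampling scheme are assumptions). It estimates each joint's average speed from frame-to-frame displacements and samples joints to mask with probability proportional to that speed, so highly dynamic joints are masked more often during pretraining.

```python
import numpy as np

def velocity_aware_mask(seq, mask_ratio=0.3, rng=None):
    """Select joints to mask, biased toward high-velocity joints (illustrative sketch).

    seq: array of shape (T, J, 3) -- a 3D skeleton sequence of T frames, J joints.
    Returns a boolean mask of shape (J,), where True marks a masked joint.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, J, _ = seq.shape
    # Per-joint average speed: mean L2 norm of frame-to-frame displacement.
    speed = np.linalg.norm(np.diff(seq, axis=0), axis=-1).mean(axis=0)  # shape (J,)
    # Convert speeds to sampling probabilities (uniform fallback for a static sequence).
    probs = speed / speed.sum() if speed.sum() > 0 else np.full(J, 1.0 / J)
    n_mask = max(1, int(round(mask_ratio * J)))
    masked_idx = rng.choice(J, size=n_mask, replace=False, p=probs)
    mask = np.zeros(J, dtype=bool)
    mask[masked_idx] = True
    return mask
```

A softened variant (e.g. mixing the velocity distribution with a uniform one) would keep some probability mass on static joints; the proportional form above is only meant to show the bias toward large-scale movement.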