Autoregressive Flow Matching for Motion Prediction

📅 2025-12-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing motion prediction models are trained on narrow data distributions, which limits their ability to model complex dynamics and to generalize across scenarios and long horizons. To address this, we propose Autoregressive Flow Matching (ARFM), a new method for probabilistic modeling of sequential continuous data, and use it to predict future point tracks of humans and robots over long horizons. ARFM combines autoregressive temporal modeling with training on diverse video datasets, and its predicted tracks can be used to condition downstream models. Evaluated on a newly constructed human/robot motion prediction benchmark, ARFM achieves significant improvements: +23.6% in L2 trajectory fidelity for long-horizon generation and +8.4% in action classification accuracy. These results indicate that ARFM addresses key bottlenecks in modeling complex dynamics and in cross-scenario generalization.

📝 Abstract
Motion prediction has been studied in different contexts, with models trained on narrow distributions and applied to downstream tasks in human motion prediction and robotics. Simultaneously, recent efforts in scaling video prediction have demonstrated impressive visual realism, yet these models struggle to accurately model complex motions despite their massive scale. Inspired by the scaling of video generation, we develop autoregressive flow matching (ARFM), a new method for probabilistic modeling of sequential continuous data, and train it on diverse video datasets to generate future point track locations over long horizons. To evaluate our model, we develop benchmarks that measure how well motion prediction models predict human and robot motion. Our model is able to predict complex motions, and we demonstrate that conditioning robot action prediction and human motion prediction on predicted future tracks can significantly improve downstream task performance. Code and models are publicly available at https://github.com/Johnathan-Xie/arfm-motion-prediction

Problem

Research questions and friction points this paper is trying to address.

Motion prediction models are typically trained on narrow data distributions
Large-scale video prediction models achieve visual realism but struggle to model complex motion accurately
Whether predicted future point tracks can improve downstream robot action and human motion prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

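Introduces autoregressive flow matching (ARFM) for probabilistic modeling of sequential continuous data
Trains on diverse video datasets to predict future point track locations over long horizons
Conditions robot action prediction and human motion prediction on predicted future tracks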
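
This page does not describe ARFM's architecture or training objective, so the following is only a generic sketch of how flow matching can be combined with an autoregressive rollout, not the authors' implementation. It assumes the common linear-interpolation form of flow matching and replaces the learned network with a closed-form toy velocity field. All names (`fm_training_pair`, `toy_velocity`, `sample_step`, `rollout`, `DRIFT`) are hypothetical and chosen for illustration.

```python
import numpy as np

# Hypothetical toy dynamics: each point moves by a fixed offset per step.
DRIFT = np.array([1.0, 0.5])

def fm_training_pair(x0, x1, t):
    """Linear-path flow matching: interpolated sample and regression target.

    x0 is a noise sample, x1 is a data sample, t is a time in [0, 1].
    A network v(x_t, t, context) would be trained to regress x1 - x0.
    """
    x_t = (1.0 - t) * x0 + t * x1
    target_velocity = x1 - x0
    return x_t, target_velocity

def toy_velocity(x, t, context):
    """Stand-in for a learned velocity network (an assumption).

    When the next position is exactly context + DRIFT, the optimal
    linear-path velocity field has this closed form.
    """
    target = context + DRIFT
    return (target - x) / (1.0 - t)

def sample_step(velocity_fn, context, rng, n_steps=8):
    """Draw one next position by Euler-integrating the velocity field from noise."""
    x = rng.standard_normal(context.shape)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t stays below 1, so 1 - t never reaches zero
        x = x + dt * velocity_fn(x, t, context)
    return x

def rollout(velocity_fn, start, horizon, seed=0):
    """Autoregressive rollout: each sampled position conditions the next sample."""
    rng = np.random.default_rng(seed)
    track = [np.asarray(start, dtype=float)]
    for _ in range(horizon):
        track.append(sample_step(velocity_fn, track[-1], rng))
    return np.stack(track)

track = rollout(toy_velocity, np.zeros(2), horizon=3)
print(track.shape)
```

In a real system the toy field would be replaced by a trained network, and the context would presumably include video features and a longer history of tracks; those details are not given here.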
Authors
Johnathan Xie
Stanford University
Stefan Stojanov
Postdoc at Stanford Vision Lab and Neuro AI Lab
Computer Vision, Machine Learning
Cristobal Eyzaguirre
Ph.D. Student, Stanford University
Daniel L. K. Yamins
Stanford University
Jiajun Wu
Stanford University