PhysInOne: Visual Physics Learning and Reasoning in One Suite

📅 2026-04-10
🤖 AI Summary
This work addresses the scarcity of large-scale, physically realistic training data that currently limits visual physics learning in AI systems. To bridge this gap, the authors introduce a large-scale synthetic dataset comprising two million videos spanning 71 fundamental physical phenomena across mechanics, optics, fluid dynamics, and magnetism. The dataset provides rich annotations including 3D geometry, dynamics, object attributes, and textual descriptions, and uniquely supports physical reasoning tasks involving multi-object interactions and complex backgrounds. Experimental results demonstrate that models trained on this dataset achieve significantly improved physical plausibility in video generation, future frame prediction, material property estimation, and motion transfer. Furthermore, the study reveals persistent limitations in current models' ability to capture complex dynamical behaviors.

📝 Abstract
We present PhysInOne, a large-scale synthetic dataset addressing the critical scarcity of physically grounded training data for AI systems. Unlike existing datasets limited to merely hundreds or thousands of examples, PhysInOne provides 2 million videos across 153,810 dynamic 3D scenes, covering 71 basic physical phenomena in mechanics, optics, fluid dynamics, and magnetism. Distinct from previous works, our scenes feature multi-object interactions against complex backgrounds, with comprehensive ground-truth annotations including 3D geometry, semantics, dynamic motion, physical properties, and text descriptions. We demonstrate PhysInOne's efficacy across four emerging applications: physics-aware video generation, long-/short-term future frame prediction, physical property estimation, and motion transfer. Experiments show that fine-tuning foundation models on PhysInOne significantly enhances physical plausibility, while also exposing critical gaps in modeling complex physical dynamics and estimating intrinsic properties. As the largest dataset of its kind, orders of magnitude beyond prior works, PhysInOne establishes a new benchmark for advancing physics-grounded world models in generation, simulation, and embodied AI.
Problem

Research questions and friction points this paper is trying to address.

physically-grounded data
visual physics learning
physics reasoning
training data scarcity
physical plausibility
Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic dataset
physics-grounded learning
multi-object interaction
3D scene understanding
physical property estimation