PhysGM: Large Physical Gaussian Model for Feed-Forward 4D Synthesis

📅 2025-08-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing physics-based 3D motion synthesis methods rely on pre-reconstructed 3D Gaussian Splatting representations, and their physical modeling is inaccurate because it depends on either manually specified parameters or unstable optimization guided by video priors. Method: We propose the first end-to-end feed-forward framework that jointly predicts a 3D Gaussian representation and probabilistic physical attributes (e.g., mass, friction, elasticity) directly from a single image, enabling immediate physics simulation and high-fidelity 4D rendering. To avoid the unstable gradients of Score Distillation Sampling (SDS), we use Direct Preference Optimization (DPO) to align simulations with reference videos, and we curate PhysAssets, a large-scale dataset of over 24,000 3D assets annotated with physical properties. Contribution/Results: Our method generates high-quality 4D dynamic scenes in approximately one minute, significantly faster than prior approaches, while preserving physical plausibility and visual realism.

📝 Abstract
While physics-grounded 3D motion synthesis has seen significant progress, current methods face critical limitations. They typically rely on pre-reconstructed 3D Gaussian Splatting (3DGS) representations, and their physics integration depends on either inflexible, manually defined physical attributes or unstable, optimization-heavy guidance from video models. To overcome these challenges, we introduce PhysGM, a feed-forward framework that jointly predicts a 3D Gaussian representation and its physical properties from a single image, enabling immediate physical simulation and high-fidelity 4D rendering. We first establish a base model by jointly optimizing for Gaussian reconstruction and probabilistic physics prediction. The model is then refined with physically plausible reference videos to enhance both rendering fidelity and physics prediction accuracy. We adopt Direct Preference Optimization (DPO) to align the model's simulations with reference videos, circumventing Score Distillation Sampling (SDS) optimization, which requires back-propagating gradients through the complex differentiable simulation and rasterization. To facilitate training, we introduce a new dataset, PhysAssets, of over 24,000 3D assets annotated with physical properties and corresponding guiding videos. Experimental results demonstrate that our method generates high-fidelity 4D simulations from a single image in one minute. This represents a significant speedup over prior works while delivering realistic rendering results. Our project page is at: https://hihixiaolv.github.io/PhysGM.github.io/
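As a rough illustration of what "probabilistic physics prediction" can mean, the sketch below treats one physical attribute as a Gaussian whose mean and log-variance are predicted by the model. This is an assumption made for illustration only: the abstract does not state which distribution PhysGM uses, and the function names `gaussian_log_prob` and `sample_attribute` are hypothetical.

```python
import math
import random


def gaussian_log_prob(x, mean, log_var):
    """Log-density of x under N(mean, exp(log_var)).

    Assumption: a scalar physical attribute is modeled as a Gaussian.
    Minimizing the negative of this value is the usual Gaussian NLL loss.
    """
    var = math.exp(log_var)
    return -0.5 * (math.log(2.0 * math.pi) + log_var + (x - mean) ** 2 / var)


def sample_attribute(mean, log_var, rng):
    """Draw one candidate attribute value from the predicted distribution."""
    std = math.exp(0.5 * log_var)
    return rng.gauss(mean, std)


# Hypothetical predicted parameters for a single attribute.
rng = random.Random(0)
candidate = sample_attribute(mean=1.0, log_var=-2.0, rng=rng)
candidate_log_prob = gaussian_log_prob(candidate, mean=1.0, log_var=-2.0)
```

Predicting a distribution rather than a point estimate makes it possible to sample several candidate attribute values per object and to assign each candidate a likelihood, which is the kind of quantity a preference-based refinement stage can work with.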
Problem

Research questions and friction points this paper is trying to address.

Overcoming reliance on pre-reconstructed 3D representations for motion synthesis
Eliminating inflexible manual physical attributes or unstable optimization guidance
Enabling immediate physical simulation from a single image without complex optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Jointly predicts 3D Gaussians and physical properties
Uses Direct Preference Optimization for simulation alignment
Creates a feed-forward framework that takes a single image as input
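The Direct Preference Optimization step listed above can be sketched with the standard DPO loss for a single preference pair. This is the generic formulation, not the paper's exact objective: how PhysGM selects the preferred and rejected simulations and computes their likelihoods is not described here, so the inputs below are placeholder log-probabilities and `dpo_loss` is a hypothetical helper name.

```python
import math


def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    logp_*     : log-probabilities under the model being refined
    ref_logp_* : log-probabilities under the frozen reference (base) model
    beta       : strength of the implicit penalty for drifting from the reference

    Assumption: "preferred" is the sample whose simulation is judged closer
    to the reference video; the ranking rule itself is outside this sketch.
    """
    margin = beta * ((logp_win - ref_logp_win) - (logp_lose - ref_logp_lose))
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), evaluated stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the model equals the reference, the loss is log(2).
neutral = dpo_loss(-1.5, -1.5, -1.5, -1.5)
# Raising the preferred sample's likelihood relative to the rejected one lowers it.
improved = dpo_loss(-1.0, -2.0, -1.5, -1.5)
```

The loss depends only on likelihoods of sampled candidates, so no gradient has to flow through the simulator or rasterizer, which matches the motivation the abstract gives for preferring DPO over SDS.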
Chunji Lv
Beijing Institute of Technology
Zequn Chen
Li Auto Inc.
Donglin Di
Li Auto Inc.
Generative Models, Embodied AI, Medical Image, Multimedia
Weinan Zhang
Harbin Institute of Technology
Hao Li
Li Auto Inc.
Wei Chen
Li Auto Inc.
Changsheng Li
Beijing Institute of Technology
Flexible robotics, Mechanical Design, Robotics, Medical Robotics, Surgical Robotics