PhysGraph: A Physics-aware 3D Scene Graph for Perception and Reasoning

📅 2026-06-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing approaches to 3D scene understanding focus largely on semantic information and often neglect physical properties and articulated structure. Those that do model physical properties tend to rely on narrow training sets or single-object setups, which limits generalization. This work proposes PhysGraph, a framework that unifies symbolic reasoning with structured 3D geometry. From RGB-D observations, it reconstructs object-centric 3D geometry, associates object instances across views, decomposes objects into functional parts, and infers materials and articulations through visual reasoning. The result is a scene graph that is both semantically structured and physically consistent. Evaluated on synthetic and real-world datasets, the approach achieves state-of-the-art results in semantic segmentation, multi-object mass estimation, and articulation prediction. It is also demonstrated on two downstream tasks: constraint-aware 3D affordance prediction and real-to-sim transfer.
📝 Abstract
To perform a wide range of daily tasks, robots need to construct a 3D representation that is semantically rich, physically grounded, and structured enough to support task planning and affordance prediction. However, existing approaches primarily focus on semantic retrieval, often overlooking physical and kinematic factors. Methods that attempt to model physical properties typically rely on narrow training sets or single-object modeling, limiting scalability and generalization across diverse object types. To address these challenges, we present PhysGraph, a framework that unifies symbolic reasoning with structured 3D geometry to model kinematic and physical properties in cluttered scenes. Given RGB-D observations, PhysGraph reconstructs object-centric 3D geometry and associates object instances across views. It then decomposes objects into functional parts and infers materials and articulations through visual reasoning. Evaluated on both synthetic and real-world datasets, PhysGraph achieves state-of-the-art results in semantic segmentation, multi-object mass estimation, and articulation prediction. With its simple yet effective design, PhysGraph produces physically consistent and semantically structured scene graphs, serving as a structured 3D representation for downstream tasks such as constraint-aware 3D affordance prediction and real-to-sim transfer, both of which are demonstrated in our experiments.
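To make the idea of a physically grounded scene graph concrete, the following minimal sketch shows one way such a representation could be laid out: objects hold functional parts with a material label, joints describe articulations between parts, and relations link objects. This is not the PhysGraph implementation. Every class name, field, and relation label here is hypothetical, the density values are illustrative placeholders, and deriving mass as density times part volume is an assumption made for this example rather than a method stated in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Illustrative densities in kg/m^3 (placeholder values, not taken from the paper).
DENSITY_KG_M3: Dict[str, float] = {"wood": 700.0, "steel": 7850.0, "plastic": 950.0}


@dataclass
class Part:
    """A functional part of an object with an inferred material (hypothetical schema)."""
    name: str
    material: str
    volume_m3: float

    def mass_kg(self) -> float:
        # Assumption for this sketch: mass = material density * part volume.
        return DENSITY_KG_M3[self.material] * self.volume_m3


@dataclass
class Joint:
    """An articulation between two parts of the same object."""
    parent: str
    child: str
    kind: str                          # e.g. "revolute" or "prismatic"
    axis: Tuple[float, float, float]   # joint axis in the object frame
    limits: Tuple[float, float]        # radians (revolute) or metres (prismatic)


@dataclass
class ObjectNode:
    """One object instance: a semantic label plus its parts and joints."""
    label: str
    parts: List[Part] = field(default_factory=list)
    joints: List[Joint] = field(default_factory=list)

    def mass_kg(self) -> float:
        return sum(p.mass_kg() for p in self.parts)


@dataclass
class SceneGraph:
    """Objects as nodes, (subject, predicate, object) triples as edges."""
    objects: Dict[str, ObjectNode] = field(default_factory=dict)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_object(self, obj_id: str, node: ObjectNode) -> None:
        self.objects[obj_id] = node

    def relate(self, subj: str, pred: str, obj: str) -> None:
        # Reject edges that mention objects missing from the graph.
        for obj_id in (subj, obj):
            if obj_id not in self.objects:
                raise KeyError(f"unknown object id: {obj_id}")
        self.relations.append((subj, pred, obj))

    def supported_mass_kg(self, obj_id: str) -> float:
        # Total mass of the objects that rest directly "on" obj_id.
        return sum(self.objects[s].mass_kg()
                   for s, pred, o in self.relations
                   if pred == "on" and o == obj_id)


# Toy scene: a wooden cabinet with a hinged door, and a plastic mug on top of it.
scene = SceneGraph()
scene.add_object("cabinet", ObjectNode(
    label="cabinet",
    parts=[Part("body", "wood", 0.02), Part("door", "wood", 0.005)],
    joints=[Joint("body", "door", "revolute", (0.0, 0.0, 1.0), (0.0, 1.57))],
))
scene.add_object("mug", ObjectNode(label="mug", parts=[Part("cup", "plastic", 0.0002)]))
scene.relate("mug", "on", "cabinet")

print(f"cabinet mass: {scene.objects['cabinet'].mass_kg():.2f} kg")
print(f"mass resting on cabinet: {scene.supported_mass_kg('cabinet'):.2f} kg")
```

A structure along these lines is what lets a downstream planner ask physical questions, such as how much load an object carries or which part moves about which axis, rather than only retrieving objects by label.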
Problem

Research questions and friction points this paper is trying to address.

3D scene representation
physical reasoning
kinematic modeling
semantic segmentation
affordance prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-aware
3D Scene Graph
Articulation Prediction
Object-centric Representation
Real-to-Sim Transfer