LEGO-Motion: Learning-Enhanced Grids with Occupancy Instance Modeling for Class-Agnostic Motion Prediction

📅 2025-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing motion forecasting methods for autonomous driving have two key limitations: object-level models struggle with open-set categories and lack precise geometry, while occupancy-based approaches, though class-agnostic, do not ensure physical consistency and ignore interactions between traffic participants. To address these issues, we propose LEGO-Motion, an occupancy-instance modeling framework that incorporates traffic-agent instance features into the bird's-eye view (BEV) space. The model comprises three components: a BEV encoder, an Interaction-Augmented Instance Encoder, and an Instance-Enhanced BEV Encoder, which together improve interaction modeling and physical consistency. Evaluated on nuScenes, our approach achieves state-of-the-art performance. Experiments on an FMCW LiDAR benchmark further indicate its generalization capability and practical applicability.

📝 Abstract
Accurate and reliable spatial and motion information plays a pivotal role in autonomous driving systems. However, object-level perception models struggle with handling open scenario categories and lack precise intrinsic geometry. On the other hand, occupancy-based class-agnostic methods excel in representing scenes but fail to ensure physics consistency and ignore the importance of interactions between traffic participants, hindering the model's ability to learn accurate and reliable motion. In this paper, we introduce a novel occupancy-instance modeling framework for class-agnostic motion prediction tasks, named LEGO-Motion, which incorporates instance features into Bird's Eye View (BEV) space. Our model comprises (1) a BEV encoder, (2) an Interaction-Augmented Instance Encoder, and (3) an Instance-Enhanced BEV Encoder, improving both interaction relationships and physics consistency within the model, thereby ensuring a more accurate and robust understanding of the environment. Extensive experiments on the nuScenes dataset demonstrate that our method achieves state-of-the-art performance, outperforming existing approaches. Furthermore, the effectiveness of our framework is validated on the advanced FMCW LiDAR benchmark, showcasing its practical applicability and generalization capabilities. The code will be made publicly available to facilitate further research.
Problem

Research questions and friction points this paper is trying to address.

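Object-level perception models struggle with open-set categories and lack precise geometry
Occupancy-based class-agnostic methods do not ensure physical consistency
Interactions between traffic participants are ignored, which limits motion prediction accuracy and reliability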
Innovation

Methods, ideas, or system contributions that make the work stand out.

Occupancy-instance modeling for motion prediction
BEV encoder with interaction-augmented instance features
Improved physics consistency and interaction relationships
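
To make the data flow between the three components concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the single-head self-attention used for instance interaction, the additive scatter of instance features into BEV cells, and the linear motion head are all assumptions chosen for illustration, and the BEV encoder itself is replaced by random features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def interaction_augmented_instances(inst_feats):
    """Hypothetical stand-in for an interaction-augmented instance encoder:
    single-head self-attention over N instance features of size C."""
    scale = np.sqrt(inst_feats.shape[-1])
    attn = softmax(inst_feats @ inst_feats.T / scale, axis=-1)  # (N, N)
    return inst_feats + attn @ inst_feats  # residual connection

def instance_enhanced_bev(bev_feats, inst_feats, inst_cells):
    """Hypothetical stand-in for an instance-enhanced BEV encoder:
    add each instance feature to the BEV cell that instance occupies."""
    out = bev_feats.copy()
    rows, cols = inst_cells[:, 0], inst_cells[:, 1]
    np.add.at(out, (rows, cols), inst_feats)  # unbuffered scatter-add
    return out

def predict_cell_motion(bev_feats, w_motion):
    """Assumed linear head: one 2D displacement per BEV cell."""
    return bev_feats @ w_motion  # (H, W, 2)

rng = np.random.default_rng(0)
H, W, C, N = 8, 8, 16, 3
bev = rng.normal(size=(H, W, C))            # placeholder BEV encoder output
inst = rng.normal(size=(N, C))              # placeholder instance features
cells = np.array([[1, 2], [4, 4], [6, 1]])  # (row, col) cell of each instance
w = rng.normal(size=(C, 2))                 # placeholder motion-head weights

inst_aug = interaction_augmented_instances(inst)
bev_enh = instance_enhanced_bev(bev, inst_aug, cells)
motion = predict_cell_motion(bev_enh, w)
print(motion.shape)  # (8, 8, 2)
```

In this sketch, cells that contain no instance keep their original BEV features, so the output stays class-agnostic and dense while occupied cells additionally carry interaction-aware instance information.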
Authors

Kangan Qian, School of Vehicle and Mobility, Tsinghua University, Beijing, China
Jinyu Miao, School of Vehicle and Mobility, Tsinghua University, Beijing, China
Ziang Luo, Tsinghua University (autonomous driving)
Zheng Fu, Tsinghua University
Jinchen Li, School of Vehicle and Mobility, Tsinghua University, Beijing, China
Yining Shi, School of Vehicle and Mobility, Tsinghua University, Beijing, China
Yunlong Wang, AD Division of NIO Inc., China
Kun Jiang, Tsinghua University (autonomous driving)
Mengmeng Yang, School of Vehicle and Mobility, Tsinghua University, Beijing, China
Diange Yang, School of Vehicle and Mobility, Tsinghua University, Beijing, China