LiDAR-Camera Fusion for Video Panoptic Segmentation without Video Training

📅 2024-12-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Vision-only video panoptic segmentation (VPS) for autonomous driving suffers from segmentation errors caused by insufficient temporal modeling and a lack of geometric priors. To address this, the work integrates LiDAR point clouds with camera images to enhance VPS. We propose a LiDAR-image feature fusion framework, built around a cross-modal feature alignment and fusion module, that improves performance without requiring video-level training. Beyond the fusion module, two lightweight modifications (a pseudo-temporal propagation mechanism and a structural adaptation of the single-frame model for video inference) improve temporal consistency and segmentation accuracy even when no video data is used for training. Experiments show gains of up to 5 points in both image- and video-level panoptic segmentation metrics, supporting the value of 3D geometric information for joint semantic and instance segmentation in 2D video.

📝 Abstract
Panoptic segmentation, which combines instance and semantic segmentation, has gained a lot of attention in autonomous vehicles due to its comprehensive representation of the scene. This task can be applied to both cameras and LiDAR sensors, but there has been limited focus on combining the two sensors to enhance image panoptic segmentation (PS). Although previous research has acknowledged the benefit of 3D data for camera-based scene perception, no specific study has explored the influence of 3D data on image and video panoptic segmentation (VPS). This work introduces a feature fusion module that enhances PS and VPS by fusing LiDAR and image data for autonomous vehicles. We also show that, in addition to this fusion, our proposed model, which uses two simple modifications, can deliver even higher-quality VPS without being trained on video data. The results demonstrate a substantial improvement of up to 5 points in both the image and video panoptic segmentation evaluation metrics.
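
To make the idea of fusing LiDAR and image data concrete, the sketch below shows one common first step: projecting LiDAR points into the image plane to obtain a sparse depth map, then attaching it to an image feature tensor as an extra channel. This is only an illustration under stated assumptions and is not the paper's fusion module, whose design is not described here. The function names, the pinhole intrinsics `K`, the extrinsic transform `T_cam_from_lidar`, and the choice of channel concatenation are all hypothetical.

```python
import numpy as np


def project_lidar_to_depth_map(points_xyz, K, T_cam_from_lidar, height, width):
    """Project LiDAR points (N, 3) into a sparse depth map of shape (height, width).

    Hypothetical helper: K is a 3x3 pinhole intrinsic matrix and
    T_cam_from_lidar is a 4x4 rigid transform from LiDAR to camera frame.
    Pixels with no LiDAR return are left at 0.
    """
    # Move points into the camera frame using homogeneous coordinates.
    ones = np.ones((points_xyz.shape[0], 1))
    pts_cam = (T_cam_from_lidar @ np.hstack([points_xyz, ones]).T).T[:, :3]

    # Discard points behind the camera.
    pts_cam = pts_cam[pts_cam[:, 2] > 0.0]

    # Pinhole projection to integer pixel coordinates.
    uvw = (K @ pts_cam.T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)
    depth = pts_cam[:, 2]

    # Keep only projections that land inside the image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, depth = u[inside], v[inside], depth[inside]

    # When several points hit the same pixel, keep the nearest one.
    depth_map = np.full((height, width), np.inf, dtype=np.float64)
    np.minimum.at(depth_map, (v, u), depth)
    depth_map[np.isinf(depth_map)] = 0.0
    return depth_map


def fuse_by_concatenation(image_features, depth_map):
    """Append the depth map as one extra channel to (C, H, W) image features.

    Assumption: simple channel concatenation stands in for a learned
    fusion module.
    """
    return np.concatenate([image_features, depth_map[None, :, :]], axis=0)
```

In a real pipeline the depth map would typically be resized to the resolution of the backbone features before fusion, and a learned layer would replace the plain concatenation.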
Problem

Research questions and friction points this paper is trying to address.

LiDAR-camera fusion
panoptic segmentation
autonomous driving
Innovation

Methods, ideas, or system contributions that make the work stand out.

LiDAR-camera fusion
video panoptic segmentation
3D data enhancement