Omnidirectional Depth-Aided Occupancy Prediction based on Cylindrical Voxel for Autonomous Driving

๐Ÿ“… 2025-03-26
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
To address geometric ambiguity arising from insufficient geometric priors in autonomous driving 3D perception, and the scarcity of annotated fisheye surround-view datasets, this paper proposes OmniDepth-Occ. First, it introduces panoramic depth estimation as a geometric prior. Second, it designs a cylindrical voxel representation tailored to fisheye radial distortion, performing voxelization in polar coordinates. Third, it establishes a Sketch-Coloring two-stage occupancy prediction paradigm: depth-guided coarse-grained sketch generation followed by fine-grained coloring-based reconstruction. To support training and evaluation, the authors construct a large-scale synthetic fisheye multi-camera semantic occupancy dataset, twice the size of SemanticKITTI. Experiments demonstrate significant improvements in 3D occupancy prediction accuracy on both the synthetic benchmark and standard real-world benchmarks, effectively mitigating geometric ambiguity and enhancing perceptual robustness in surround-view scenarios.

๐Ÿ“ Abstract
Accurate 3D perception is essential for autonomous driving. Traditional methods often struggle with geometric ambiguity due to a lack of geometric priors. To address this, we use omnidirectional depth estimation to introduce a geometric prior. Based on the depth information, we propose OmniDepth-Occ, a Sketch-Coloring framework. Our approach also introduces a cylindrical voxel representation based on polar coordinates to better align with the radial nature of panoramic camera views. To address the lack of fisheye camera datasets for autonomous driving tasks, we build a virtual-scene dataset with six fisheye cameras, whose data volume reaches twice that of SemanticKITTI. Experimental results demonstrate that our Sketch-Coloring network significantly enhances 3D perception performance.
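As an illustration of the cylindrical voxel idea described in the abstract, the sketch below bins ego-frame 3D points by radius, azimuth, and height instead of a Cartesian grid. All bin counts, ranges, and the function name are hypothetical defaults chosen for the example, not the paper's configuration.

```python
import numpy as np

def cylindrical_voxelize(points, r_max=50.0, z_min=-3.0, z_max=5.0,
                         n_r=64, n_theta=360, n_z=16):
    """Map Cartesian ego-frame points (N, 3) to cylindrical voxel indices
    (radius bin, azimuth bin, height bin).

    Note: parameters here are illustrative, not the paper's settings.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.hypot(x, y)                            # radial distance from ego
    theta = np.mod(np.arctan2(y, x), 2 * np.pi)   # azimuth in [0, 2*pi)
    # Keep only points inside the cylinder of interest.
    mask = (r < r_max) & (z >= z_min) & (z < z_max)
    r_idx = np.clip((r[mask] / r_max * n_r).astype(int), 0, n_r - 1)
    t_idx = np.clip((theta[mask] / (2 * np.pi) * n_theta).astype(int),
                    0, n_theta - 1)
    z_idx = np.clip(((z[mask] - z_min) / (z_max - z_min) * n_z).astype(int),
                    0, n_z - 1)
    return np.stack([r_idx, t_idx, z_idx], axis=1)

# A point 10 m ahead of the ego vehicle at ground level:
idx = cylindrical_voxelize(np.array([[10.0, 0.0, 0.0]]))
```

Because angular bins have constant width, each voxel's lateral footprint grows with distance from the ego vehicle, which is the property that matches the radial distortion pattern of fisheye cameras.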
Problem

Research questions and friction points this paper is trying to address.

Geometric ambiguity in camera-based 3D perception caused by missing geometric priors
Poor alignment between Cartesian voxel grids and the radial geometry of fisheye/panoramic views
Scarcity of annotated fisheye surround-view datasets for autonomous driving
Innovation

Methods, ideas, or system contributions that make the work stand out.

Omnidirectional depth estimation as a geometric prior
Cylindrical voxel representation in polar coordinates for panoramic views
Sketch-Coloring two-stage occupancy prediction framework
Synthetic six-fisheye-camera dataset twice the size of SemanticKITTI
๐Ÿ”Ž Similar Papers
No similar papers found.
Chaofan Wu, Jiaheng Li, Jinghao Cao, Ming Li, Yongkang Feng, Jiayu Wu, Shuwen Xu, Zihang Gao, Sidan Du, Yang Li
Nanjing University
Image Processing and Control · Machine Learning