Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation

📅 2024-11-27
🏛️ arXiv.org
🤖 AI Summary
Omnidirectional stereo depth estimation has long suffered from a scarcity of real-world training data, severely limiting model generalization. To address this, we introduce Helvipad—the first large-scale, real-scene omnidirectional stereo depth dataset—comprising 40K spherical video frames with LiDAR-calibrated dense depth and disparity ground truth. To tackle geometric distortions inherent in spherical imagery and the challenge of sparse depth supervision, we propose a spherical stereo adaptation architecture coupled with a depth completion enhancement strategy. Our approach leverages dual co-located 360° cameras (top-bottom configuration), equirectangular projection, and LiDAR point cloud projection, and further improves omnidirectional stereo matching via architectural refinements. Benchmarking reveals that existing methods degrade significantly on omnidirectional scenes; our adapted framework achieves substantial accuracy gains. Helvipad establishes the first standardized evaluation benchmark and reproducible technical pipeline for omnidirectional stereo depth estimation.

📝 Abstract
Despite considerable progress in stereo depth estimation, omnidirectional imaging remains underexplored, mainly due to the lack of appropriate data. We introduce Helvipad, a real-world dataset for omnidirectional stereo depth estimation, consisting of 40K frames from video sequences across diverse environments, including crowded indoor and outdoor scenes with diverse lighting conditions. Collected using two 360° cameras in a top-bottom setup and a LiDAR sensor, the dataset includes accurate depth and disparity labels by projecting 3D point clouds onto equirectangular images. Additionally, we provide an augmented training set with a significantly increased label density by using depth completion. We benchmark leading stereo depth estimation models for both standard and omnidirectional images. The results show that while recent stereo methods perform decently, a significant challenge persists in accurately estimating depth in omnidirectional imaging. To address this, we introduce necessary adaptations to stereo models, achieving improved performance.
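The abstract's labeling step projects LiDAR point clouds onto equirectangular images to obtain per-pixel depth. The sketch below illustrates the standard equirectangular projection this relies on; the axis convention (x right, y down, z forward) and the function name are illustrative assumptions, not taken from the paper's pipeline.

```python
import numpy as np

def project_to_equirectangular(points, width, height):
    """Project 3D points (N, 3), given in the camera frame, onto an
    equirectangular image of size (width, height).

    Returns (u, v) pixel coordinates and radial depth per point.
    Assumed convention: x right, y down, z forward; longitude spans
    [-pi, pi] across the image width, latitude [-pi/2, pi/2] down
    the height. The actual dataset convention may differ.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)            # radial depth to each point
    lon = np.arctan2(x, z)                        # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(y / r, -1.0, 1.0))    # latitude in [-pi/2, pi/2]
    u = (lon / (2 * np.pi) + 0.5) * width         # horizontal pixel coordinate
    v = (lat / np.pi + 0.5) * height              # vertical pixel coordinate
    return np.stack([u, v], axis=1), r
```

Rasterizing these (u, v) coordinates with their depths (keeping the nearest point per pixel) yields the kind of sparse ground-truth depth map the abstract describes.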
Problem

Research questions and friction points this paper is trying to address.

Lack of real-world omnidirectional stereo depth datasets
Challenges in accurate depth estimation for 360° images
Need for model adaptations to improve omnidirectional performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Real-world omnidirectional stereo dataset with LiDAR
Depth completion for augmented training set
Adapted stereo models for omnidirectional imaging
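One listed contribution is densifying the sparse LiDAR labels via depth completion. As a minimal illustration of what "completion" means here, the sketch below fills missing pixels with the nearest valid measurement; this nearest-neighbor fill is a generic stand-in, not the paper's actual depth-completion method.

```python
import numpy as np

def densify_depth_nn(sparse, valid):
    """Fill invalid pixels of a sparse depth map with the value of the
    nearest valid measurement (brute-force nearest neighbor).

    sparse: (H, W) float depth map, arbitrary values where invalid.
    valid:  (H, W) bool mask of measured pixels.
    A toy illustration only; real depth-completion models are learned.
    """
    ys, xs = np.nonzero(valid)
    vals = sparse[ys, xs]
    dense = sparse.copy()
    miss_ys, miss_xs = np.nonzero(~valid)
    for y, x in zip(miss_ys, miss_xs):
        d2 = (ys - y) ** 2 + (xs - x) ** 2   # squared distance to all valid pixels
        dense[y, x] = vals[np.argmin(d2)]    # copy the closest measurement
    return dense
```

Applied to projected LiDAR depth maps, this kind of densification is what turns sparse supervision into the "significantly increased label density" of the augmented training set.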
Mehdi Zayene
École Polytechnique Fédérale de Lausanne, Switzerland
Jannik Endres
Unknown affiliation
Computer Vision, Machine Learning, Robotics
Albias Havolli
École Polytechnique Fédérale de Lausanne, Switzerland
Charles Corbière
Senior ML Researcher, Raidium
Deep Learning, Computer Vision, Medical Imaging, AI Safety
Salim Cherkaoui
École Polytechnique Fédérale de Lausanne, Switzerland
Alexandre Kontouli
École Polytechnique Fédérale de Lausanne, Switzerland
Alexandre Alahi
Professor, EPFL
Computer Vision, Transportation, Autonomous Driving, Intelligent Transportation Systems, AI