Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications

📅 2025-12-03
🤖 AI Summary
Traditional monocular depth estimation struggles to capture fine-grained road geometry (e.g., bumps, gradients), which degrades motion planning and vehicle stability control. To address this, we propose the gamma representation: a dimensionless quantity that models the vertical deviation of the road surface from a dominant road plane, defined as the ratio of a point's height above the plane to its depth from the camera. Using only the camera's height above ground as prior knowledge, without full extrinsic calibration, we recover metric depth in closed form, with emphasis on near-road detail. We design a lightweight self-supervised network (8.88M parameters) that jointly predicts the dominant plane and the gamma residual map. To our knowledge, this is the first self-supervised monocular approach evaluated on the RSRD dataset. Our method achieves state-of-the-art near-field depth and gamma estimation on both KITTI and RSRD, remains competitive in global depth accuracy, adapts robustly across diverse camera setups, and requires no annotated data.

📝 Abstract
Accurate perception of the vehicle's 3D surroundings, including fine-scale road geometry, such as bumps, slopes, and surface irregularities, is essential for safe and comfortable vehicle control. However, conventional monocular depth estimation often oversmooths these features, losing critical information for motion planning and stability. To address this, we introduce Gamma-from-Mono (GfM), a lightweight monocular geometry estimation method that resolves the projective ambiguity in single-camera reconstruction by decoupling global and local structure. GfM predicts a dominant road surface plane together with residual variations expressed by gamma, a dimensionless measure of vertical deviation from the plane, defined as the ratio of a point's height above it to its depth from the camera, and grounded in established planar parallax geometry. With only the camera's height above ground, this representation deterministically recovers metric depth via a closed form, avoiding full extrinsic calibration and naturally prioritizing near-road detail. Its physically interpretable formulation makes it well suited for self-supervised learning, eliminating the need for large annotated datasets. Evaluated on KITTI and the Road Surface Reconstruction Dataset (RSRD), GfM achieves state-of-the-art near-field accuracy in both depth and gamma estimation while maintaining competitive global depth performance. Our lightweight 8.88M-parameter model adapts robustly across diverse camera setups and, to our knowledge, is the first self-supervised monocular approach evaluated on RSRD.
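
The closed-form depth recovery described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a pinhole camera with x right, y down, z forward, a unit plane normal pointing from the camera toward the road, and a height measured as positive above the plane. Under those assumptions, gamma = (h_c - n·X) / Z, and substituting X = Z · r with the unit-depth ray r = K⁻¹[u, v, 1]ᵀ gives Z = h_c / (gamma + n·r). The function names, parameter names, and sign conventions are hypothetical and chosen for illustration only.

```python
def gamma_from_point(point, normal, cam_height):
    """Gamma of a 3D point: height above the road plane divided by depth.

    Assumed convention (not taken from the paper): `normal` is a unit vector
    pointing from the camera toward the road, so road points satisfy
    n . X = cam_height and height = cam_height - n . X.
    """
    x, y, z = point
    nx, ny, nz = normal
    height = cam_height - (nx * x + ny * y + nz * z)
    return height / z


def depth_from_gamma(u, v, gamma, normal, cam_height, fx, fy, cx, cy):
    """Metric depth Z at pixel (u, v) from gamma, plane normal, and camera height.

    Implements Z = cam_height / (gamma + n . r), where r is the viewing ray
    with unit depth. Returns infinity when the ray does not meet the
    gamma-shifted surface in front of the camera (e.g., at or above the horizon).
    """
    # Back-project the pixel to a ray with r_z = 1 (pinhole model, no distortion).
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    n_dot_r = sum(n * r for n, r in zip(normal, ray))
    denom = gamma + n_dot_r
    if denom <= 1e-9:
        return float("inf")
    return cam_height / denom
```

For example, with a camera 1.5 m above a flat road, `normal = (0.0, 1.0, 0.0)`, focal lengths of 1000 px, and principal point (640, 360), the pixel (640, 460) with gamma 0 maps to a depth of about 15 m. A positive gamma at the same pixel yields a smaller depth, consistent with a point raised above the plane. Because gamma divides height by depth, a fixed height deviation produces a larger gamma at short range, which is one way to read the abstract's remark about prioritizing near-road detail.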
Problem

Research questions and friction points this paper is trying to address.

Monocular depth estimation oversmooths fine-scale road geometry
Single-camera reconstruction suffers from projective ambiguity
Recovering metric depth otherwise calls for full extrinsic calibration or large annotated datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples global and local structure to resolve projective ambiguity
Uses dimensionless gamma measure for vertical deviation from road plane
Self-supervised learning eliminates need for large annotated datasets