The Fourth Monocular Depth Estimation Challenge

📅 2025-04-24
🤖 AI Summary
This work reports the fourth edition of the Monocular Depth Estimation Challenge (MDEC), which targets zero-shot generalization to the challenging SYNS-Patches benchmark of natural and indoor scenes. For this edition, the evaluation protocol was revised to use two-degree-of-freedom least-squares alignment, so that both disparity and affine-invariant predictions can be assessed on a common footing, and Depth Anything v2 and Marigold were adopted as strong off-the-shelf baselines for measuring the zero-shot transfer of pretrained models. All 24 submissions outperformed both baselines on the test set, with most leading methods relying on affine-invariant predictions; the winning method raised the 3D F-Score from the previous edition's best of 22.58% to 23.05%. The challenge strengthens the generalization of monocular depth estimation to unseen, complex scenes and promotes standardized evaluation protocols for zero-shot depth estimation.

📝 Abstract
This paper presents the results of the fourth edition of the Monocular Depth Estimation Challenge (MDEC), which focuses on zero-shot generalization to the SYNS-Patches benchmark, a dataset featuring challenging environments in both natural and indoor settings. In this edition, we revised the evaluation protocol to use least-squares alignment with two degrees of freedom to support disparity and affine-invariant predictions. We also revised the baselines and included popular off-the-shelf methods: Depth Anything v2 and Marigold. The challenge received a total of 24 submissions that outperformed the baselines on the test set; 10 of these included a report describing their approach, with most leading methods relying on affine-invariant predictions. The challenge winners improved the 3D F-Score over the previous edition's best result, raising it from 22.58% to 23.05%.
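The revised protocol aligns each prediction to ground truth with a least-squares fit over two degrees of freedom (a per-image scale and shift), which makes disparity and affine-invariant outputs comparable. A minimal sketch of such an alignment is below; the function name is illustrative, and the challenge's exact masking, outlier handling, and disparity-space details may differ:

```python
import numpy as np

def align_2dof(pred, gt, mask):
    """Fit a per-image scale and shift (two degrees of freedom) mapping
    the prediction onto ground truth, then apply it to the prediction."""
    p = pred[mask].astype(np.float64)
    g = gt[mask].astype(np.float64)
    # Closed-form least squares: min over (s, t) of ||s * p + t - g||^2.
    A = np.stack([p, np.ones_like(p)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    return s * pred + t
```

Because the fit absorbs any global scale and shift, a method only has to predict depth (or disparity) up to an affine transform, which is exactly the affine-invariant setting most leading submissions adopted.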
Problem

Research questions and friction points this paper is trying to address.

Zero-shot generalization to the SYNS-Patches benchmark
Supporting disparity and affine-invariant predictions in the evaluation protocol
Improving on the previous edition's best 3D F-Score
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot generalization to SYNS-Patches benchmark
Least-squares alignment with two degrees of freedom
Affine-invariant predictions in leading methods
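Submissions are ranked by the 3D F-Score, which in its standard formulation is the harmonic mean of point-cloud precision and recall at a distance threshold. A brute-force sketch follows; the threshold value, point sampling, and any edge-aware weighting used by the benchmark are not specified here, so treat this only as the general shape of the metric:

```python
import numpy as np

def f_score_3d(pred_pts, gt_pts, tau):
    """Harmonic mean of point-cloud precision and recall at threshold tau."""
    # Pairwise distances between predicted and ground-truth 3D points.
    d = np.linalg.norm(pred_pts[:, None, :] - gt_pts[None, :, :], axis=-1)
    precision = np.mean(d.min(axis=1) < tau)  # predicted points near some GT point
    recall = np.mean(d.min(axis=0) < tau)     # GT points near some predicted point
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For real point clouds the O(N·M) distance matrix is replaced by a k-d tree nearest-neighbor query, but the precision/recall structure is the same.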
Anton Obukhov
Principal Research Scientist, Huawei Research Center Zürich
Computer Vision, Generative AI
Matteo Poggi
Tenure-Track Assistant Professor (RTD-B), University of Bologna
Computer Vision, Spatial AI
Fabio Tosi
Junior Assistant Professor (RTD-A), Università di Bologna
Computer Vision, Stereo Vision, Artificial Intelligence, Deep Learning, Machine Learning
Ripudaman Singh Arora
Blue River Technology
Jaime Spencer
Independent Researcher
Chris Russell
Associate Professor, University of Oxford
Ethical Machine Learning, Computer Vision, Optimisation, Ethical AI
Simon Hadfield
Centre for Vision, Speech and Signal Processing, University of Surrey
Computer Vision, Robot Vision, Autonomy, Machine Learning, Reinforcement Learning
Richard Bowden
Professor of Computer Vision and Machine Learning, CVSSP, University of Surrey
Computer Vision, Machine Learning, Artificial Intelligence
Shuaihang Wang
Hikvision Research Institute
Zhenxin Ma
Hikvision Research Institute
Weijie Chen
Hikvision Research Institute
Baobei Xu
Hikvision Research Institute
Fengyu Sun
Hikvision Research Institute
Di Xie
Hikvision Research Institute
Jiang Zhu
Hikvision Research Institute
M. Lavreniuk
Space Research Institute NASU-SSAU, Kyiv, Ukraine
Haining Guan
Megvii
Qun Wu
Megvii
Yupei Zeng
Megvii
Chao Lu
Megvii
Huanran Wang
Megvii
Guangyuan Zhou
Independent Researcher
Haotian Zhang
Independent Researcher
Jianxiong Wang
Independent Researcher
Qiang Rao
Master of Computer Science, University of Chinese Academy of Sciences
Computer Vision, Artificial Intelligence
Chunjie Wang
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
6G, UAV, RIS, ISAC, Wireless Communication
Xiao Liu
Insta360
Zhiqiang Lou
Insta360
Hualie Jiang
Insta360/Antigravity
Computer Vision, 3D Vision, Omnidirectional Vision
Yihao Chen
Insta360
Rui Xu
Insta360
Minglang Tan
Insta360
Zihan Qin
Harbin Institute of Technology
Yifan Mao
Harbin Institute of Technology
Jiayang Liu
University of Science and Technology of China
Adversarial Examples, AI Security
Jialei Xu
Harbin Institute of Technology
Yifan Yang
Harbin Institute of Technology
Wenbo Zhao
Harbin Institute of Technology
Junjun Jiang
Harbin Institute of Technology
Image Processing, Computer Vision, Machine Learning
Xianming Liu
Harbin Institute of Technology
Mingshuai Zhao
Beijing University of Posts and Telecommunications
Anlong Ming
Beijing University of Posts and Telecommunications
Wu Chen
Beijing University of Posts and Telecommunications
Feng Xue
University of Trento
Mengying Yu
Beijing University of Posts and Telecommunications
Shida Gao
Beijing University of Posts and Telecommunications
Xiangfeng Wang
Beijing University of Posts and Telecommunications
Gbenga Omotara
Graduate Student, University of Missouri
Computer Vision, Machine Learning, Robotics
Ramy Farag
Research Assistant, University of Missouri
Machine Learning, AI for Healthcare
Jacket Demby
University of Missouri
S. M. A. Tousi
University of Missouri
G. N. DeSouza
University of Missouri
Tuan-Anh Yang
VNUHCM-University of Science
Machine Learning, Computer Vision, Precision Agriculture, Medical Image Analysis
Minh-Quang Nguyen
University of Science, VNU-HCM, Vietnam; Vietnam National University, Ho Chi Minh, Vietnam
Thien-Phuc Tran
University of Science, VNU-HCM
Deep Learning, Computer Vision, Machine Learning, 3D Computer Vision
Albert Luginov
University of Reading
Muhammad Shahzad
University of Reading