EndoOmni: Zero-Shot Cross-Dataset Depth Estimation in Endoscopy by Robust Self-Learning from Noisy Labels

📅 2024-09-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the scarcity of high-quality ground-truth depth annotations and severe label noise in real-world endoscopic imagery—leading to poor generalization of depth estimation models—this work proposes the first zero-shot cross-dataset depth estimation method specifically designed for endoscopic video. Methodologically, we (1) construct an endoscopy-specific foundational depth estimation model; (2) design a teacher-confidence-guided robust self-training framework to mitigate annotation noise; and (3) introduce a weighted scale- and translation-invariant loss to adaptively suppress erroneous pixel predictions. Experiments demonstrate that our method achieves a 33% reduction in absolute relative error over the medical-domain state-of-the-art and a 34% improvement over existing general-purpose foundation models on zero-shot relative depth estimation. Moreover, it provides a strong initialization for downstream fine-tuning, consistently outperforming prior approaches across diverse endoscopic domains.
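The confidence-guided self-training idea in point (2) above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name and the per-pixel weighting scheme are assumptions. The intuition is that labeled pixels keep their (possibly noisy) labels but are down-weighted by the teacher's confidence, while unlabeled pixels are supervised by the teacher's pseudo-labels.

```python
import numpy as np

def student_targets_and_weights(label, label_mask, pseudo, conf):
    """Hypothetical sketch of confidence-guided supervision for the student.

    label      : (H, W) possibly noisy ground-truth depth
    label_mask : (H, W) bool, True where a label exists
    pseudo     : (H, W) teacher pseudo-labels
    conf       : (H, W) teacher confidence in [0, 1]

    Labeled pixels are trained against their labels with weight = confidence
    (suppressing pixels the teacher deems noisy); unlabeled pixels fall back
    to the pseudo-label at full weight.
    """
    target = np.where(label_mask, label, pseudo)
    weight = np.where(label_mask, conf, 1.0)
    return target, weight
```

The returned per-pixel weights would then scale the student's depth loss, biasing learning towards cleaner labels as the summary describes.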

📝 Abstract
Single-image depth estimation is essential for endoscopy tasks such as localization, reconstruction, and augmented reality. Most existing methods in surgical scenes focus on in-domain depth estimation, limiting their real-world applicability. This constraint stems from the scarcity and inferior labeling quality of medical data for training. In this work, we present EndoOmni, the first foundation model for zero-shot cross-domain depth estimation for endoscopy. To harness the potential of diverse training data, we refine the advanced self-learning paradigm that employs a teacher model to generate pseudo-labels, guiding a student model trained on large-scale labeled and unlabeled data. To address training disturbance caused by inherent noise in depth labels, we propose a robust training framework that leverages both depth labels and estimated confidence from the teacher model to jointly guide the student model training. Moreover, we propose a weighted scale-and-shift invariant loss to adaptively adjust learning weights based on label confidence, thus imposing a learning bias towards cleaner label pixels while reducing the influence of highly noisy pixels. Experiments on zero-shot relative depth estimation show that our EndoOmni improves over state-of-the-art methods in medical imaging by 33% and over existing foundation models by 34% in terms of absolute relative error on specific datasets. Furthermore, our model provides strong initialization for fine-tuning metric depth estimation, maintaining superior performance in both in-domain and out-of-domain scenarios. The source code is publicly available at https://github.com/TianCuteQY/EndoOmni.
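The weighted scale-and-shift invariant loss mentioned in the abstract can be sketched as below. This is a minimal NumPy sketch under stated assumptions, not the paper's exact formulation: the function name, the use of an L1 residual, and folding the confidence weights into the alignment step are all assumptions. The core idea follows standard scale-and-shift invariant losses: first solve a (here confidence-weighted) least-squares problem for the scale and shift that align the prediction to the label, then average the confidence-weighted residuals so that noisy pixels contribute less.

```python
import numpy as np

def weighted_ssi_loss(pred, label, conf):
    """Hypothetical confidence-weighted scale-and-shift invariant loss.

    pred, label : (H, W) predicted and reference depth/disparity maps
    conf        : (H, W) per-pixel confidence weights in [0, 1]
    """
    p, l, w = pred.ravel(), label.ravel(), conf.ravel()
    # Weighted least squares for scale s and shift t minimizing
    # sum_i w_i * (s * p_i + t - l_i)^2, via row-scaling by sqrt(w).
    A = np.stack([p, np.ones_like(p)], axis=1)   # (N, 2) design matrix
    sw = np.sqrt(w)
    s, t = np.linalg.lstsq(A * sw[:, None], l * sw, rcond=None)[0]
    # Confidence-weighted mean absolute residual after alignment.
    residual = np.abs(s * p + t - l)
    return float((w * residual).sum() / max(w.sum(), 1e-8))
```

With uniform confidence this reduces to an ordinary scale-and-shift invariant loss; lowering a pixel's confidence shrinks its influence on both the alignment and the final average, which is the suppression behavior the abstract describes.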
Problem

Research questions and friction points this paper is trying to address.

Depth Estimation
Endoscopic Images
Real-world Adaptability
Innovation

Methods, ideas, or system contributions that make the work stand out.

EndoOmni
Unsupervised Depth Estimation
Teacher-Student Framework
Qingyao Tian
Ph.D. candidate, Institute of Automation, Chinese Academy of Sciences
AI for healthcare · medical imaging · foundation models
Zhen Chen
Centre of AI and Robotics, Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences, HK, China
Huai Liao
The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
Xinyan Huang
The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
Lujie Li
The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
Sébastien Ourselin
School of Engineering and Imaging Sciences, King’s College London, UK
Hongbin Liu
Institute of Automation, Chinese Academy of Sciences; Centre of AI and Robotics, Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences; School of Engineering and Imaging Sciences, King’s College London, UK