Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation

Date: 2025-06-05
Venue: International Conference on Image and Graphics
Citations: 1
Influential: 0
AI Summary
Conventional SSIM loss in unsupervised monocular depth estimation suffers from gradient instability and poor robustness due to multiplicative coupling among the luminance, contrast, and structure components. Method: We propose an additive SSIM loss that replaces the multiplicative combination with an additive one (described as the first of its kind) and systematically optimize the component weights to better model challenging regions such as textureless areas and motion boundaries. Integrated into the MonoDepth framework, the method is trained end-to-end with photometric consistency constraints, and the hyperparameter design is guided by a parameter sensitivity analysis. Results: On KITTI-2015, the approach significantly outperforms the baseline, reducing absolute relative error (AbsRel) by 8.2%. Improvements are most pronounced in low-texture and dynamic-edge regions, indicating better generalization and robustness under challenging conditions.

Abstract
Unsupervised monocular depth learning generally relies on the photometric relation among temporally adjacent images. Most previous works use both the mean absolute error (MAE) and the structural similarity index measure (SSIM) in its conventional form as the training loss. However, they ignore how the different components of the SSIM function, and the corresponding hyperparameters, affect training. To address these issues, this work proposes a new form of SSIM. Compared with the original SSIM function, the proposed form uses addition rather than multiplication to combine the luminance, contrast, and structural similarity components. A loss function constructed this way yields smoother gradients and higher performance on unsupervised depth estimation. We conduct extensive experiments to determine a relatively optimal combination of parameters for the new SSIM. Built on the popular MonoDepth approach, the optimized SSIM loss function remarkably outperforms the baseline on the KITTI-2015 outdoor dataset.
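
For reference, the conventional SSIM between two patches x and y is the product of a luminance, a contrast, and a structure term. This page does not give the exact proposed expression, so the additive form on the last line is only an assumed generic shape; the weights w_l, w_c, w_s are hypothetical symbols introduced here for illustration.

```latex
\begin{align}
l(x,y) &= \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \\
c(x,y) &= \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \\
s(x,y) &= \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}, \\
\mathrm{SSIM}(x,y) &= l(x,y)\, c(x,y)\, s(x,y), \\
\mathrm{SSIM}_{\mathrm{add}}(x,y) &= w_l\, l(x,y) + w_c\, c(x,y) + w_s\, s(x,y)
  \quad \text{(assumed form)}
\end{align}
```

By the product rule, the gradient of the multiplicative form with respect to any one term is scaled by the other two terms, whereas in a weighted sum each term's gradient is scaled only by its constant weight. This is one plausible reading of the smoother-gradient claim, not a statement of the authors' derivation.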

Problem

Research questions and friction points this paper is trying to address.

Improving SSIM loss for unsupervised monocular depth estimation
Addressing component and hyperparameter effects in SSIM function
Enhancing depth estimation performance with optimized SSIM loss

Innovation

Methods, ideas, or system contributions that make the work stand out.

New SSIM form with additive component combination
Smoother gradients for better depth estimation
Optimized parameters for higher performance
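
The additive combination listed above can be sketched in plain Python. This is a minimal illustration and not the authors' implementation: the function names, the equal default weights, and the use of whole-patch statistics (practical SSIM losses use local windows) are all assumptions made here.

```python
from math import sqrt
from statistics import fmean


def ssim_components(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Return (luminance, contrast, structure) for two flattened patches in [0, 1]."""
    c3 = c2 / 2.0  # common convention from the original SSIM definition
    mu_x, mu_y = fmean(x), fmean(y)
    sd_x = sqrt(fmean((a - mu_x) ** 2 for a in x))
    sd_y = sqrt(fmean((b - mu_y) ** 2 for b in y))
    cov = fmean((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    con = (2 * sd_x * sd_y + c2) / (sd_x ** 2 + sd_y ** 2 + c2)
    struct = (cov + c3) / (sd_x * sd_y + c3)
    return lum, con, struct


def ssim_multiplicative(x, y):
    """Conventional SSIM: product of the three terms."""
    lum, con, struct = ssim_components(x, y)
    return lum * con * struct


def ssim_additive(x, y, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Assumed additive variant: weighted sum of the three terms.

    The weights are hypothetical placeholders, not the values tuned in the paper.
    """
    lum, con, struct = ssim_components(x, y)
    w_l, w_c, w_s = weights
    return w_l * lum + w_c * con + w_s * struct


if __name__ == "__main__":
    # Compare an intensity ramp with its inverted copy.
    ramp = [i / 63 for i in range(64)]
    inverted = [1.0 - v for v in ramp]
    print("multiplicative:", ssim_multiplicative(ramp, inverted))
    print("additive:      ", ssim_additive(ramp, inverted))
```

With weights that sum to one, both variants return a value close to 1 for identical patches, so either can be dropped into a loss of the form 1 - SSIM. They differ in how a poor score in one term affects the contribution of the others.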

Authors
Yijun Cao
The MOE Key Laboratory for Neuroinformation, the School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
Fuya Luo
University of Electronic Science and Technology of China
Scene understanding, Bio-inspired computer vision, Infrared image colorization
Yong-Jie Li
The MOE Key Laboratory for Neuroinformation, the School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.