A Wavelet-based Stereo Matching Framework for Solving Frequency Convergence Inconsistency

📅 2025-05-23
🤖 AI Summary
RAFT-Stereo suffers from inconsistent convergence across frequency bands during iterative optimization due to uniform full-spectrum updates, leading to severe degradation of high-frequency details such as object boundaries and fine structures. To address this, we propose a wavelet-decomposition-based dual-path stereo matching framework that explicitly decouples frequency components for the first time: a low-frequency branch captures global scene structure, while a high-frequency branch employs an LSTM-driven adaptive update operator to dynamically preserve and refine fine details. An iterative frequency adapter is further introduced to harmonize optimization between the two paths. Our approach breaks the limitations of conventional iterative paradigms, achieving state-of-the-art performance on multiple metrics of the KITTI 2015 and KITTI 2012 benchmarks. It notably enhances texture recovery in distant regions and significantly improves reconstruction accuracy of fine-grained structures.

📝 Abstract
We find that the EPE of RAFT-Stereo converges inconsistently in low-frequency and high-frequency regions, resulting in high-frequency degradation (e.g., at edges and thin objects) during the iterative process. The underlying reason for the limited performance of current iterative methods is that they optimize all frequency components together without distinguishing between high and low frequencies. We propose a wavelet-based stereo matching framework (Wavelet-Stereo) for solving frequency convergence inconsistency. Specifically, we first explicitly decompose an image into high-frequency and low-frequency components using the discrete wavelet transform. Then, the high-frequency and low-frequency components are fed into two different multi-scale frequency feature extractors. Finally, we propose a novel LSTM-based high-frequency preservation update operator containing an iterative frequency adapter, which fine-tunes the initial high-frequency features to provide adaptively refined high-frequency features at different iteration steps. By processing high-frequency and low-frequency components separately, our framework can simultaneously refine high-frequency information at edges and low-frequency information in smooth regions, which is especially suitable for challenging scenes with fine details and distant textures. Extensive experiments demonstrate that Wavelet-Stereo outperforms state-of-the-art methods and ranks 1st on both the KITTI 2015 and KITTI 2012 leaderboards for almost all metrics. We will provide code and pre-trained models to encourage further exploration, application, and development of our framework (https://github.com/SIA-IDE/Wavelet-Stereo).
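
The decomposition step in the abstract can be illustrated with a single-level 2D Haar transform, the simplest discrete wavelet transform. This is only a minimal sketch: the abstract does not say which wavelet family or how many levels Wavelet-Stereo uses, so the Haar choice, the function name `haar_dwt2`, and the sub-band names are assumptions made here for illustration.

```python
from typing import List, Tuple

Image = List[List[float]]


def haar_dwt2(img: Image) -> Tuple[Image, Image, Image, Image]:
    """Single-level 2D Haar DWT (illustrative; not the paper's implementation).

    Returns (low, detail_rows, detail_cols, detail_diag), each half the
    input size. `low` is the low-frequency approximation; the other three
    are high-frequency detail sub-bands.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")

    low, d_rows, d_cols, d_diag = [], [], [], []
    for i in range(0, h, 2):
        r_low, r_rows, r_cols, r_diag = [], [], [], []
        for j in range(0, w, 2):
            # The 2x2 block of pixels handled in this step.
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r_low.append((a + b + c + d) / 2.0)   # local average (smooth content)
            r_rows.append((a + b - c - d) / 2.0)  # change between the two rows
            r_cols.append((a - b + c - d) / 2.0)  # change between the two columns
            r_diag.append((a - b - c + d) / 2.0)  # diagonal change
        low.append(r_low)
        d_rows.append(r_rows)
        d_cols.append(r_cols)
        d_diag.append(r_diag)
    return low, d_rows, d_cols, d_diag


if __name__ == "__main__":
    # A vertical edge: only the column-difference sub-band is non-zero.
    low, d_rows, d_cols, d_diag = haar_dwt2([[0.0, 1.0], [0.0, 1.0]])
    print(low, d_rows, d_cols, d_diag)
```

In a dual-path design of the kind described above, `low` would feed the low-frequency feature extractor and the three detail sub-bands would feed the high-frequency one; how the paper actually wires them is not stated in this summary.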
Problem

Research questions and friction points this paper is trying to address.

Addresses inconsistent EPE convergence in high/low frequency regions
Separates high/low frequency processing to preserve edge details
Improves stereo matching accuracy for fine textures in distant scenes
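
To make the first point concrete, the end-point error (EPE, the mean absolute disparity error) can be reported separately for high-frequency and low-frequency pixels. The sketch below assumes a simple neighbor-difference threshold to mark high-frequency pixels; the helper names `edge_mask` and `epe_by_region` and the threshold rule are hypothetical and are not taken from the paper, which may define the regions differently.

```python
from typing import Dict, List

Grid = List[List[float]]
Mask = List[List[bool]]


def edge_mask(img: Grid, thresh: float) -> Mask:
    """Mark a pixel as high-frequency when it differs from its right or
    bottom neighbor by more than `thresh` (an assumed, simplistic rule)."""
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            right = abs(img[i][j + 1] - img[i][j]) if j + 1 < w else 0.0
            down = abs(img[i + 1][j] - img[i][j]) if i + 1 < h else 0.0
            mask[i][j] = max(right, down) > thresh
    return mask


def epe_by_region(pred: Grid, gt: Grid, high_freq: Mask) -> Dict[str, float]:
    """Mean absolute disparity error over high- and low-frequency pixels."""
    sums = {"high": 0.0, "low": 0.0}
    counts = {"high": 0, "low": 0}
    for p_row, g_row, m_row in zip(pred, gt, high_freq):
        for p, g, m in zip(p_row, g_row, m_row):
            key = "high" if m else "low"
            sums[key] += abs(p - g)
            counts[key] += 1
    # A region with no pixels has an undefined EPE, reported as NaN.
    return {k: (sums[k] / counts[k] if counts[k] else float("nan")) for k in sums}
```

Evaluating `epe_by_region` on the disparity map produced after each refinement iteration gives two curves; the inconsistency described above would show up as the two curves converging at different rates.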
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wavelet-based stereo matching framework
Discrete wavelet transform decomposition
LSTM-based high-frequency preservation operator
Authors
Xiaobao Wei
Institute of Software, Chinese Academy of Sciences
3D Vision
Jiawei Liu
Shenyang Institute of Automation, Chinese Academy of Sciences; Liaoning Liaohe Laboratory; Key Laboratory on Intelligent Detection and Equipment Technology of Liaoning Province
Dongbo Yang
Shenyang Institute of Automation, Chinese Academy of Sciences; Liaoning Liaohe Laboratory; Key Laboratory on Intelligent Detection and Equipment Technology of Liaoning Province; University of Chinese Academy of Sciences
Junda Cheng
Huazhong University of Science and Technology
computer vision
Changyong Shu
Beihang University
Wei Wang
Shenyang Institute of Automation, Chinese Academy of Sciences; Liaoning Liaohe Laboratory; Key Laboratory on Intelligent Detection and Equipment Technology of Liaoning Province