🤖 AI Summary
3D Gaussian Splatting (3DGS) suffers from geometric noise and poor robustness in unstructured outdoor scenes, while NeRF-based methods for synthetic stereo data generation are computationally expensive and limited in fidelity. To address both problems, this paper proposes the first stereo synthesis pipeline integrating explicit 3DGS geometry with depth priors from FoundationStereo. By transferring expert knowledge into stereo matching data generation, the method leverages FoundationStereo’s superior robustness to occlusions and textureless regions to correct 3DGS reconstruction errors, yielding high-fidelity, low-cost synthetic disparity maps. The paper further introduces a knowledge distillation–inspired fine-tuning strategy that significantly enhances zero-shot generalization. Experiments demonstrate state-of-the-art performance on zero-shot stereo matching benchmarks. The synthetic data effectively substitutes for real annotations, improving training efficiency by over 40%, and the approach explicitly identifies and mitigates 3DGS’s inherent robustness limitations in complex outdoor environments.
📝 Abstract
In this paper, we introduce a 3D Gaussian Splatting (3DGS)-based pipeline for stereo dataset generation, offering an efficient alternative to Neural Radiance Fields (NeRF)-based methods. To obtain useful geometry estimates, we explore two sources of supervision: the geometry reconstructed by the explicit 3D representation, and depth estimates from the FoundationStereo model in an expert knowledge transfer setup. Stereo models fine-tuned on 3DGS-generated datasets achieve competitive performance on zero-shot generalization benchmarks. When the reconstructed geometry is used directly, we observe that it is often noisy and contains artifacts, and this noise propagates to the trained model. In contrast, the disparity estimates from FoundationStereo are cleaner and consequently yield better performance on the zero-shot generalization benchmarks. Our method highlights the potential for low-cost, high-fidelity dataset creation and fast fine-tuning of deep stereo models. Moreover, we show that while the latest Gaussian Splatting-based methods have achieved superior performance on established benchmarks, their robustness falls short in challenging in-the-wild settings, warranting further exploration.
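The abstract does not say how rendered geometry becomes disparity supervision. As a rough illustration only, the sketch below applies the standard rectified-stereo relation, disparity = focal length × baseline / depth, to a depth map rendered for the left view. The function name `depth_to_disparity`, its parameters, and the convention of writing 0.0 for invalid pixels are assumptions made for this example, not details taken from the paper.

```python
def depth_to_disparity(depth_map, focal_px, baseline_m, min_depth=1e-6):
    """Convert a depth map (metres) into disparity (pixels) for a rectified pair.

    Hypothetical helper: assumes both virtual cameras share the focal length
    `focal_px` (in pixels) and are separated horizontally by `baseline_m`.
    """
    disparity = []
    for row in depth_map:
        out_row = []
        for z in row:
            if z > min_depth:
                # Standard pinhole stereo relation: d = f * B / Z.
                out_row.append(focal_px * baseline_m / z)
            else:
                # Assumed convention: 0.0 marks a pixel without a valid label.
                out_row.append(0.0)
        disparity.append(out_row)
    return disparity


# Toy 2x2 depth map with one invalid pixel (depth 0.0).
depth = [[2.0, 4.0], [0.0, 8.0]]
disp = depth_to_disparity(depth, focal_px=800.0, baseline_m=0.25)
```

Under these assumptions, closer surfaces receive larger disparities, and noise in the rendered depth carries over directly into the labels. That is consistent with the abstract's observation that noisy reconstructed geometry propagates to the trained model.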