🤖 AI Summary
Distractors and occluders in real-world scenes break the multi-view consistency that 3D Gaussian splatting assumes, which degrades the reconstruction. DeSplat addresses this with a purely rendering-driven decomposition of Gaussian primitives that needs no external semantic information from pre-trained models. It initializes Gaussians within each camera view to reconstruct view-specific distractors, and it models the static 3D scene and the distractors separately in the alpha compositing stages. The result is an explicit separation of static scene elements from distractors. On three benchmark data sets for distractor-free novel view synthesis, DeSplat achieves results comparable to prior distractor-free approaches without sacrificing rendering speed, and it avoids the computational overhead that semantic pre-processing or semantic guidance during optimization would add.
📝 Abstract
Gaussian splatting enables fast novel view synthesis in static 3D environments. However, reconstructing real-world environments remains challenging as distractors or occluders break the multi-view consistency assumption required for accurate 3D reconstruction. Most existing methods rely on external semantic information from pre-trained models, introducing additional computational overhead as pre-processing steps or during optimization. In this work, we propose a novel method, DeSplat, that directly separates distractors and static scene elements purely based on volume rendering of Gaussian primitives. We initialize Gaussians within each camera view to reconstruct the view-specific distractors, so that the static 3D scene and the distractors are modeled separately in the alpha compositing stages. DeSplat yields an explicit scene separation of static elements and distractors, achieving comparable results to prior distractor-free approaches without sacrificing rendering speed. We demonstrate DeSplat's effectiveness on three benchmark data sets for distractor-free novel view synthesis. See the project website at https://aaltoml.github.io/desplat/.
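
The idea of modeling the static scene and the distractors separately in alpha compositing can be illustrated with a small per-pixel sketch. It is not taken from the DeSplat code: the function names `composite` and `render_pixel` are hypothetical, and the layering rule (the distractor layer composited in front of the static layer) is an assumption about how the two streams might be combined.

```python
import numpy as np

def composite(colors, alphas):
    """Front-to-back alpha compositing of depth-sorted samples for one pixel.

    colors: (N, 3) RGB per sample; alphas: (N,) opacity per sample in [0, 1].
    Returns the premultiplied RGB and the accumulated opacity.
    """
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Transmittance reaching each sample: product of (1 - alpha) of the samples in front.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = transmittance * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights.sum()

def render_pixel(static_colors, static_alphas, distractor_colors, distractor_alphas):
    """Composite two separate streams of Gaussians for one pixel.

    Assumption (not confirmed by the abstract): the view-specific distractor
    layer sits in front of the static layer.
    """
    static_rgb, _ = composite(static_colors, static_alphas)
    distractor_rgb, distractor_alpha = composite(distractor_colors, distractor_alphas)
    # Full image: distractor layer over whatever the static scene contributes.
    full_rgb = distractor_rgb + (1.0 - distractor_alpha) * static_rgb
    return full_rgb, static_rgb

# One opaque red static sample behind a faint blue distractor sample.
full_rgb, static_rgb = render_pixel(
    static_colors=[[1.0, 0.0, 0.0]], static_alphas=[1.0],
    distractor_colors=[[0.0, 0.0, 1.0]], distractor_alphas=[0.25],
)
```

Under this sketch, the full composite would be compared against the training image, while the static stream alone gives the distractor-free rendering. Because the two streams are kept apart, dropping the distractor stream adds no extra cost when rendering a novel view.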