ProSplat: Improved Feed-Forward 3D Gaussian Splatting for Wide-Baseline Sparse Views

📅 2025-06-09

🤖 AI Summary

To address texture degradation and geometric inconsistency in feed-forward 3D Gaussian Splatting (3DGS) for novel view synthesis (NVS) under wide-baseline sparse-view settings, this paper proposes ProSplat, a two-stage feed-forward framework: the first stage generates 3D Gaussian primitives, and the second stage refines the views rendered from them with a one-step diffusion model. The refinement model adds two components: (1) Maximum Overlap Reference view Injection (MORI), which supplements missing texture and color by injecting the reference view with the greatest viewpoint overlap; and (2) Distance-Weighted Epipolar Attention (DWEA), which enforces cross-view geometric consistency using epipolar constraints. A divide-and-conquer training strategy aligns the data distributions of the two stages through joint optimization. On RealEstate10K and DL3DV-10K under wide-baseline settings, the method achieves an average PSNR improvement of 1 dB over recent state-of-the-art methods.
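For readers who want to interpret the 1 dB figure, the following is a minimal sketch of the standard PSNR definition, assuming images scaled to [0, 1]. The function names `psnr` and `mse_ratio_for_gain` are illustrative and not taken from the paper. Because PSNR is logarithmic in mean squared error, a 1 dB gain corresponds to an MSE that is roughly 21% lower.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((32, 32, 3))
    # Hypothetical "rendered" image: the target plus small noise.
    pred = np.clip(target + rng.normal(0.0, 0.05, target.shape), 0.0, 1.0)
    print(f"PSNR: {psnr(pred, target):.2f} dB")
    print(f"MSE ratio for +1 dB: {mse_ratio_for_gain(1.0):.3f}")
```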

📝 Abstract

Feed-forward 3D Gaussian Splatting (3DGS) has recently demonstrated promising results for novel view synthesis (NVS) from sparse input views, particularly under narrow-baseline conditions. However, its performance significantly degrades in wide-baseline scenarios due to limited texture details and geometric inconsistencies across views. To address these challenges, in this paper, we propose ProSplat, a two-stage feed-forward framework designed for high-fidelity rendering under wide-baseline conditions. The first stage involves generating 3D Gaussian primitives via a 3DGS generator. In the second stage, rendered views from these primitives are enhanced through an improvement model. Specifically, this improvement model is based on a one-step diffusion model, further optimized by our proposed Maximum Overlap Reference view Injection (MORI) and Distance-Weighted Epipolar Attention (DWEA). MORI supplements missing texture and color by strategically selecting a reference view with maximum viewpoint overlap, while DWEA enforces geometric consistency using epipolar constraints. Additionally, we introduce a divide-and-conquer training strategy that aligns data distributions between the two stages through joint optimization. We evaluate ProSplat on the RealEstate10K and DL3DV-10K datasets under wide-baseline settings. Experimental results demonstrate that ProSplat achieves an average improvement of 1 dB in PSNR compared to recent SOTA methods.
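The abstract does not spell out how DWEA weights features, so the following is only a sketch of one plausible reading: attention logits for source-view pixels are penalized by their distance to the epipolar line of the query pixel. The Gaussian penalty, the `sigma` parameter, and both function names are assumptions for illustration, not the authors' formulation.

```python
import numpy as np


def epipolar_distance(F, x_tgt, xs_src):
    """Distance from each source pixel to the epipolar line of x_tgt.

    F: (3, 3) fundamental matrix with x_src^T F x_tgt = 0 for matches.
    x_tgt: (2,) pixel in the target view; xs_src: (N, 2) source pixels.
    """
    line = F @ np.array([x_tgt[0], x_tgt[1], 1.0])  # (a, b, c): ax + by + c = 0
    xs_h = np.concatenate([xs_src, np.ones((len(xs_src), 1))], axis=1)
    return np.abs(xs_h @ line) / (np.hypot(line[0], line[1]) + 1e-12)


def distance_weighted_attention(q, k, v, dist, sigma=2.0):
    """Single-query attention with an assumed Gaussian epipolar penalty.

    q: (D,), k: (N, D), v: (N, Dv), dist: (N,) epipolar distances in pixels.
    """
    logits = k @ q / np.sqrt(q.shape[0]) - (dist / sigma) ** 2
    logits = logits - logits.max()  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()
    return weights @ v, weights


if __name__ == "__main__":
    # Pure horizontal translation with identity intrinsics: epipolar lines
    # are the image rows, so the distance is the row offset.
    F = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
    xs_src = np.array([[0.0, 5.0], [10.0, 5.0], [4.0, 8.0]])
    dist = epipolar_distance(F, np.array([3.0, 5.0]), xs_src)
    out, w = distance_weighted_attention(np.zeros(4), np.zeros((3, 4)), np.eye(3), dist)
    print(dist, w)
```

With identical keys, the two source pixels that lie on the epipolar line receive equal weight and the off-line pixel is down-weighted, which is the behavior an epipolar constraint is meant to encourage.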

Problem

Research questions and friction points this paper is trying to address.

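Feed-forward 3D Gaussian Splatting degrades significantly when sparse input views are separated by a wide baseline

Rendered views lack texture detail and show geometric inconsistencies across views

High-fidelity novel view synthesis under wide-baseline conditions therefore requires a refinement step beyond the initial 3DGS generator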

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage feed-forward framework for wide-baseline rendering

One-step diffusion model with MORI and DWEA enhancements

Divide-and-conquer training strategy for joint optimization
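The summary above says MORI picks the reference view with maximum viewpoint overlap but does not define the overlap measure. The sketch below assumes one simple proxy: the fraction of 3D points seen from the target view that project inside a candidate input view. The function names, the pinhole camera convention (world-to-camera `R`, `t`), and the overlap proxy itself are hypothetical choices for illustration.

```python
import numpy as np


def overlap_fraction(points_w, K, R, t, width, height):
    """Fraction of world points that land inside a candidate view's image.

    points_w: (N, 3) world points; K: (3, 3) intrinsics;
    R, t: world-to-camera rotation (3, 3) and translation (3,).
    """
    cam = points_w @ R.T + t                 # points in camera coordinates
    z = cam[:, 2]
    in_front = z > 1e-6                      # discard points behind the camera
    uv = cam @ K.T
    z_safe = np.where(in_front, z, 1.0)      # avoid division by zero
    u, v = uv[:, 0] / z_safe, uv[:, 1] / z_safe
    inside = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return float(inside.mean())


def select_reference_view(points_w, views, width, height):
    """Return the index of the view with the largest overlap, plus all scores."""
    scores = [overlap_fraction(points_w, K, R, t, width, height) for K, R, t in views]
    return int(np.argmax(scores)), scores


if __name__ == "__main__":
    K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
    pts = np.array([[x, y, 2.0] for x in (-0.5, 0.0, 0.5) for y in (-0.5, 0.0, 0.5)])
    views = [
        (K, np.eye(3), np.array([5.0, 0.0, 0.0])),  # shifted far sideways
        (K, np.eye(3), np.zeros(3)),                # looks straight at the points
    ]
    print(select_reference_view(pts, views, 100, 100))
```

In practice the points could come from the depth of the rendered target view, but that is also an assumption here rather than something the summary states.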
