EVA-Gaussian: 3D Gaussian-based Real-time Human Novel View Synthesis under Diverse Camera Settings

📅 2024-10-02
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 3D Gaussian splatting methods suffer from degraded novel-view synthesis performance and limited fine-grained detail reconstruction under sparse, wide-baseline camera configurations. To address this, we propose an optimization framework tailored for real-time human rendering. Our key contributions are: (1) an Efficient Cross-View Attention (EVA) module that enables precise 3D Gaussian spatial localization under sparse-view conditions; (2) a recurrent feature refiner that iteratively corrects geometric distortions; and (3) a multi-scale anchor loss jointly enforcing consistency between Gaussian attributes and facial landmark geometry. Evaluated on THuman2.0 and THumansit, our method significantly improves rendering quality under large viewpoint disparities, achieves real-time inference (≥30 FPS), and sets new state-of-the-art performance in both visual fidelity and geometric consistency.

📝 Abstract
The feed-forward-based 3D Gaussian Splatting method has demonstrated exceptional capability in real-time human novel view synthesis. However, existing approaches are restricted to dense viewpoint settings, which limits their flexibility in free-viewpoint rendering across a wide range of camera view angle discrepancies. To address this limitation, we propose a real-time pipeline named EVA-Gaussian for 3D human novel view synthesis across diverse camera settings. Specifically, we first introduce an Efficient cross-View Attention (EVA) module to accurately estimate the position of each 3D Gaussian from the source images. Then, we integrate the source images with the estimated Gaussian position map to predict the attributes and feature embeddings of the 3D Gaussians. Moreover, we employ a recurrent feature refiner to correct artifacts caused by geometric errors in position estimation and enhance visual fidelity. To further improve synthesis quality, we incorporate a powerful anchor loss function for both 3D Gaussian attributes and human face landmarks. Experimental results on the THuman2.0 and THumansit datasets showcase the superiority of our EVA-Gaussian approach in rendering quality across diverse camera settings. Project page: https://zhenliuzju.github.io/huyingdong/EVA-Gaussian.
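The abstract's first step, matching pixel features across sparse source views with cross-view attention, can be sketched in a minimal form. The snippet below is a generic scaled-dot-product attention between two flattened view feature maps, not the paper's actual EVA module (whose architecture and feature shapes are not given here); the function name and all tensor shapes are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(feat_a, feat_b):
    """Hypothetical stand-in for the paper's EVA module.

    feat_a: (N, d) flattened feature map of the query view
    feat_b: (M, d) flattened feature map of the other source view
    Returns (N, d): view-A features aggregated with view-B context,
    which a downstream head could decode into a Gaussian position map.
    """
    d = feat_a.shape[-1]
    scores = feat_a @ feat_b.T / np.sqrt(d)  # (N, M) pairwise similarity
    weights = softmax(scores, axis=-1)       # attention over view-B pixels
    return weights @ feat_b                  # cross-view context per pixel

# Toy usage: two 4x4 feature maps with 8 channels each.
rng = np.random.default_rng(0)
fa = rng.normal(size=(16, 8))
fb = rng.normal(size=(16, 8))
out = cross_view_attention(fa, fb)
print(out.shape)
```

In the real pipeline this correlation step is what the "Efficient" in EVA refers to: a full attention like the one above is quadratic in pixel count, so the paper's module presumably restricts or factorizes it, but those details are not stated on this page.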
Problem

Research questions and friction points this paper is trying to address.

Real-time human novel view synthesis under diverse multi-view camera settings.
Overcoming existing methods' restriction to dense viewpoint configurations and limited image resolutions.
Enhancing visual fidelity and detail recovery in 3D human models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient Cross-View Attention module
Feature refinement for 3D Gaussians
Real-time synthesis across diverse views
Yingdong Hu — Institute for Interdisciplinary Information Sciences, Tsinghua University (computer vision, robotics)
Zhening Liu — The Hong Kong University of Science and Technology
Jiawei Shao — Institute of Artificial Intelligence (TeleAI), China Telecom
Zehong Lin — Research Assistant Professor, Hong Kong University of Science and Technology (Edge AI, Machine Learning)
Jun Zhang — The Hong Kong University of Science and Technology