EVA-Gaussian: 3D Gaussian-based Real-time Human Novel View Synthesis under Diverse Camera Settings

📅 2024-10-02
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 3D Gaussian splatting methods suffer from degraded novel-view synthesis performance and limited fine-grained detail reconstruction under sparse, wide-baseline camera configurations. To address this, we propose an optimization framework tailored for real-time human rendering. Our key contributions are: (1) an Efficient Cross-View Attention (EVA) module that enables precise 3D Gaussian spatial localization under sparse-view conditions; (2) a recurrent feature refiner that iteratively corrects geometric distortions; and (3) a multi-scale anchor loss jointly enforcing consistency between Gaussian attributes and facial landmark geometry. Evaluated on THuman2.0 and THumansit, our method significantly improves rendering quality under large viewpoint disparities, achieves real-time inference (≥30 FPS), and sets new state-of-the-art performance in both visual fidelity and geometric consistency.

📝 Abstract
The feed-forward-based 3D Gaussian Splatting method has demonstrated exceptional capability in real-time human novel view synthesis. However, existing approaches are restricted to dense viewpoint settings, which limits their flexibility in free-viewpoint rendering across a wide range of camera view angle discrepancies. To address this limitation, we propose a real-time pipeline named EVA-Gaussian for 3D human novel view synthesis across diverse camera settings. Specifically, we first introduce an Efficient cross-View Attention (EVA) module to accurately estimate the position of each 3D Gaussian from the source images. Then, we integrate the source images with the estimated Gaussian position map to predict the attributes and feature embeddings of the 3D Gaussians. Moreover, we employ a recurrent feature refiner to correct artifacts caused by geometric errors in position estimation and enhance visual fidelity. To further improve synthesis quality, we incorporate a powerful anchor loss function for both 3D Gaussian attributes and human face landmarks. Experimental results on the THuman2.0 and THumansit datasets showcase the superiority of our EVA-Gaussian approach in rendering quality across diverse camera settings. Project page: https://zhenliuzju.github.io/huyingdong/EVA-Gaussian.
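The abstract's first step, matching pixel features across sparse source views with cross-view attention, can be sketched in a minimal form. The snippet below is a generic scaled-dot-product attention between two flattened view feature maps, not the paper's actual EVA module (whose architecture and feature shapes are not given here); the function name and all tensor shapes are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(feat_a, feat_b):
    """Hypothetical stand-in for the paper's EVA module.

    feat_a: (N, d) flattened feature map of the query view
    feat_b: (M, d) flattened feature map of the other source view
    Returns (N, d): view-A features aggregated with view-B context,
    which a downstream head could decode into a Gaussian position map.
    """
    d = feat_a.shape[-1]
    scores = feat_a @ feat_b.T / np.sqrt(d)  # (N, M) pairwise similarity
    weights = softmax(scores, axis=-1)       # attention over view-B pixels
    return weights @ feat_b                  # cross-view context per pixel

# Toy usage: two 4x4 feature maps with 8 channels each.
rng = np.random.default_rng(0)
fa = rng.normal(size=(16, 8))
fb = rng.normal(size=(16, 8))
out = cross_view_attention(fa, fb)
print(out.shape)
```

In the real pipeline this correlation step is what the "Efficient" in EVA refers to: a full attention like the one above is quadratic in pixel count, so the paper's module presumably restricts or factorizes it, but those details are not stated on this page.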
Problem

Research questions and friction points this paper is trying to address.

Real-time human novel view synthesis under diverse multi-view camera settings.
Overcoming existing methods' restriction to dense viewpoint configurations and limited image resolutions.
Enhancing visual fidelity and detail recovery in 3D human models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient Cross-View Attention module
Feature refinement for 3D Gaussians
Real-time synthesis across diverse views
Yingdong Hu — Institute for Interdisciplinary Information Sciences, Tsinghua University (computer vision, robotics)
Zhening Liu — The Hong Kong University of Science and Technology
Jiawei Shao — Institute of Artificial Intelligence (TeleAI), China Telecom
Zehong Lin — Research Assistant Professor, Hong Kong University of Science and Technology (Edge AI, Machine Learning)
Jun Zhang — The Hong Kong University of Science and Technology