AI Summary
Neural rendering often suffers from geometric and appearance distortions when reconstructing complex dynamic scenes from sparse viewpoints, since limited angular coverage leaves parts of objects unobserved. To address this, we propose PS4PRO, a lightweight yet high-fidelity video frame interpolation model that pioneers frame interpolation as a geometry-aware data augmentation strategy for neural rendering. Trained on diverse video datasets with pixel-wise supervision, PS4PRO jointly models camera motion and real-world 3D geometry, serving as an implicit world prior that enriches photometric supervision for 3D reconstruction. Evaluated on multiple benchmarks, the method improves reconstruction of both static and dynamic scenes, raising PSNR and SSIM while mitigating the quality degradation caused by viewpoint sparsity.
Abstract
Neural rendering methods have gained significant attention for their ability to reconstruct 3D scenes from 2D images. The core idea is to take multiple views as input and optimize the reconstructed scene by minimizing the uncertainty in geometry and appearance across those views. However, reconstruction quality is limited by the number of input views, and this limitation is even more pronounced in complex and dynamic scenes, where certain angles of objects are never observed. In this paper, we propose to use video frame interpolation as a data augmentation method for neural rendering. Furthermore, we design a lightweight yet high-quality video frame interpolation model, PS4PRO (Pixel-to-pixel Supervision for Photorealistic Rendering and Optimization). PS4PRO is trained on diverse video datasets and implicitly models camera movement as well as real-world 3D geometry. Our model serves as an implicit world prior, enriching the photometric supervision available for 3D reconstruction. By leveraging the proposed method, we effectively augment existing datasets for neural rendering methods. Our experimental results indicate that our method improves reconstruction performance on both static and dynamic scenes.
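The augmentation pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `interpolate_frames` stands in for the learned PS4PRO model (here a simple linear blend, whereas the real network predicts motion-compensated frames), and camera poses are linearly interpolated for simplicity (a real pipeline would use SLERP for rotations). The function names and structure are assumptions for illustration only.

```python
import numpy as np

def interpolate_frames(f0, f1, t=0.5):
    # Stand-in for a learned VFI model such as PS4PRO.
    # A real model predicts motion; here we linearly blend the two frames.
    return (1.0 - t) * f0 + t * f1

def interpolate_pose(p0, p1, t=0.5):
    # Linear interpolation of camera translation; rotation components
    # would require SLERP in a real pipeline.
    return (1.0 - t) * p0 + t * p1

def augment_sequence(frames, poses):
    """Insert one interpolated frame (with a matching pose) between each
    adjacent pair, roughly doubling the photometric supervision
    available to the neural rendering method."""
    out_frames, out_poses = [frames[0]], [poses[0]]
    for i in range(1, len(frames)):
        out_frames.append(interpolate_frames(frames[i - 1], frames[i]))
        out_poses.append(interpolate_pose(poses[i - 1], poses[i]))
        out_frames.append(frames[i])
        out_poses.append(poses[i])
    return out_frames, out_poses
```

The augmented frame/pose pairs can then be fed to any neural rendering method alongside the original views; the interpolation model's implicit knowledge of motion and geometry is what makes the synthetic views useful rather than merely redundant.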