PS4PRO: Pixel-to-pixel Supervision for Photorealistic Rendering and Optimization

📅 2025-05-28
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Neural rendering often suffers from geometric and appearance distortions when reconstructing complex dynamic scenes from sparse viewpoints due to insufficient angular coverage. To address this, we propose PS4PROβ€”a lightweight yet high-fidelity implicit neural rendering framework that pioneers video frame interpolation as a geometry-aware data augmentation strategy. PS4PRO jointly models camera motion and true 3D geometry to establish an implicit world prior. Its architecture integrates a pixel-wise supervised interpolation network, multi-view photometric consistency optimization, implicit neural scene representation, and inter-frame geometric constraints. Evaluated on multiple benchmarks, PS4PRO achieves state-of-the-art performance in both static and dynamic scene reconstruction, significantly improving PSNR and SSIM while mitigating the generalization degradation caused by viewpoint sparsity.

πŸ“ Abstract
Neural rendering methods have gained significant attention for their ability to reconstruct 3D scenes from 2D images. The core idea is to take multiple views as input and optimize the reconstructed scene by minimizing the uncertainty in geometry and appearance across the views. However, the reconstruction quality is limited by the number of input views. This limitation is further pronounced in complex and dynamic scenes, where certain angles of objects are never seen. In this paper, we propose to use video frame interpolation as a data augmentation method for neural rendering. Furthermore, we design a lightweight yet high-quality video frame interpolation model, PS4PRO (Pixel-to-pixel Supervision for Photorealistic Rendering and Optimization). PS4PRO is trained on diverse video datasets, implicitly modeling camera movement as well as real-world 3D geometry. Our model serves as an implicit world prior, enriching the photo supervision for 3D reconstruction. By leveraging the proposed method, we effectively augment existing datasets for neural rendering methods. Our experimental results indicate that our method improves reconstruction performance on both static and dynamic scenes.
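The augmentation pipeline the abstract describes can be sketched in a few lines: insert a synthesized frame between each pair of adjacent captured views to densify the photo supervision. In this minimal sketch, `interpolate_frame` is a hypothetical placeholder (a naive linear blend) standing in for the paper's learned interpolation network; it illustrates only the data flow, not the actual PS4PRO model.

```python
import numpy as np

def interpolate_frame(frame_a, frame_b, t=0.5):
    """Placeholder for a learned video frame interpolation network.

    A real model (such as the paper's PS4PRO network) would predict the
    intermediate frame; here a linear blend stands in purely to show
    where the interpolator plugs into the augmentation loop.
    """
    return (1.0 - t) * frame_a + t * frame_b

def augment_views(frames):
    """Densify a sparse sequence of captured views by inserting one
    synthesized frame between each adjacent pair."""
    augmented = []
    for a, b in zip(frames[:-1], frames[1:]):
        augmented.append(a)
        augmented.append(interpolate_frame(a, b))
    augmented.append(frames[-1])
    return augmented

# Toy example: 3 captured "views" of shape (H, W, 3) become 5 training frames.
views = [np.full((4, 4, 3), v, dtype=np.float32) for v in (0.0, 0.5, 1.0)]
training_set = augment_views(views)
print(len(training_set))  # 5
```

The neural renderer then trains on `training_set` instead of `views`, so each synthesized frame contributes extra pixel-level supervision for viewpoints that were never captured.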
Problem

Research questions and friction points this paper is trying to address.

Enhances 3D scene reconstruction from limited 2D views
Improves neural rendering for complex and dynamic scenes
Uses video interpolation to augment data for better rendering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Video frame interpolation for data augmentation
Lightweight high-quality PS4PRO model
Implicit world prior for 3D reconstruction
Yezhi Shen
PhD student of ECE, Purdue University
Computer Vision
Qiuchen Zhai
School of Electrical and Computer Engineering, Purdue University
Fengqing Zhu
School of Electrical and Computer Engineering, Purdue University