🤖 AI Summary
To address the challenge of modeling dynamic 3D scenes across time from few-shot inputs, this paper proposes Temporal Generalizable NeRF—a framework enabling novel-view synthesis at arbitrary camera poses and timestamps (e.g., dawn to dusk) using only a sparse set of multi-view images. Methodologically, we introduce the first cross-dataset feature disentanglement strategy, jointly integrating multi-view stereo, volumetric rendering, and explicit illumination-geometry decoupling to achieve temporally continuous implicit scene representation—without per-scene optimization. Our contributions are threefold: (1) We establish the first few-shot benchmark for temporally continuous NeRF modeling; (2) We enable generalizable neural rendering under varying illumination conditions; and (3) We significantly improve visual realism and temporal coherence, demonstrating strong potential for immersive applications such as the metaverse on real-world scenes.
📝 Abstract
We present TimeNeRF, a generalizable neural rendering approach for rendering novel views at arbitrary viewpoints and at arbitrary times, even with few input views. For real-world applications, it is expensive to collect multiple views and inefficient to re-optimize for unseen scenes. Moreover, as the digital realm, particularly the metaverse, strives for increasingly immersive experiences, the ability to model 3D environments that naturally transition between day and night becomes paramount. While current techniques based on Neural Radiance Fields (NeRF) have shown remarkable proficiency in synthesizing novel views, the exploration of NeRF's potential for temporal 3D scene modeling remains limited, with no dedicated datasets available for this purpose. To this end, our approach harnesses the strengths of multi-view stereo, neural radiance fields, and disentanglement strategies across diverse datasets. This equips our model with the capability for generalizability in a few-shot setting, allows us to construct an implicit content radiance field for scene representation, and further enables the building of neural radiance fields at any arbitrary time. Finally, we synthesize novel views of that time via volume rendering. Experiments show that TimeNeRF can render novel views in a few-shot setting without per-scene optimization. Most notably, it excels in creating realistic novel views that transition smoothly across different times, adeptly capturing intricate natural scene changes from dawn to dusk.