🤖 AI Summary
This work addresses HDR 3D scene reconstruction from casually captured videos exhibiting automatic exposure, severe motion blur, and unknown per-frame exposure times. We propose the first end-to-end method for this challenging setting. Our core innovation is a unified, differentiable physical imaging model that jointly optimizes the camera response function (CRF), per-frame exposure durations, camera poses, and an HDR 3D Gaussian splatting representation. By incorporating continuous-time camera trajectory modeling and differentiable HDR rendering, our approach achieves single-stage, registration-free reconstruction that is robust to motion blur. Unlike conventional HDR reconstruction methods, which rely on fixed camera poses and controlled multi-exposure sequences, ours operates directly on unstructured, in-the-wild video. Experiments demonstrate substantial improvements in HDR reconstruction fidelity and novel-view synthesis quality on complex real-world scenes, outperforming state-of-the-art HDR-NeRF and HDR-3DGS. The method supports input from consumer-grade devices, including smartphones.
📝 Abstract
Recently, photo-realistic novel view synthesis from multi-view images, with methods such as neural radiance fields (NeRF) and 3D Gaussian Splatting (3DGS), has garnered widespread attention due to its superior performance. However, most works rely on low dynamic range (LDR) images, which limits the capture of richer scene details. Some prior works have focused on high dynamic range (HDR) scene reconstruction, but they typically require multi-view sharp images captured with different exposure times while the camera is held at a fixed position during each exposure, which is time-consuming and challenging in practice. For more flexible data acquisition, we propose a one-stage method, CasualHDRSplat, to easily and robustly reconstruct the 3D HDR scene from casually captured videos with auto-exposure enabled, even in the presence of severe motion blur and varying, unknown exposure times. CasualHDRSplat contains a unified differentiable physical imaging model that first applies a continuous-time trajectory constraint to the imaging process, so that we can jointly optimize the exposure time, camera response function (CRF), camera poses, and sharp 3D HDR scene. Extensive experiments demonstrate that our approach outperforms existing methods in terms of robustness and rendering quality. Our source code will be available at https://github.com/WU-CVGL/CasualHDRSplat