🤖 AI Summary
To address severe motion blur induced by high-speed motion and the reconstruction challenges posed by SPAD's binary imaging modality, this paper proposes the first end-to-end neural rendering framework tailored for SPAD imaging. Methodologically, the authors design a SPAD-aware neural rendering pipeline incorporating 3D spatial filtering for effective denoising of binary-image renderings; introduce both no-reference colorization via generative priors and reference-based colorization guided by a single blurry frame for robust color recovery; and extend the framework to support dynamic scene modeling. The key contributions are: (1) the first deep integration of SPAD physics and binary measurement characteristics with neural rendering; (2) significant improvements in geometric accuracy and appearance fidelity under motion blur on PhotonScenes, a newly contributed real-world multi-view SPAD dataset; and (3) reconstructions that directly enable downstream vision tasks—including segmentation, detection, and editing—without additional post-processing.
📝 Abstract
Advances in 3D reconstruction using neural rendering have enabled high-quality 3D capture. However, these methods often fail when the input imagery is corrupted by motion blur caused by fast motion of the camera or of objects in the scene. This work advances neural rendering in such scenarios by using single-photon avalanche diode (SPAD) arrays, an emerging sensing technology capable of capturing images at extremely high speeds. However, SPADs present their own set of unique challenges in the form of binary images driven by stochastic photon arrivals. To address this, we introduce PhotonSplat, a framework designed to reconstruct 3D scenes directly from SPAD binary images, effectively navigating the noise vs. blur trade-off. Our approach incorporates a novel 3D spatial filtering technique to reduce noise in the renderings. The framework also supports both no-reference colorization using generative priors and reference-based colorization from a single blurry image, enabling downstream applications such as segmentation, object detection, and appearance editing. Additionally, we extend our method to incorporate dynamic scene representations, making it suitable for scenes with moving objects. We further contribute PhotonScenes, a real-world multi-view dataset captured with SPAD sensors.
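The noise vs. blur trade-off mentioned above follows from standard single-photon imaging physics: each SPAD pixel records a 1 if at least one photon is detected during an exposure, so a binary frame is a Bernoulli sample with firing probability 1 − exp(−η·φ), where φ is the incident flux and η the detection efficiency. Averaging many frames reduces noise but, under motion, accumulates blur. The sketch below is a minimal, hypothetical simulation of this image-formation model and its maximum-likelihood inversion; the function names and parameters are illustrative, not from the paper.

```python
import numpy as np

def spad_binary_frames(flux, n_frames, eta=1.0, rng=None):
    """Simulate SPAD binary frames. Photon counts per exposure are
    Poisson(eta * flux), so the probability a pixel fires (records 1)
    is p = 1 - exp(-eta * flux). Returns (n_frames, H, W) booleans."""
    rng = np.random.default_rng() if rng is None else rng
    p = 1.0 - np.exp(-eta * np.asarray(flux, dtype=float))
    return rng.random((n_frames,) + p.shape) < p

def mle_flux(binary_frames, eta=1.0):
    """Maximum-likelihood flux estimate from the per-pixel firing rate:
    flux = -ln(1 - mean) / eta. Using more frames lowers the variance of
    `mean` (less noise) but, if the scene moves between frames, the
    average mixes different scene content (more blur)."""
    mean = binary_frames.mean(axis=0)
    mean = np.clip(mean, 0.0, 1.0 - 1e-6)  # guard against log(0)
    return -np.log(1.0 - mean) / eta
```

With a static flux of 0.5, the estimate converges to the true value as the number of averaged frames grows; PhotonSplat's premise is to avoid this averaging step entirely by reconstructing the 3D scene directly from the individual binary frames.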