High-Speed Dynamic 3D Imaging with Sensor Fusion Splatting

📅 2025-02-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the limitations of RGB cameras—namely, low frame rate, motion blur from long exposure, and insufficient stereo baseline—in high-speed dynamic 3D scene reconstruction, this paper proposes the first multi-modal deformable 3D Gaussian representation framework integrating RGB, depth, and event cameras. Methodologically, it unifies microsecond-resolution event streams with dense RGB/depth observations into a time-varying 3D Gaussian representation, jointly optimizing geometry, appearance, and temporal deformation fields. Key technical innovations include event-driven temporal modeling, cross-modal differentiable rendering, and multi-sensor spatiotemporal alignment. Experiments on both synthetic and real-world datasets demonstrate significant improvements over state-of-the-art methods, achieving superior rendering fidelity and structural accuracy under challenging conditions—including low-light illumination, narrow stereo baselines, and rapid motion. This work marks the first successful integration of event cameras with conventional vision modalities within the deformable Gaussian splatting paradigm.

Technology Category

Application Category

📝 Abstract
Capturing and reconstructing high-speed dynamic 3D scenes has numerous applications in computer graphics, vision, and interdisciplinary fields such as robotics, aerodynamics, and evolutionary biology. However, achieving this using a single imaging modality remains challenging. For instance, traditional RGB cameras suffer from low frame rates, limited exposure times, and narrow baselines. To address this, we propose a novel sensor fusion approach using Gaussian splatting, which combines RGB, depth, and event cameras to capture and reconstruct deforming scenes at high speeds. The key insight of our method lies in leveraging the complementary strengths of these imaging modalities: RGB cameras capture detailed color information, event cameras record rapid scene changes with microsecond resolution, and depth cameras provide 3D scene geometry. To unify the underlying scene representation across these modalities, we represent the scene using deformable 3D Gaussians. To handle rapid scene movements, we jointly optimize the 3D Gaussian parameters and their temporal deformation fields by integrating data from all three sensor modalities. This fusion enables efficient, high-quality imaging of fast and complex scenes, even under challenging conditions such as low light, narrow baselines, or rapid motion. Experiments on synthetic and real datasets captured with our prototype sensor fusion setup demonstrate that our method significantly outperforms state-of-the-art techniques, achieving noticeable improvements in both rendering fidelity and structural accuracy.
Problem

Research questions and friction points this paper is trying to address.

High-speed dynamic 3D scene capture
Sensor fusion for complex imaging
Deforming scene reconstruction optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines RGB, depth, event cameras
Uses Gaussian splatting for 3D scenes
Optimizes temporal deformation fields
🔎 Similar Papers
No similar papers found.