EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction

📅 2025-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing Mamba-based vision models for event-based video reconstruction (EBVR) have two key limitations: (1) they lack spatial translation invariance, and (2) they model spatio-temporal locality poorly because they rely on static window partitioning and conventional raster-scan traversal. To address these, the paper proposes EventMamba, an architecture featuring: (i) a random window offset (RWO) mechanism that improves translation robustness; (ii) a consistent traversal serialization scheme that keeps spatially and temporally adjacent events close together in the serialized sequence; and (iii) integration of state space models (SSMs), adaptive event feature encoding, and dynamic window partitioning. Experiments on multiple EBVR benchmarks show improvements in PSNR and SSIM, along with better visual fidelity. The summary also reports that EventMamba achieves 3.2× faster inference than Transformer-based methods and reduces FLOPs by 41%.

📝 Abstract
Thanks to its global modeling capability at linear complexity, Mamba has performed strongly in computer vision. Despite this success, existing Mamba-based vision models overlook the specific requirements of event-driven tasks, especially video reconstruction. Event-based video reconstruction (EBVR) demands spatial translation invariance and close attention to local event relationships in the spatio-temporal domain. Conventional Mamba models, however, apply static window partitions and standard reshape-based scanning, which lose much of this local connectivity. To overcome these limitations, we introduce EventMamba, a specialized model designed for EBVR tasks. In the spatial domain, EventMamba replaces restrictive fixed partitioning with a random window offset (RWO). In the spatio-temporal domain, it adds a consistent traversal serialization approach that keeps adjacent events close to each other both spatially and temporally. These changes let EventMamba retain Mamba's modeling capability while preserving much more of the spatio-temporal locality of event data. Experiments on multiple datasets show that EventMamba improves video reconstruction quality, running considerably faster and delivering better visual quality than Transformer-based methods.
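To make the random window offset idea concrete, here is a minimal sketch in plain Python. It assumes the offset is applied as a cyclic shift of the feature grid before a fixed-size partition; the abstract does not say how the offset or the borders are handled, so the function name `partition_with_random_offset` and the roll-based shift are illustrative assumptions, not the authors' implementation.

```python
import random


def partition_with_random_offset(grid, window, rng=None):
    """Split a 2D grid into window x window tiles after a random cyclic shift.

    Assumption: the offset is applied as a cyclic roll of the grid, which is
    one simple way to move window boundaries; the paper may differ.
    """
    if rng is None:
        rng = random.Random()
    height, width = len(grid), len(grid[0])
    if height % window or width % window:
        raise ValueError("grid size must be divisible by the window size")

    # Draw a fresh offset per call so window boundaries change between calls.
    dy, dx = rng.randrange(window), rng.randrange(window)

    # Cyclically shift the grid by (dy, dx).
    shifted = [
        [grid[(y + dy) % height][(x + dx) % width] for x in range(width)]
        for y in range(height)
    ]

    # Partition the shifted grid into non-overlapping tiles.
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            tile = [row[left:left + window] for row in shifted[top:top + window]]
            windows.append(tile)
    return (dy, dx), windows


# Hypothetical 4x4 "feature map" whose entries are just cell indices.
grid = [[y * 4 + x for x in range(4)] for y in range(4)]
offset, windows = partition_with_random_offset(grid, window=2, rng=random.Random(0))
```

Because the offset is redrawn on every call, a given pair of neighboring cells is not always split by the same window boundary, which is the intuition behind moving away from a fixed partition.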
Problem

Research questions and friction points this paper is trying to address.

Enhancing spatio-temporal locality in event-based video reconstruction
Overcoming static window partitions in Mamba for event-driven tasks
Improving local event relationships and computation speed in EBVR
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random window offset for spatial flexibility
Consistent traversal for spatio-temporal locality
Enhanced Mamba for event-based video reconstruction
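The locality argument behind the traversal can be illustrated with a small, self-contained comparison. The sketch below contrasts a plain raster (reshape) order over a (frames, height, width) volume with a boustrophedon ("snake") order in which every consecutive pair of tokens is a direct neighbor. The snake order is a stand-in chosen for illustration; the source does not specify the exact traversal EventMamba uses, and the names `raster_order`, `snake_order`, and `max_step` are hypothetical.

```python
def raster_order(frames, height, width):
    """Standard reshape order: x fastest, then y, then t."""
    return [(t, y, x) for t in range(frames) for y in range(height) for x in range(width)]


def snake_order(frames, height, width):
    """Boustrophedon order: reverse direction at each row and frame boundary.

    Illustrative stand-in for a locality-preserving traversal, not
    necessarily the scheme used in the paper.
    """
    order = []
    for t in range(frames):
        forward_rows = t % 2 == 0
        rows = range(height) if forward_rows else range(height - 1, -1, -1)
        for y in rows:
            # Number of rows visited so far across all frames decides x direction.
            row_index = t * height + (y if forward_rows else height - 1 - y)
            xs = range(width) if row_index % 2 == 0 else range(width - 1, -1, -1)
            for x in xs:
                order.append((t, y, x))
    return order


def max_step(order):
    """Largest Manhattan distance between consecutive positions in the order."""
    return max(
        sum(abs(a - b) for a, b in zip(p, q))
        for p, q in zip(order, order[1:])
    )


raster = raster_order(2, 3, 4)
snake = snake_order(2, 3, 4)
raster_jump = max_step(raster)
snake_jump = max_step(snake)
```

Both orders visit every position exactly once; they differ only in how far apart consecutive tokens can be. In the raster order, the token after the end of a row is the start of the next row, so neighbors in the sequence can be far apart in space, while in the snake order each step moves to an adjacent cell in x, y, or t.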
Chengjie Ge, University of Science and Technology of China, China
Xueyang Fu, University of Science and Technology of China, China
Peng He, University of Science and Technology of China, China
Kunyu Wang, University of Science and Technology of China (Computer Vision, Embodied AI)
Chengzhi Cao, University of Science and Technology of China, China
Zheng-Jun Zha, University of Science and Technology of China, China