Benchmarking Recurrent Event-Based Object Detection for Industrial Multi-Class Recognition on MTEvent

📅 2026-03-23
🤖 AI Summary
This study addresses the lack of systematic evaluation of event-based object detection in industrial multi-class environments, where prior work has largely focused on outdoor driving or limited-category scenarios. The authors benchmark the recurrent architecture ReYOLOv8s on the MTEvent dataset, analyzing the impact of temporal memory, pretraining domain alignment, and event clip length on detection performance relative to a non-recurrent YOLOv8s baseline. On the MTEvent validation split, the best recurrent model trained from scratch (C21) achieves an mAP50 of 0.285, a 9.6% relative improvement over the baseline (0.260). Pretraining on GEN1 followed by fine-tuning yields the best overall mAP50 of 0.329 at clip length 21, and GEN1-pretrained models, unlike scratch-trained ones, improve consistently as clip length grows. The results underline the role of recurrent modeling and temporal information in multi-class event-based detection for industrial settings.

📝 Abstract
Event cameras are attractive for industrial robotics because they provide high temporal resolution, high dynamic range, and reduced motion blur. However, most event-based object detection studies focus on outdoor driving scenarios or limited-class settings. In this work, we benchmark recurrent ReYOLOv8s on MTEvent for industrial multi-class recognition and use a non-recurrent YOLOv8s variant as a baseline to analyze the effect of temporal memory. On the MTEvent validation split, the best scratch recurrent model (C21) reaches 0.285 mAP50, corresponding to a 9.6% relative improvement over the non-recurrent YOLOv8s baseline (0.260). Event-domain pretraining has a stronger effect: GEN1-initialized fine-tuning yields the best overall result of 0.329 mAP50 at clip length 21, and unlike scratch training, GEN1-pretrained models improve consistently with clip length. PEDRo initialization drops to 0.251, indicating that mismatched source-domain pretraining can be less effective than training from scratch. Persistent failure modes are dominated by class imbalance and human-object interaction. Overall, we position this work as a focused benchmarking and analysis study of recurrent event-based detection in industrial environments.
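The relative-improvement figure in the abstract can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name and the dictionary keys are hypothetical labels chosen here, and the only inputs are the four mAP50 values quoted in the abstract. Of the percentages it prints, only the 9.6% figure is reported by the authors; the others are derived from the quoted values.

```python
def relative_improvement(score: float, baseline: float) -> float:
    """Return the relative change of `score` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (score - baseline) / baseline * 100.0


# mAP50 values quoted in the abstract (MTEvent validation split).
# The keys are hypothetical labels, not identifiers used by the authors.
MAP50 = {
    "yolov8s_nonrecurrent": 0.260,   # baseline
    "reyolov8s_scratch_c21": 0.285,  # best recurrent model trained from scratch
    "reyolov8s_gen1_c21": 0.329,     # GEN1-initialized, clip length 21
    "reyolov8s_pedro": 0.251,        # PEDRo-initialized
}

baseline = MAP50["yolov8s_nonrecurrent"]
for name, score in MAP50.items():
    if name == "yolov8s_nonrecurrent":
        continue  # skip comparing the baseline with itself
    change = relative_improvement(score, baseline)
    print(f"{name}: {change:+.1f}% relative to the baseline")
```

The scratch entry reproduces the reported 9.6%. By the same arithmetic, the GEN1-initialized model sits about 26.5% above the non-recurrent baseline and the PEDRo-initialized model about 3.5% below it.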
Problem

Research questions and friction points this paper is trying to address.

event-based object detection
industrial multi-class recognition
recurrent detection
temporal memory
benchmarking
Innovation

Methods, ideas, or system contributions that make the work stand out.

event-based vision
recurrent object detection
industrial multi-class recognition
temporal memory
domain pretraining
Lokeshwaran Manohar
Chair of Material Handling and Warehousing, Technical University of Dortmund, Dortmund, Germany
Moritz Roidl
TU Dortmund University