Benchmarking Recurrent Event-Based Object Detection for Industrial Multi-Class Recognition on MTEvent

📅 2026-03-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the lack of systematic evaluation of event-based object detection in industrial multi-class environments; prior work has largely focused on outdoor driving or limited-category scenarios. The authors benchmark the recurrent architecture ReYOLOv8s on the MTEvent dataset and analyze how temporal memory, pretraining domain alignment, and event clip length affect detection performance relative to a non-recurrent YOLOv8s baseline. On the MTEvent validation split, the best recurrent model trained from scratch (C21) achieves an mAP50 of 0.285, a 9.6% relative improvement over the baseline (0.260). Pretraining on GEN1 followed by fine-tuning yields the best overall result, an mAP50 of 0.329 at clip length 21, and GEN1-pretrained models improve consistently as clip length grows, unlike models trained from scratch. These results indicate that both recurrent temporal modeling and well-matched event-domain pretraining matter for multi-class event-based detection in industrial settings.

📝 Abstract

Event cameras are attractive for industrial robotics because they provide high temporal resolution, high dynamic range, and reduced motion blur. However, most event-based object detection studies focus on outdoor driving scenarios or limited class settings. In this work, we benchmark recurrent ReYOLOv8s on MTEvent for industrial multi-class recognition and use a non-recurrent YOLOv8s variant as a baseline to analyze the effect of temporal memory. On the MTEvent validation split, the best scratch recurrent model (C21) reaches 0.285 mAP50, corresponding to a 9.6% relative improvement over the non-recurrent YOLOv8s baseline (0.260). Event-domain pretraining has a stronger effect: GEN1-initialized fine-tuning yields the best overall result of 0.329 mAP50 at clip length 21, and unlike scratch training, GEN1-pretrained models improve consistently with clip length. PEDRo initialization drops to 0.251, indicating that mismatched source-domain pretraining can be less effective than training from scratch. Persistent failure modes are dominated by class imbalance and human-object interaction. Overall, we position this work as a focused benchmarking and analysis study of recurrent event-based detection in industrial environments.
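
As a quick check on the arithmetic behind the reported numbers, the following minimal sketch recomputes relative changes in mAP50 against the non-recurrent baseline. Only the 9.6% figure is stated in the abstract; the other two percentages are derived here from the reported mAP50 values and do not appear in the source. The function name `relative_improvement` and the dictionary labels are illustrative assumptions, not identifiers from the authors' code.

```python
def relative_improvement(score: float, baseline: float) -> float:
    """Return the relative change of `score` over `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


# mAP50 of the non-recurrent YOLOv8s baseline, as reported in the abstract.
BASELINE_MAP50 = 0.260

# mAP50 values reported in the abstract; the labels are illustrative only.
reported = {
    "scratch recurrent (C21)": 0.285,
    "GEN1-pretrained, clip length 21": 0.329,
    "PEDRo-initialized": 0.251,
}

gains = {
    name: relative_improvement(value, BASELINE_MAP50)
    for name, value in reported.items()
}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}% vs. baseline")
```

The first line of output reproduces the 9.6% relative improvement quoted above. Note that the abstract compares PEDRo initialization with training from scratch rather than with the baseline; the baseline comparison is included here only to put all three values on the same scale.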

Problem

Research questions and friction points this paper is trying to address.

event-based object detection

industrial multi-class recognition

recurrent detection

temporal memory

benchmarking

Innovation

Methods, ideas, or system contributions that make the work stand out.

event-based vision

recurrent object detection

industrial multi-class recognition

temporal memory

domain pretraining
