ATARS: An Aerial Traffic Atomic Activity Recognition and Temporal Segmentation Dataset

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing traffic atomic activity datasets predominantly adopt front-view perspectives and video-level annotations, which limits their use for modeling the dynamics of an entire intersection and rules out fine-grained temporal analysis of untrimmed videos. To address this, we propose ATARS, the first aerial-view, frame-level, multi-label dataset designed for intersection-level traffic atomic activity analysis. We formally introduce the task of *multi-label temporal atomic activity recognition*, curate a large-scale aerial video corpus with manual frame-level annotations, and design an evaluation protocol tailored for small-object detection and recognition. Benchmarking state-of-the-art models, including MS-TCN and TadTR, reveals significant performance degradation under scale variation, occlusion, and concurrent multi-activity scenarios. ATARS establishes a reproducible benchmark for traffic understanding and exposes open challenges for future research.

📝 Abstract
Traffic atomic activity, which describes traffic patterns for topological intersection dynamics, is a crucial topic for the advancement of intelligent driving systems. However, existing atomic activity datasets are collected from an egocentric view, which cannot support scenarios that require the traffic activities of an entire intersection. Moreover, existing datasets provide only video-level atomic activity annotations, which demand exhausting manual effort to trim the videos before recognition and limit their application to untrimmed videos. To bridge this gap, we introduce the Aerial Traffic Atomic Activity Recognition and Segmentation (ATARS) dataset, the first aerial dataset designed for multi-label atomic activity analysis. We provide atomic activity labels for each frame, which accurately record the intervals of traffic activities. We also propose a novel task, Multi-label Temporal Atomic Activity Recognition, which enables the study of accurate temporal localization of atomic activities and eases the burden of manual video trimming for recognition. We conduct extensive experiments to evaluate existing state-of-the-art models on both atomic activity recognition and temporal atomic activity segmentation. The results highlight the unique challenges of the ATARS dataset, such as recognizing the activities of extremely small objects. We further provide a comprehensive discussion analyzing these challenges and offer insights into future directions for improving atomic activity recognition in aerial views. Our source code and dataset are available at https://github.com/magecliff96/ATARS/
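To make the idea of frame-level labels recording activity intervals concrete, here is a minimal sketch of how a per-frame, multi-label annotation could be turned into temporal segments. It is an illustration only: the binary frames-by-classes layout, the function name `frames_to_segments`, and the class names `activity_a` and `activity_b` are assumptions, not the ATARS annotation format or the authors' code.

```python
from typing import Dict, List, Sequence, Tuple


def frames_to_segments(
    frame_labels: Sequence[Sequence[int]],
    class_names: Sequence[str],
) -> Dict[str, List[Tuple[int, int]]]:
    """Convert a T x C binary frame-label matrix into per-class intervals.

    Each interval is (start_frame, end_frame) with end_frame exclusive.
    Several classes may be active in the same frame (multi-label).
    """
    segments: Dict[str, List[Tuple[int, int]]] = {name: [] for name in class_names}
    for c, name in enumerate(class_names):
        start = None  # first frame of the currently open interval, if any
        for t, row in enumerate(frame_labels):
            active = bool(row[c])
            if active and start is None:
                start = t  # the activity begins at this frame
            elif not active and start is not None:
                segments[name].append((start, t))  # the activity ended before t
                start = None
        if start is not None:
            # the activity is still running at the last frame of the video
            segments[name].append((start, len(frame_labels)))
    return segments


if __name__ == "__main__":
    # Hypothetical 5-frame clip with two activity classes.
    labels = [
        [1, 0],
        [1, 1],
        [0, 1],
        [0, 1],
        [1, 0],
    ]
    print(frames_to_segments(labels, ["activity_a", "activity_b"]))
```

In this toy clip both classes are active in frame 1, the kind of overlap that video-level labels on trimmed clips cannot localize in time.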
Problem

Research questions and friction points this paper is trying to address.

Lack of aerial datasets for intersection traffic activity analysis
Absence of frame-level atomic activity labels in existing datasets
Need for multi-label temporal activity recognition in untrimmed videos
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aerial dataset for multi-label activity analysis
Frame-level atomic activity labels provided
Multi-label temporal activity recognition task introduced
🔎 Similar Papers
No similar papers found.
Zihao Chen
National Chengchi University
Hsuanyu Wu
National Chengchi University
Chi-Hsi Kung
Visiting Researcher, Indiana University Bloomington
Video Understanding, Compositional Representation, Robotics, Cognitive Science
Yi-Ting Chen
National Yang Ming Chiao Tung University
Yan-Tsung Peng
National Chengchi University