The Urban Vision Hackathon Dataset and Models: Towards Image Annotations and Accurate Vision Models for Indian Traffic

📅 2025-11-04
🤖 AI Summary
To address the scarcity of large-scale, annotated datasets for India's complex traffic environments, this paper introduces UVH-26, a real-world traffic image dataset tailored to India. It comprises 26,646 high-resolution (1080p) images sampled from 2,800 Safe-City CCTV cameras across Bengaluru and covers 14 locally prevalent vehicle categories. Annotations were crowdsourced through a hackathon, and consensus ground truth was derived from them using Majority Voting and the STAPLE algorithm. Using UVH-26, the authors benchmark contemporary object detectors (YOLO11, RT-DETR, and DAMO-YOLO) and report mAP50:95 improvements of 8.4-31.5% over equivalent COCO-trained baselines; RT-DETR-X performs best at 0.67, compared with 0.40 for COCO-trained weights on the common classes (Car, Bus, and Truck). The work addresses a gap in existing global benchmarks, which do not capture the heterogeneous traffic of emerging economies. The release includes all images, both sets of consensus annotations, and six fine-tuned detection models.

📝 Abstract
This report describes the UVH-26 dataset, the first public release by AIM@IISc of a large-scale dataset of annotated traffic-camera images from India. The dataset comprises 26,646 high-resolution (1080p) images sampled from 2,800 of Bengaluru's Safe-City CCTV cameras over a 4-week period, and subsequently annotated through a crowdsourced hackathon involving 565 college students from across India. In total, 1.8 million bounding boxes were labeled across 14 vehicle classes specific to India: Cycle, 2-Wheeler (Motorcycle), 3-Wheeler (Auto-rickshaw), LCV (Light Commercial Vehicles), Van, Tempo-traveller, Hatchback, Sedan, SUV, MUV, Mini-bus, Bus, Truck and Other. Of these, 283k-316k consensus ground truth bounding boxes and labels were derived for distinct objects in the 26k images using the Majority Voting and STAPLE algorithms. Further, we train multiple contemporary detectors, including YOLO11-S/X, RT-DETR-S/X, and DAMO-YOLO-T/L, using these datasets, and report accuracy based on mAP50, mAP75 and mAP50:95. Models trained on UVH-26 achieve 8.4-31.5% improvements in mAP50:95 over equivalent baseline models trained on the COCO dataset, with RT-DETR-X showing the best performance at 0.67 (mAP50:95), as compared to 0.40 for COCO-trained weights on the common classes (Car, Bus, and Truck). This demonstrates the benefits of domain-specific training data for Indian traffic scenarios. The release package provides the 26k images with consensus annotations based on Majority Voting (UVH-26-MV) and STAPLE (UVH-26-ST), and the 6 fine-tuned YOLO and DETR models on each of these datasets. By capturing the heterogeneity of Indian urban mobility directly from operational traffic-camera streams, UVH-26 addresses a critical gap in existing global benchmarks, and offers a foundation for advancing detection, classification, and deployment of intelligent transportation systems in emerging nations with complex traffic conditions.
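The metrics named in the abstract (mAP50, mAP75, mAP50:95) all rest on the intersection-over-union (IoU) between a predicted box and a ground-truth box. The sketch below is illustrative only and is not taken from the report: it assumes boxes in `(x1, y1, x2, y2)` pixel format and the COCO convention that mAP50:95 averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The helper name `thresholds_matched` is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95 (ten values).
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(pred_box, gt_box):
    """Hypothetical helper: thresholds at which this prediction counts as a match."""
    overlap = iou(pred_box, gt_box)
    return [t for t in COCO_IOU_THRESHOLDS if overlap >= t]
```

A prediction that overlaps its ground truth with IoU 0.72 counts as a true positive for mAP50 but not for mAP75, which is why mAP50:95 rewards tighter localization than mAP50 alone.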
Problem

Research questions and friction points this paper is trying to address.

Creating a large-scale annotated traffic image dataset for Indian urban scenarios
Developing accurate vision models for vehicle detection in complex Indian traffic
Addressing the domain gap in global benchmarks for transportation systems in emerging nations
Innovation

Methods, ideas, or system contributions that make the work stand out.

A crowdsourced hackathon produced the annotations for the Indian traffic dataset
Majority Voting and STAPLE algorithms generated the consensus annotations
YOLO and DETR models were fine-tuned for Indian conditions
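To make the consensus step more concrete, here is a minimal sketch of majority voting over crowdsourced boxes. It is not the authors' pipeline: the greedy IoU clustering, the 0.5 threshold, the coordinate-wise median, and the function name `majority_vote` are all assumptions chosen for illustration, and STAPLE is not covered.

```python
from collections import Counter
from statistics import median


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def majority_vote(annotations, num_annotators, iou_thr=0.5):
    """Hypothetical consensus: annotations is a list of (annotator_id, box, label)."""
    clusters = []  # each cluster groups boxes assumed to mark the same object
    for ann in annotations:
        annotator_id, box, _ = ann
        for cluster in clusters:
            seed_box = cluster[0][1]
            already_voted = any(a == annotator_id for a, _, _ in cluster)
            # Join the first cluster whose seed box overlaps enough,
            # allowing at most one box per annotator in a cluster.
            if iou(seed_box, box) >= iou_thr and not already_voted:
                cluster.append(ann)
                break
        else:
            clusters.append([ann])

    consensus = []
    for cluster in clusters:
        # Keep only objects marked by a strict majority of annotators.
        if len(cluster) * 2 > num_annotators:
            label = Counter(l for _, _, l in cluster).most_common(1)[0][0]
            box = tuple(median(b[i] for _, b, _ in cluster) for i in range(4))
            consensus.append((box, label))
    return consensus


example = [
    (0, (10, 10, 50, 50), "Sedan"),
    (1, (12, 11, 52, 49), "Sedan"),
    (2, (11, 10, 51, 50), "Hatchback"),
    (0, (200, 200, 240, 240), "Bus"),  # seen by one annotator only
]
result = majority_vote(example, num_annotators=3)
```

In this toy input the three overlapping boxes merge into one object labeled by the most frequent class, while the box drawn by a single annotator is dropped. A real pipeline would also need a tie-breaking rule for labels and a more robust matching step than greedy clustering.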
Akash Sharma
Department of Computation and Data Sciences (CDS), Indian Institute of Science, Bengaluru, India
Chinmay Mhatre
Centre for Infrastructure, Sustainable Transportation and Urban Planning (CiSTUP), Indian Institute of Science, Bengaluru, India
Sankalp Gawali
Department of Computation and Data Sciences (CDS), Indian Institute of Science, Bengaluru, India
Ruthvik Bokkasam
Department of Computation and Data Sciences (CDS), Indian Institute of Science, Bengaluru, India
Brij Kishore
Centre for Data for Public Good (CDPG), Indian Institute of Science, Bengaluru, India
Vishwajeet Pattanaik
Centre for Infrastructure, Sustainable Transportation and Urban Planning (CiSTUP), Indian Institute of Science, Bengaluru, India
Tarun Rambha
Centre for Infrastructure, Sustainable Transportation and Urban Planning (CiSTUP), Indian Institute of Science, Bengaluru, India
A. Pinjari
Centre for Infrastructure, Sustainable Transportation and Urban Planning (CiSTUP), Indian Institute of Science, Bengaluru, India
Vijay Kovvali
Centre for Infrastructure, Sustainable Transportation and Urban Planning (CiSTUP), Indian Institute of Science, Bengaluru, India
Anirban Chakraborty
Department of Computation and Data Sciences (CDS), Indian Institute of Science, Bengaluru, India
Punit Rathore
Assistant Professor, Indian Institute of Science, Bangalore, India
Unsupervised Learning, Spatio-temporal Data Mining, Anomaly Detection, Intelligent Transportation
R. Krishnapuram
Centre for Data for Public Good (CDPG), Centre for Infrastructure, Sustainable Transportation and Urban Planning (CiSTUP), Indian Institute of Science, Bengaluru, India
Yogesh L. Simmhan
Department of Computation and Data Sciences (CDS), Indian Institute of Science, Bengaluru, India