DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training

📅 2024-07-12
🏛️ Expert systems with applications
📈 Citations: 5
Influential: 0
🤖 AI Summary
Traditional object detection suffers from heavy reliance on labor-intensive manual annotations, poor generalization, and limited adaptability to novel categories and dynamic environments. To address these challenges, this work proposes a fully automated end-to-end detection pipeline. Methodologically, it introduces, for the first time, a unified framework integrating CLIP-driven open-vocabulary localization, diffusion model–enhanced feature representation, uncertainty-aware pseudo-label filtering, and an interactive human verification mechanism. This enables zero-shot category extension and closed-loop optimization with controllable annotation quality. Built upon a fine-tuned YOLOv8 backbone, the method achieves 92% of the full-supervision state-of-the-art mAP on COCO and LVIS using only 15% of the manual annotations required by conventional approaches. The proposed pipeline significantly reduces annotation cost while substantially improving cross-domain generalization capability.
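
To make the pseudo-label filtering and human verification steps concrete, here is a minimal sketch of score-based triage. It is an illustration only, not the paper's method: the `PseudoLabel` type, the `triage` function, and both thresholds are hypothetical, and detector confidence stands in for whatever uncertainty measure the pipeline actually uses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PseudoLabel:
    """Hypothetical container for one automatically generated box."""
    image_id: str
    category: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    score: float  # detector confidence in [0, 1], used as an uncertainty proxy


def triage(
    labels: List[PseudoLabel],
    accept_at: float = 0.8,     # assumed threshold, not taken from the paper
    reject_below: float = 0.3,  # assumed threshold, not taken from the paper
) -> Tuple[List[PseudoLabel], List[PseudoLabel], List[PseudoLabel]]:
    """Split pseudo-labels into accepted, needs-review, and rejected groups."""
    if not 0.0 <= reject_below <= accept_at <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= reject_below <= accept_at <= 1")

    accepted: List[PseudoLabel] = []
    review: List[PseudoLabel] = []
    rejected: List[PseudoLabel] = []
    for label in labels:
        if label.score >= accept_at:
            accepted.append(label)   # confident enough to train on directly
        elif label.score < reject_below:
            rejected.append(label)   # too uncertain to be worth a reviewer's time
        else:
            review.append(label)     # ambiguous: route to human verification
    return accepted, review, rejected


if __name__ == "__main__":
    candidates = [
        PseudoLabel("img_001", "forklift", (10.0, 20.0, 110.0, 220.0), 0.93),
        PseudoLabel("img_001", "pallet", (50.0, 60.0, 90.0, 100.0), 0.55),
        PseudoLabel("img_002", "forklift", (5.0, 5.0, 30.0, 40.0), 0.12),
    ]
    accepted, review, rejected = triage(candidates)
    print(len(accepted), len(review), len(rejected))
```

Under this scheme only the middle band reaches a reviewer, which is one way manual effort could be limited while keeping annotation quality controllable.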

Problem

Research questions and friction points this paper is trying to address.

Automates end-to-end object detection without manual labeling
Adapts to diverse environments and novel object categories
Improves accuracy from data collection to model training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated end-to-end object detection pipeline
Open-vocabulary bounding box annotation
Pseudo-label review via multimodal models