Label Curation Using Agentic AI

📅 2026-01-30
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenges of high cost, low efficiency, significant annotator bias, and unreliable label quality in large-scale multimodal data annotation by proposing AURA, a novel framework that integrates multi-agent collaboration with confusion matrix–based annotator reliability modeling. Without access to ground-truth labels, AURA jointly infers latent true labels and annotator reliability through a probabilistic graphical model combined with the Expectation-Maximization (EM) algorithm, enabling high-quality automatic label aggregation without manual pre-validation. Experimental results demonstrate that AURA improves accuracy by up to 5.8% across four benchmark datasets, with gains as high as 50% in low-quality annotation scenarios, while also providing precise quantification of annotator trustworthiness.
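At the core of this summary is a confusion-matrix reliability model. As a point of reference only (this is not AURA's code; `true_label_posterior` and the toy matrices are illustrative assumptions), the sketch below shows the E-step intuition: the posterior over an item's latent true label, weighting each annotator's response by that annotator's assumed confusion matrix.

```python
# Hypothetical sketch (not AURA's implementation): posterior over the latent
# true label of a single item, given each annotator's response and an assumed
# per-annotator confusion matrix. Reliable annotators pull the posterior
# harder than noisy ones.
import numpy as np

def true_label_posterior(responses, confusions, prior):
    """responses: {annotator_id: observed class}; confusions: {annotator_id:
    K x K matrix whose row k is P(observed | true = k)}; prior: length-K class prior."""
    log_post = np.log(prior)
    for a, obs in responses.items():
        log_post += np.log(confusions[a][:, obs] + 1e-12)  # P(obs | true = k) for every k
    log_post -= log_post.max()                              # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Toy usage: a reliable annotator says class 1, a near-random one says class 2;
# the posterior should concentrate on class 1.
K = 3
reliable = 0.9 * np.eye(K) + 0.05 * (1 - np.eye(K))
noisy = np.full((K, K), 1.0 / K)
print(true_label_posterior({0: 1, 1: 2}, {0: reliable, 1: noisy}, np.full(K, 1.0 / K)))
```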

πŸ“ Abstract
Data annotation is essential for supervised learning, yet producing accurate, unbiased, and scalable labels remains challenging as datasets grow in size and modality. Traditional human-centric pipelines are costly, slow, and prone to annotator variability, motivating reliability-aware automated annotation. We present AURA (Agentic AI for Unified Reliability Modeling and Annotation Aggregation), an agentic AI framework for large-scale, multi-modal data annotation. AURA coordinates multiple AI agents to generate and validate labels without requiring ground truth. At its core, AURA adapts a classical probabilistic model that jointly infers latent true labels and annotator reliability via confusion matrices, using Expectation-Maximization to reconcile conflicting annotations and aggregate noisy predictions. Across the four benchmark datasets evaluated, AURA achieves accuracy improvements of up to 5.8% over baseline. In more challenging settings with poor-quality annotators, the improvement is up to 50% over baseline. AURA also accurately estimates the reliability of annotators, allowing assessment of annotator quality even without any pre-validation steps.
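The abstract's classical probabilistic model with Expectation-Maximization is in the spirit of Dawid-Skene label aggregation. Below is a minimal, self-contained sketch of that generic aggregation step, assuming a dense integer label matrix with no missing annotations; the function name `em_aggregate` and its parameters are illustrative, and the multi-agent generation and validation layer described in the paper is not reproduced here.

```python
# Minimal Dawid-Skene-style EM sketch (a generic reference, not AURA itself).
# labels[i, a] = integer class that annotator a assigned to item i.
import numpy as np

def em_aggregate(labels, num_classes, num_iters=50, smoothing=1e-2):
    n_items, n_annotators = labels.shape

    # Initialize per-item class posteriors with a normalized majority vote.
    post = np.zeros((n_items, num_classes))
    for i in range(n_items):
        post[i] = np.bincount(labels[i], minlength=num_classes)
    post /= post.sum(axis=1, keepdims=True)

    for _ in range(num_iters):
        # M-step: class prior and one confusion matrix per annotator.
        prior = post.mean(axis=0)
        conf = np.full((n_annotators, num_classes, num_classes), smoothing)
        for a in range(n_annotators):
            for i in range(n_items):
                conf[a, :, labels[i, a]] += post[i]
            conf[a] /= conf[a].sum(axis=1, keepdims=True)

        # E-step: recompute posteriors over the latent true labels.
        log_post = np.tile(np.log(prior + 1e-12), (n_items, 1))
        for i in range(n_items):
            for a in range(n_annotators):
                log_post[i] += np.log(conf[a, :, labels[i, a]])
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)

    # Aggregated labels plus per-annotator reliability (confusion) estimates.
    return post.argmax(axis=1), conf
```

Majority vote only seeds the posteriors; the EM iterations then down-weight annotators whose estimated confusion matrices drift away from the diagonal, which is the kind of reliability estimate the abstract refers to.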
Problem

Research questions and friction points this paper is trying to address.

data annotation
label curation
annotator reliability
supervised learning
multi-modal data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic AI
Label Curation
Reliability Modeling
Expectation-Maximization
Multi-modal Annotation
Subhodeep Ghosh
New Jersey Institute of Technology
Bayan Divaaniaazar
New Jersey Institute of Technology
Md Ishat-E-Rabban
New Jersey Institute of Technology
Spencer Clarke
New Jersey Institute of Technology
Senjuti Basu Roy
Panasonic Chair in Sustainability and Associate Professor, Department of Computer Science, NJIT
human-in-the-loop AI, (Responsible) data management for AI, Algorithm Design, Reinforcement Learning, Future of Work