🤖 AI Summary
This study addresses the limitations of traditional colony counting, which is inefficient and error-prone, and the performance degradation of existing deep learning models on poor-quality samples or in the presence of contaminants, challenges that hinder compliance with the pharmaceutical industry's stringent accuracy and regulatory requirements. To overcome these issues, this work proposes the first multi-agent quality control system integrating a vision-language model (VLM) with deep learning detectors. The VLM pre-screens Petri dishes for validity, after which two independent agents count colonies and cross-verify each other's results: consistent outputs are recorded automatically, while discrepancies trigger expert review, whose feedback enables closed-loop optimization. Built on Detectron2, YOLO, and SAP/Postgres, the system markedly improves robustness and auditability. Experiments demonstrate a reduction in the manual review rate from 50% to 15%, with a 99% detection rate, 2% false positives, and 0.6% false negatives, offering a scalable, compliant automation solution for pharmaceutical quality control.
📝 Abstract
Colony-forming unit (CFU) detection is critical in pharmaceutical manufacturing, serving as a key component of Environmental Monitoring programs and ensuring compliance with stringent quality standards. Manual counting is labor-intensive and error-prone, while deep learning (DL) approaches, though accurate, remain vulnerable to sample quality variations and artifacts. Building on our earlier CNN-based framework (Beznik et al., 2020), we evaluated YOLOv5, YOLOv7, and YOLOv8 for CFU detection; however, these achieved only 97.08% accuracy, insufficient for pharmaceutical-grade requirements. A custom Detectron2 model trained on GSK's dataset of over 50,000 Petri dish images achieved a 99% detection rate with 2% false positives and 0.6% false negatives. Despite high validation accuracy, Detectron2 performance degrades on outlier cases such as contaminated plates, plastic artifacts, or poor optical clarity. To address this, we developed a multi-agent framework combining DL with vision-language models (VLMs). The VLM agent first classifies plates as valid or invalid. For valid samples, both the DL and VLM agents independently estimate colony counts. When the predictions agree within 5%, results are automatically recorded in Postgres and SAP; otherwise, samples are routed for expert review. Expert feedback enables continuous retraining and self-improvement. Initial DL-based automation reduced human verification by 50% across vaccine manufacturing sites. With VLM integration, this increased to 85%, delivering significant operational savings. The proposed system provides a scalable, auditable, and regulation-ready solution for microbiological quality control, advancing automation in biopharmaceutical production.
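The routing logic described in the abstract (VLM validity gate, then a 5% agreement check between the two agents' counts) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name, return labels, and the choice of the larger count as the denominator for the relative difference are all assumptions.

```python
def route_sample(vlm_valid: bool, dl_count: int, vlm_count: int,
                 tolerance: float = 0.05) -> str:
    """Hypothetical sketch of the sample-routing step.

    Returns one of:
      'invalid'       - VLM rejected the plate (contamination, artifact, blur)
      'auto_record'   - counts agree within tolerance; write to Postgres/SAP
      'expert_review' - counts disagree; route to a human microbiologist
    """
    if not vlm_valid:
        return "invalid"
    # Relative difference against the larger count; the max(..., 1)
    # guard avoids division by zero on empty plates (an assumption).
    reference = max(dl_count, vlm_count, 1)
    if abs(dl_count - vlm_count) / reference <= tolerance:
        return "auto_record"
    return "expert_review"
```

A disagreeing pair such as `route_sample(True, 100, 120)` would be escalated, while `route_sample(True, 100, 103)` falls inside the 5% band and is recorded automatically; in the deployed system the expert's corrected count would then feed retraining.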