VLCD: Vision-Language Contrastive Distillation for Accurate and Efficient Automatic Placenta Analysis

📅 2025-06-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Early placental pathology screening is critical for reducing perinatal risks, yet existing AI methods suffer from high computational overhead and poor deployability in resource-limited primary care settings. To address the cross-modal analysis requirement between placental photographs and pathology reports, this paper proposes a Text-Anchored Vision-Language Contrastive Knowledge Distillation (VLCD) framework. The method introduces an unsupervised pre-distillation initialization strategy based on natural images, significantly enhancing robustness to low-quality and low-resolution inputs, and jointly integrates vision-language contrastive learning, knowledge distillation, and lightweight network compression. Experiments demonstrate that the distilled student model matches or surpasses the teacher's performance while achieving a 3.2× speedup in inference latency and a 67% reduction in memory footprint. Notably, accuracy on low-resolution placental images improves by 8.2%, underscoring strong clinical deployability in real-world primary-care settings.

📝 Abstract
Pathological examination of the placenta is an effective method for detecting and mitigating health risks associated with childbirth. Recent advancements in AI have enabled the use of photographs of the placenta and pathology reports for detecting and classifying signs of childbirth-related pathologies. However, existing automated methods are computationally expensive, which limits their deployability. We propose two modifications to vision-language contrastive learning (VLC) frameworks to enhance their accuracy and efficiency: (1) text-anchored vision-language contrastive knowledge distillation (VLCD), a new knowledge distillation strategy for medical VLC pretraining, and (2) unsupervised predistillation using a large natural-images dataset for improved initialization. Our approach distills efficient neural networks that match or surpass the teacher model in performance while achieving model compression and acceleration. Our results showcase the value of unsupervised predistillation in improving the performance and robustness of our approach, specifically for lower-quality images. VLCD serves as an effective way to improve the efficiency and deployability of medical VLC approaches, making AI-based healthcare solutions more accessible, especially in resource-constrained environments.
Problem

Research questions and friction points this paper is trying to address.

Improving accuracy and efficiency of placenta analysis AI
Reducing computational cost for medical VLC deployment
Enhancing performance on lower-quality medical images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Text-anchored vision-language contrastive knowledge distillation
Unsupervised predistillation using natural images
Efficient neural networks with model compression
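The core idea of text-anchored distillation, as described in the abstract, is that the teacher's and student's image encoders are compared against a shared set of text embeddings, and the student is trained to reproduce the teacher's image-to-text similarity distribution. The paper does not publish its loss function here, so the sketch below is a hypothetical minimal rendering of that idea in NumPy; the function name `text_anchored_distill_loss`, the temperature value, and the soft-target cross-entropy form are assumptions, not the authors' exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def l2norm(x):
    """Row-wise L2 normalization, as in contrastive embedding spaces."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def text_anchored_distill_loss(student_img, teacher_img, text_emb, tau=0.07):
    """Hypothetical sketch of text-anchored distillation: both image encoders
    score the same text embeddings (the anchors), and the student matches the
    teacher's softened image-to-text similarity distribution via cross-entropy."""
    s_logits = l2norm(student_img) @ l2norm(text_emb).T / tau  # student vs. anchors
    t_logits = l2norm(teacher_img) @ l2norm(text_emb).T / tau  # teacher vs. anchors
    soft_targets = softmax(t_logits)        # teacher's distribution over anchors
    log_p_student = np.log(softmax(s_logits))
    # Per-image cross-entropy against the teacher's soft targets, averaged.
    return float(-(soft_targets * log_p_student).sum(axis=-1).mean())
```

Because cross-entropy H(p, q) is minimized when q equals p, a student whose image embeddings coincide with the teacher's attains the lowest possible value of this loss, which is the behavior a distillation objective of this shape should have.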
Authors

Manas Mehta — The Pennsylvania State University, University Park
Yimu Pan — PhD Candidate, The Pennsylvania State University (computer vision, multimodal, medical image analysis)
Kelly Gallagher — The Pennsylvania State University, University Park
A. Gernand — The Pennsylvania State University, University Park
Jeffery A. Goldstein — Associate Professor, Northwestern University (biomedical)
Delia Mwinyelle — University of Chicago, Chicago
Leena B. Mithal — Northwestern University Feinberg School of Medicine; Lurie Children's (pediatric infectious diseases, perinatal infections)
J. Z. Wang — The Pennsylvania State University, University Park