Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions

📅 2025-02-26
🤖 AI Summary
To reduce the manual burden of writing histopathology reports for melanocytic skin lesions, this work develops a vision-language model specific to this dermatopathology domain. The model follows the Contrastive Captioner framework, applied to hematoxylin and eosin (H&E)-stained whole-slide images (WSIs), and supports both report generation and cross-modal retrieval. It was trained and evaluated on a dataset of 42,512 WSIs and 19,645 corresponding pathology reports. In a reader study, an expert pathologist scored the model-generated reports for common nevi on par with pathologist-written reports. Report generation was more difficult for rare melanocytic lesion subtypes, although cross-modal retrieval performed considerably better for these cases.

📝 Abstract
Millions of melanocytic skin lesions are examined by pathologists each year, the majority of which are common nevi (i.e., ordinary moles). While most of these lesions can be diagnosed in seconds, writing the corresponding pathology report is much more time-consuming. Automating part of the report writing could, therefore, alleviate the increasing workload of pathologists. In this work, we develop a vision-language model specifically for the pathology domain of cutaneous melanocytic lesions. The model follows the Contrastive Captioner framework and was trained and evaluated using a melanocytic lesion dataset of 42,512 H&E-stained whole slide images and 19,645 corresponding pathology reports. Our results show that the quality scores of model-generated reports were on par with those of pathologist-written reports for common nevi, as assessed by an expert pathologist in a reader study. While report generation proved to be more difficult for rare melanocytic lesion subtypes, the cross-modal retrieval performance for these cases was considerably better.
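Cross-modal retrieval, as mentioned in the abstract, is typically done by embedding slides and reports in a shared space and ranking candidates by similarity. The sketch below illustrates that general idea with cosine similarity and a recall@k score. The abstract does not state which similarity measure or retrieval metric the authors use, so `cosine`, `rank_candidates`, `recall_at_k`, and the toy embeddings are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0


def rank_candidates(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


def recall_at_k(queries, candidates, k):
    """Fraction of queries whose paired candidate (same index) ranks in the top k."""
    hits = sum(
        1 for i, q in enumerate(queries) if i in rank_candidates(q, candidates)[:k]
    )
    return hits / len(queries)


# Hypothetical 2-D embeddings; index i of each list forms a slide-report pair.
slide_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
report_embs = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.8]]

slide_to_report = recall_at_k(slide_embs, report_embs, k=1)
report_to_slide = recall_at_k(report_embs, slide_embs, k=1)
```

Swapping the roles of the two lists gives retrieval in the opposite direction (report to slide), which is why both calls are shown.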
Problem

Research questions and friction points this paper is trying to address.

Automate pathology report writing
Improve workload efficiency for pathologists
Enhance multimodal learning for skin lesions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-language model for pathology
Contrastive Captioner framework
Multimodal representation learning
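For readers unfamiliar with the Contrastive Captioner (CoCa) framework, its training objective combines a contrastive image-text loss with a captioning (next-token prediction) loss. The following is a minimal, dependency-free sketch of that combination, not the authors' implementation: the temperature, the loss weights `w_con` and `w_cap`, the averaging of the two contrastive directions, and all function names are assumptions chosen for illustration.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0  # avoid division by zero
    return [x / norm for x in v]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss; image i and text i form the positive pair."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    sims = [[dot(i, t) / temperature for t in txts] for i in imgs]
    sims_t = [list(col) for col in zip(*sims)]
    image_to_text = -sum(log_softmax(sims[k])[k] for k in range(n)) / n
    text_to_image = -sum(log_softmax(sims_t[k])[k] for k in range(n)) / n
    return 0.5 * (image_to_text + text_to_image)  # average of both directions


def captioning_loss(token_logits, target_ids):
    """Mean cross-entropy of the target token at each position of one report."""
    nll = [-log_softmax(logits)[t] for logits, t in zip(token_logits, target_ids)]
    return sum(nll) / len(nll)


def coca_style_loss(image_embs, text_embs, token_logits, target_ids,
                    w_con=1.0, w_cap=2.0):
    """Weighted sum of the two terms; the weights here are assumed values."""
    return (w_con * contrastive_loss(image_embs, text_embs)
            + w_cap * captioning_loss(token_logits, target_ids))
```

The contrastive term shapes the shared embedding space used for retrieval, while the captioning term trains the text decoder used for report generation, which is how one model can serve both tasks.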
Authors

Ruben T. Lucassen
Dept. of Pathology, University Medical Center Utrecht, the Netherlands; Dept. of Biomedical Engineering, Eindhoven University of Technology, the Netherlands
Sander P.J. Moonemans
Dept. of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
Tijn van de Luijtgaarden
Dept. of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
Gerben E. Breimer
Dept. of Pathology, University Medical Center Utrecht, the Netherlands
Willeke A.M. Blokx
Dept. of Pathology, University Medical Center Utrecht, the Netherlands
Mitko Veta
Associate Professor, Eindhoven University of Technology
Medical Image Analysis, Digital Pathology, Machine Learning