🤖 AI Summary
This study addresses the diagnostic variability in whole-slide image (WSI) interpretation for prostate cancer Gleason grading, where discrepancies between expert and non-expert pathologists adversely affect model performance. To tackle this challenge, the authors introduce the concept of “whole-slide difficulty” (WSD), which quantifies the disagreement between an expert and a non-expert pathologist, and integrate it into a multiple instance learning (MIL) framework. Two strategies, multi-task learning and a weighted classification loss, are used to incorporate WSD across diverse feature encoders and MIL architectures. The proposed approach consistently improves classification performance, particularly for higher (more severe) Gleason grades, with notable gains on diagnostically challenging WSIs.
📝 Abstract
Multiple Instance Learning (MIL) has been widely applied in histopathology to classify Whole Slide Images (WSIs) with slide-level diagnoses. While the ground truth is established by expert pathologists, some slides are difficult for non-experts to diagnose, which leads to disagreements between annotators. In this paper, we introduce the notion of Whole Slide Difficulty (WSD), based on the disagreement between an expert and a non-expert pathologist. We propose two methods to leverage WSD, a multi-task approach and a weighted classification loss approach, and we apply them to Gleason grading of prostate cancer slides. Results show that integrating WSD during training consistently improves classification performance across different feature encoders and MIL methods, particularly for higher Gleason grades (i.e., worse diagnoses).
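To make the weighted classification loss idea concrete, here is a minimal sketch in plain Python. The abstract does not give the exact WSD definition or weighting formula, so everything specific here is an assumption for illustration: WSD is taken as a binary disagreement flag between the expert and non-expert grades, each slide is weighted by `1 + alpha * WSD`, and the names `slide_difficulty`, `wsd_weighted_cross_entropy`, and `alpha` are hypothetical. The multi-task variant is not shown.

```python
import math


def slide_difficulty(expert_grade: int, non_expert_grade: int) -> float:
    """Assumed WSD: 1.0 if the two pathologists disagree, else 0.0."""
    return float(expert_grade != non_expert_grade)


def softmax(logits):
    """Numerically stable softmax over one slide's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def wsd_weighted_cross_entropy(logits_batch, expert_grades, difficulties, alpha=1.0):
    """Weighted mean cross-entropy against the expert labels.

    Each slide's loss is scaled by 1 + alpha * WSD (an assumed scheme,
    not necessarily the one used in the paper), so slides the non-expert
    got wrong contribute more to the training signal.
    """
    weights = [1.0 + alpha * d for d in difficulties]
    losses = [
        -math.log(softmax(logits)[label])
        for logits, label in zip(logits_batch, expert_grades)
    ]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Toy batch: two slides, three grade classes (made-up numbers).
logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]]
expert = [0, 2]      # ground-truth grades from the expert
non_expert = [0, 1]  # the non-expert disagrees on the second slide
wsd = [slide_difficulty(e, n) for e, n in zip(expert, non_expert)]
loss = wsd_weighted_cross_entropy(logits, expert, wsd, alpha=1.0)
```

With `alpha=0.0` the function reduces to the ordinary mean cross-entropy, which makes it easy to compare a WSD-weighted run against an unweighted baseline under this sketch.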