Mitosis Detection in the Wild: Multi-Tumor and Context-Aware Generalization in the MIDOG 2025 Challenge

📅 2026-06-05
🤖 AI Summary
This study addresses the limited generalization of mitosis detection models in real-world clinical settings by constructing a test set of 365 cases spanning 12 human, canine, and feline tumor types, digitized across multiple scanning platforms. It introduces an integrated evaluation framework covering multiple tumor types, multiple species, and multiple contexts (hotspot, random, and challenging regions), along with a new atypical mitosis classification task. In the detection track, the best-performing model achieves an F1 score of 0.740; in the atypical mitosis classification track, the highest balanced accuracy is 0.908. Model ensembling yields mean improvements of 1.5 percentage points in F1 score and 1.3 percentage points in balanced accuracy, whereas test-time augmentation shows no relevant improvement. The analysis also reveals persistent performance gaps in challenging regions and in rare tumor types.
📝 Abstract
Automated mitosis detection is a well-established task in computational pathology. While previous benchmarks focused on scanner-induced domain shift, clinical "real-world" application requires models to be robust across the wide variance to be expected in the histological landscape. The MItosis DOmain Generalization (MIDOG) 2025 challenge was designed to evaluate algorithmic performance across unprecedented biological and contextual diversity. We curated a test dataset of 365 cases, encompassing 12 distinct human, canine, and feline tumor types, digitized across multiple scanning platforms. Moving beyond hand-selected hotspots, the challenge also required detection in random tissue areas (representative of the whole-slide detection situation) and challenging areas (areas rich in hard negatives). In the second track, we introduced the classification of atypical mitotic figures (AMFs). Eighteen teams submitted to the detection track, with F1 scores ranging up to 0.740. In the AMF classification track, we had 21 submissions with balanced accuracy values up to 0.908. Our analysis reveals that while most models perform reliably in traditional hotspots, significant performance degradation occurs in challenging ROIs, where false positive rates tripled. Furthermore, performance varied significantly across the 12 tumor types, highlighting "blind spots" in current state-of-the-art architectures when encountering rare or highly pleomorphic malignancies. Moreover, we evaluated the effectiveness of ensembling and found mean increases of 1.5 and 1.3 percentage points in F1 score and balanced accuracy, respectively. In contrast, test-time augmentation (TTA) showed no relevant improvement. MIDOG 2025 demonstrates that "in the wild" mitosis detection remains a significant hurdle. The transition from hotspot-only evaluation to a multi-contextual framework provides a more realistic proxy for clinical reliability.
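For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how an F1 score and a balanced accuracy are commonly computed from raw counts. The function names and all counts are hypothetical and chosen only for illustration; they are not challenge data, and the challenge's exact aggregation (for example, pooled versus per-case averaging) is not specified here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if all counts are zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity for a binary classification task."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Hypothetical detection counts (illustration only, not challenge results).
# The second call triples the false positives while keeping TP and FN fixed.
f1_few_fp = f1_score(tp=80, fp=10, fn=20)
f1_many_fp = f1_score(tp=80, fp=30, fn=20)

# Hypothetical atypical-vs-normal classification counts (illustration only).
bal_acc = balanced_accuracy(tp=45, fn=5, tn=180, fp=20)

print(f"F1 with 10 false positives: {f1_few_fp:.3f}")
print(f"F1 with 30 false positives: {f1_many_fp:.3f}")
print(f"Balanced accuracy: {bal_acc:.3f}")
```

With these made-up counts, tripling the false positives lowers F1 even though the number of detected mitotic figures is unchanged, which is the kind of effect the abstract describes for challenging regions. Balanced accuracy weights both classes equally, which matters when atypical figures are much rarer than normal ones.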
Problem

Research questions and friction points this paper is trying to address.

mitosis detection
domain generalization
multi-tumor
computational pathology
atypical mitotic figures
Innovation

Methods, ideas, or system contributions that make the work stand out.

domain generalization
mitosis detection
computational pathology
atypical mitotic figures
multi-tumor evaluation
Authors

Marc Aubreville
Professor at Flensburg University of Applied Sciences, Flensburg, Germany
Computer Vision, Deep Learning, Signal Processing
Jonas Ammeling
Technische Hochschule Ingolstadt
Computer Vision, Deep Learning, Computational Pathology
Sweta Banerjee
Research Assistant - Flensburg University of Applied Sciences
self-supervised learning, domain adaptation, multi-modal approaches in histopathology
Viktoria Weiss
Taryn A. Donovan
Robert Klopfleisch
Jiaqi Lv
Southeast University
Machine Learning
Shan E Ahmed Raza
Associate Professor, University of Warwick UK
Computational Pathology, Deep Learning and Artificial Intelligence, Tumour Microenvironment, HistoGenomics
Raphaël Bourgade
Thomas Walter
Full Professor, Mines Paris, PSL University and Institut Curie
Computer Vision, Artificial Intelligence, Computational Pathology, High Content Screening
Yasemin Topuz
Songül Varlı
Charles-Antoine Collins-Fekete
Zhuoyan Shen
Navya Sri Kelam
Nitin Singhal
Head of Products, AIRA Matrix
Artificial intelligence, deep learning, computer vision, HPC, computational photography
Christian Marzahl
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
Computer Vision, Deep Learning, Digital Pathology
Brian Napora
Tengyou Xu
Hongyan Gu
Mario Vento
Gennaro Percannella
University of Salerno
Pattern Recognition, Computer Vision, Real time audio and video processing, Biomedical Image Analysis
Norbert Ropiak
Izabela Wasiak
Jie Xiao
University of Science and Technology of China
low level vision, generative model, machine learning