🤖 AI Summary
Computational pathology needs general-purpose visual representations that are robust to variation in staining, scanner type, and image resolution, and that generalize across diverse clinical endpoints. This work presents DaX, a pathology vision foundation model that adapts DINOv3-style self-supervised learning to whole-slide images, combining continuous magnification sampling, cross-scale tissue views, and a Gram-anchored dense consistency mechanism to jointly model cellular morphology and tissue architecture. The authors also build a standardized WSI-level benchmark of 161 clinical tasks drawn from 44 public datasets, covering 28,182 patients and 34,394 slides. On this benchmark, DaX achieves the highest mean performance across tasks, with gains spanning diagnostic pathology, biomarker and molecular profiling, tissue/specimen context, and risk, response, and prognosis.
📝 Abstract
Computational pathology requires visual representations that transfer across diverse clinical endpoints and remain robust to variation in magnification, staining, scanner type, slide preparation, and input resolution. We present DaX, a pathology vision foundation model that adapts DINOv3-style self-supervised learning to whole-slide histopathology. DaX is initialized from natural-image DINOv3 weights and incorporates continuous magnification training, cross-scale tissue views, orientation-agnostic and acquisition-robust augmentation, multi-input-size training, and Gram-anchored dense consistency. These designs aim to connect local cellular morphology with global tissue architecture while stabilizing dense token-level representations across input scales. We further construct a WSI-level benchmark comprising 161 clinically meaningful tasks from 44 public datasets, covering 28,182 patients and 34,394 slides across four clinical domains and nine task categories. All models are evaluated under a fixed patient-level cross-validation protocol with fold-level statistical ranking, enabling reproducible comparisons that are less sensitive to split-dependent variation. Across this benchmark, DaX achieves the highest mean performance across tasks and consistently strong task-level ranking scores, with gains spanning diagnostic pathology, biomarker and molecular profiling, tissue/specimen context, and risk, response, and prognosis. These results support DaX as a transferable visual encoder for computational pathology and provide a standardized evaluation framework for future pathology foundation models. Project page: https://alibaba-damo-academy.github.io/DaX/benchboard/.
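To illustrate what a fixed patient-level cross-validation split can look like, the following is a minimal sketch and not the authors' protocol. The abstract does not state the number of folds or how patients are assigned, so the function name `patient_level_folds`, the default of five folds, the round-robin assignment, and the seed are all assumptions made for illustration. The point it shows is that every slide from one patient lands in the same fold, and that a fixed seed makes the split reproducible.

```python
import random
from collections import defaultdict


def patient_level_folds(slide_to_patient, n_folds=5, seed=0):
    """Group slides into folds so that no patient spans two folds.

    slide_to_patient: dict mapping slide ID -> patient ID (hypothetical input format).
    n_folds, seed: illustrative defaults, not values taken from the paper.
    """
    # Sort before shuffling so the result depends only on the seed,
    # not on dictionary insertion order.
    patients = sorted(set(slide_to_patient.values()))
    rng = random.Random(seed)
    rng.shuffle(patients)

    # Round-robin assignment of shuffled patients to folds.
    patient_fold = {p: i % n_folds for i, p in enumerate(patients)}

    folds = defaultdict(list)
    for slide, patient in sorted(slide_to_patient.items()):
        folds[patient_fold[patient]].append(slide)
    return [folds[i] for i in range(n_folds)]


if __name__ == "__main__":
    # Toy data: 20 slides, two slides per patient.
    slides = {f"s{i}": f"p{i // 2}" for i in range(20)}
    for k, fold in enumerate(patient_level_folds(slides)):
        print(k, fold)
```

Splitting by patient rather than by slide avoids leaking slides from one patient into both the training and test sides of a fold, and fixing the seed keeps fold membership identical across the models being compared.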