🤖 AI Summary
To address the complexity and high computational cost of conventional multiple instance learning (MIL)-based fine-tuning for whole-slide image (WSI) analysis, this paper proposes a lightweight slide-level adaptation strategy. The method, SiMLP, transfers patch-level foundation models to slide-level tasks using only mean pooling over patch features followed by a simple multilayer perceptron, with no dedicated MIL architecture. It supports weakly supervised learning, few-shot adaptation, and cross-task transfer. On a large-scale pan-cancer classification task, SiMLP outperforms popular MIL-based methods by 3.52%. It is robust and transferable in lung cancer subtyping (adenocarcinoma vs. squamous cell carcinoma) and remains competitive with slide-level foundation models pretrained on tens of thousands of slides. To our knowledge, this is the first work to empirically validate the effectiveness and generalizability of a minimal architectural design, one requiring no MIL-specific components, for downstream WSI-level tasks.
📝 Abstract
The emergence of foundation models in computational pathology has transformed histopathological image analysis, with whole slide image (WSI) diagnosis being a core application. Traditionally, weakly supervised fine-tuning via multiple instance learning (MIL) has been the primary method for adapting foundation models to WSIs. However, in this work we present a key experimental finding: a simple nonlinear mapping strategy combining mean pooling and a multilayer perceptron, called SiMLP, can effectively adapt patch-level foundation models to slide-level tasks without complex MIL-based learning. Through extensive experiments across diverse downstream tasks, we demonstrate the superior performance of SiMLP compared with state-of-the-art methods. For instance, on a large-scale pan-cancer classification task, SiMLP surpasses popular MIL-based methods by 3.52%. Furthermore, SiMLP shows strong learning ability in few-shot classification and remains highly competitive with slide-level foundation models pretrained on tens of thousands of slides. Finally, SiMLP exhibits remarkable robustness and transferability in lung cancer subtyping. Overall, our findings challenge the conventional MIL-based fine-tuning paradigm, demonstrating that a task-agnostic representation strategy alone can effectively adapt foundation models to WSI analysis. These insights offer a unique and meaningful perspective for future research in digital pathology, paving the way for more efficient and broadly applicable methodologies.
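The core idea in the abstract, mean pooling over patch features followed by a multilayer perceptron, can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the feature dimension (1024), hidden width (256), ReLU activation, softmax output, two-class setup, and the random stand-in features are all assumptions, and no training loop is shown.

```python
import numpy as np


def mean_pool(patch_features: np.ndarray) -> np.ndarray:
    """Collapse an (n_patches, dim) feature matrix into one (dim,) slide vector."""
    return patch_features.mean(axis=0)


def init_mlp(dim_in: int, dim_hidden: int, n_classes: int, seed: int = 0) -> dict:
    """Randomly initialise a one-hidden-layer MLP (sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.02, size=(dim_in, dim_hidden)),
        "b1": np.zeros(dim_hidden),
        "w2": rng.normal(0.0, 0.02, size=(dim_hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }


def mlp_forward(params: dict, slide_vector: np.ndarray) -> np.ndarray:
    """Map a slide vector to class probabilities: linear, ReLU, linear, softmax."""
    hidden = np.maximum(0.0, slide_vector @ params["w1"] + params["b1"])
    logits = hidden @ params["w2"] + params["b2"]
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical slide: 500 patches, each with a 1024-d embedding that would
# normally come from a frozen patch-level foundation model (random here).
rng = np.random.default_rng(42)
patch_features = rng.normal(size=(500, 1024))

slide_vector = mean_pool(patch_features)  # shape (1024,)
params = init_mlp(dim_in=1024, dim_hidden=256, n_classes=2)
probs = mlp_forward(params, slide_vector)  # shape (2,), sums to 1
```

Because mean pooling is order-invariant and has no learnable parameters, the slide vector can be computed once per slide and cached. Only the small MLP would then need training, which is where the efficiency gain over attention-based MIL aggregators would come from under these assumptions.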