LRMIL: Efficient Low-Resolution Multiple Instance Learning via High-Resolution Knowledge Distillation for Whole Slide Image Classification

📅 2026-06-04
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses a challenge in whole-slide image (WSI) multiple instance learning: existing methods that rely on high-resolution patches struggle to balance global context modeling with computational efficiency. The authors propose LRMIL, a framework presented as the first to enable cross-scale knowledge distillation from high- to low-resolution representations. It first aligns the feature embeddings of high- and low-resolution patches at the instance level, then trains a lightweight student model at the slide level using both label supervision and guidance from a high-resolution teacher model. Inference requires only low-resolution patches. The authors report state-of-the-art performance across multiple WSI benchmarks with substantially lower computational and preprocessing overhead, offering an effective trade-off between accuracy and clinical practicality.
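To give a rough sense of why low-resolution inference lowers the patch count, the sketch below tiles a hypothetical slide at two resolutions. The slide dimensions, the 256-pixel patch size, and the 4x downsample ratio are illustrative assumptions, not values taken from the paper, and no tissue filtering is applied.

```python
import math


def patch_grid(width_px, height_px, patch_size=256):
    """Count non-overlapping patches needed to tile a region (no tissue filtering)."""
    return math.ceil(width_px / patch_size) * math.ceil(height_px / patch_size)


# Hypothetical slide size at the high-resolution level (assumption).
HIGH_RES = (80_000, 60_000)
# Assumed ratio between the high- and low-resolution levels (assumption).
DOWNSAMPLE = 4

low_res = (HIGH_RES[0] // DOWNSAMPLE, HIGH_RES[1] // DOWNSAMPLE)
n_high = patch_grid(*HIGH_RES)
n_low = patch_grid(*low_res)

print(f"high-res patches: {n_high}")
print(f"low-res patches:  {n_low}")
print(f"reduction factor: {n_high / n_low:.1f}x")
```

Under these assumptions the patch count falls by roughly the square of the downsample ratio, since both slide dimensions shrink by the same factor.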
📝 Abstract
Multiple instance learning (MIL) has become a standard paradigm for whole slide image (WSI) analysis in digital pathology, as it enables slide-level prediction without dense annotations. Existing MIL methods typically rely on exhaustive extraction and encoding of high-resolution patches. However, this practice suffers from two critical limitations in real-world clinical settings: it struggles to capture global visual cues at lower magnifications, and incurs substantial computational overhead due to the massive number of high-resolution patches per slide. To address these limitations, we propose an efficient low-resolution multiple instance learning (LRMIL) framework that transfers high-resolution knowledge to low-resolution representations. LRMIL adopts a two-stage distillation strategy. First, patch-level cross-resolution distillation aligns low-resolution patch embeddings with high-resolution representations. Second, slide-level knowledge distillation trains a low-resolution student MIL model under both slide-level supervision and teacher guidance. At inference time, LRMIL operates exclusively on low-resolution patches, substantially reducing data preprocessing and computational cost. Extensive experiments on multiple WSI benchmarks demonstrate that LRMIL consistently outperforms state-of-the-art MIL methods while achieving more efficient inference. These results highlight LRMIL as a practical and scalable solution for WSI analysis in clinical pathology.
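The abstract does not give the loss functions, so the following is only a minimal sketch of how a two-stage scheme of this kind could be written. It assumes mean-squared-error alignment for the patch-level stage and a standard temperature-scaled distillation loss (cross-entropy plus KL divergence) for the slide-level stage. The function names and the `alpha` and `temperature` parameters are hypothetical and are not taken from LRMIL.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def patch_alignment_loss(low_res_embedding, high_res_embeddings):
    """Stage 1 (assumed form): MSE between one low-resolution patch embedding
    and the mean embedding of the high-resolution patches it covers."""
    dim = len(low_res_embedding)
    count = len(high_res_embeddings)
    target = [sum(e[i] for e in high_res_embeddings) / count for i in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(low_res_embedding, target)) / dim


def slide_distillation_loss(student_logits, teacher_logits, label,
                            alpha=0.5, temperature=2.0):
    """Stage 2 (assumed form): cross-entropy on the slide label combined with
    KL(teacher || student) on temperature-softened slide-level predictions."""
    cross_entropy = -math.log(softmax(student_logits)[label])
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(soft_teacher, soft_student))
    # The temperature**2 factor is the usual rescaling in soft-target distillation.
    return (1 - alpha) * cross_entropy + alpha * temperature ** 2 * kl


# Toy usage with made-up numbers.
stage1 = patch_alignment_loss([0.2, 0.4], [[0.1, 0.5], [0.3, 0.3]])
stage2 = slide_distillation_loss([1.0, -1.0], [2.0, -2.0], label=0)
print(stage1, stage2)
```

In this sketch only the student and the low-resolution embeddings would be needed at inference time. The teacher logits and high-resolution embeddings appear only in the training losses, which is consistent with the abstract's statement that inference operates exclusively on low-resolution patches.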
Problem

Research questions and friction points this paper is trying to address.

Multiple Instance Learning
Whole Slide Image
Low-Resolution
Computational Overhead
Digital Pathology
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple Instance Learning
Knowledge Distillation
Whole Slide Image
Low-Resolution Representation
Digital Pathology