Multi-instance Learning as Downstream Task of Self-Supervised Learning-based Pre-trained Model

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Conventional multi-instance learning (MIL) degrades sharply when the number of instances per bag in brain hematoma CT images rises to 256, because spurious correlations become more likely and the learner is sensitive to instance scale. Method: This work uses self-supervised vision pre-training models, specifically DINO and MAE, as the upstream stage for high-density medical MIL, and treats the multi-instance learner as the downstream task. It proposes a feature-level attention-based aggregation mechanism coupled with downstream fine-tuning to mitigate spurious correlations and ease MIL's scalability bottleneck. Contribution/Results: On hypodensity marker classification, the approach improves accuracy by 5% to 13% and the F1 measure by 40% to 55%. It also improves robustness in small-target detection and highly redundant imaging scenarios. The study offers a scalable, self-supervised modeling framework for MIL with large instance counts in dense-instance clinical imaging tasks.
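
The attention-based aggregation mentioned in the summary can be illustrated with a minimal sketch. It assumes a common attention-pooling formulation (a tanh projection, a scoring vector, and a softmax over the instances of a bag); the paper's exact architecture is not given here, so the function name `attention_pool`, the parameters `V` and `w`, the feature dimensions, and the random inputs are all hypothetical stand-ins. In practice the instance features would come from a pre-trained encoder rather than a random generator.

```python
import math
import random


def attention_pool(bag, V, w):
    """Aggregate a bag of instance feature vectors into one bag-level vector.

    bag: list of K feature vectors of length D (stand-in for encoder outputs)
    V:   L x D projection matrix (hypothetical learnable parameter)
    w:   length-L scoring vector (hypothetical learnable parameter)
    """
    scores = []
    for h in bag:
        # Project the instance feature and squash it with tanh
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        # One scalar attention score per instance
        scores.append(sum(wj * uj for wj, uj in zip(w, hidden)))

    # Numerically stable softmax over the K instances in the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag representation: attention-weighted sum of instance features
    dim = len(bag[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, bag)) for d in range(dim)]
    return pooled, weights


# Toy bag with 256 instances, matching the bag size discussed in the summary.
# The features are random placeholders, not data from the paper.
random.seed(0)
K, D, L = 256, 8, 4
bag = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
V = [[random.gauss(0, 0.5) for _ in range(D)] for _ in range(L)]
w = [random.gauss(0, 0.5) for _ in range(L)]

pooled, weights = attention_pool(bag, V, w)
print(len(pooled), len(weights), round(sum(weights), 6))
```

The attention weights sum to one across the bag, so the pooled vector keeps the same dimensionality regardless of how many instances the bag contains; a bag-level classifier would then be trained on `pooled`.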

📝 Abstract
In deep multi-instance learning, the number of applicable instances depends on the data set. In histopathology images, deep multi-instance learners usually assume there are hundreds to thousands of instances in a bag. However, when the number of instances in a bag increases to 256 in brain hematoma CT, learning becomes extremely difficult. To overcome this problem, we propose using a model pre-trained with self-supervised learning and treating the multi-instance learner as the downstream task. With this method, even when the original target task suffers from the spurious correlation problem, we show improvements of 5% to 13% in accuracy and 40% to 55% in the F1 measure for hypodensity marker classification of brain hematoma CT.
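
The abstract reports a much larger gain in the F1 measure than in accuracy. One way such a gap can arise is class imbalance, where a classifier that mostly predicts the majority class already scores high accuracy but near-zero F1. The sketch below illustrates that arithmetic only; the class balance, the labels, and the predictions are hypothetical and are not taken from the paper.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1 of the positive class) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy set: 10 positive bags among 100 (not data from the paper)
y_true = [1] * 10 + [0] * 90

# Baseline that always predicts the majority (negative) class
always_negative = [0] * 100

# Model that finds 6 of the 10 positives and raises 2 false alarms
improved = [1] * 6 + [0] * 4 + [1] * 2 + [0] * 88

acc_base, f1_base = accuracy_and_f1(y_true, always_negative)
acc_new, f1_new = accuracy_and_f1(y_true, improved)
print(acc_base, f1_base)            # accuracy 0.90, F1 0.0
print(acc_new, round(f1_new, 3))    # accuracy 0.94, F1 0.667
```

In this toy case accuracy rises by only 4 points while F1 rises by about 67 points, which is why both metrics are worth reporting together.
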
Problem

Research questions and friction points this paper is trying to address.

Addressing difficulty in multi-instance learning with large instance counts
Improving accuracy in brain hematoma CT classification
Reducing spurious correlations in deep multi-instance learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised pre-trained model for multi-instance learning
Downstream task adaptation to overcome spurious correlations
Improved accuracy and F1 in hypodensity classification