🤖 AI Summary
To address a key limitation of conventional multiple-instance learning (MIL) in whole-slide image (WSI) classification—its neglect of spatial relationships among patches, which leaves tissue structure poorly modeled—this paper proposes a probabilistic spatial attention mechanism. Specifically, it models inter-patch spatial dependencies via a learnable distance-decay prior; introduces a posterior spatial pruning strategy for data-driven, context-adaptive feature fusion; and designs a multi-head attention diversity loss to mitigate redundancy across attention heads. Pruning reduces self-attention's quadratic cost to near-linear complexity while substantially enhancing spatial representation capability. Evaluated on multiple WSI benchmarks, the method achieves state-of-the-art performance, significantly outperforming both non-spatial and mainstream spatial-context baselines. These results underscore the critical role of explicit, learnable, and adaptive spatial priors in improving diagnostic accuracy for digital pathology.
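The two core ideas—a distance-decay prior on attention logits and distance-based posterior pruning—can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the decay rate `decay` and cutoff `max_dist` are assumed hypothetical parameters (in PSA-MIL the decay is learned per head), and patch positions are taken as simple grid coordinates.

```python
import numpy as np

def spatial_attention(q, k, v, coords, decay=0.5, max_dist=1.5):
    """Illustrative self-attention with a distance-decay prior and
    distance-based pruning (hypothetical parameterization).

    q, k, v : (n, d) patch features; coords : (n, 2) patch grid positions.
    Adding -decay * dist(i, j) to the logits multiplies the softmax
    posterior by an exponentially decaying spatial prior; entries beyond
    max_dist are pruned (masked out), sparsifying the attention map.
    """
    n, d = q.shape
    logits = q @ k.T / np.sqrt(d)                       # content similarity
    dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    logits = logits - decay * dist                      # log of decay prior
    logits[dist > max_dist] = -np.inf                   # posterior pruning
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                   # row-stochastic
    return w @ v, w
```

Because each patch attends only within `max_dist`, the per-patch cost drops from O(n) to a constant-size neighborhood, which is what makes the pruned attention scale to slides with tens of thousands of tiles.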
📝 Abstract
Whole Slide Images (WSIs) are high-resolution digital scans widely used in medical diagnostics. WSI classification is typically approached using Multiple Instance Learning (MIL), where the slide is partitioned into tiles treated as interconnected instances. While attention-based MIL methods aim to identify the most informative tiles, they often fail to fully exploit the spatial relationships among them, potentially overlooking intricate tissue structures crucial for accurate diagnosis. To address this limitation, we propose Probabilistic Spatial Attention MIL (PSA-MIL), a novel attention-based MIL framework that integrates spatial context into the attention mechanism through learnable distance-decayed priors, formulated within a probabilistic interpretation of self-attention as a posterior distribution. This formulation enables dynamic inference of spatial relationships during training, eliminating the need for predefined assumptions often imposed by previous approaches. Additionally, we propose a spatial pruning strategy for the posterior, effectively reducing self-attention's quadratic complexity. To further enhance spatial modeling, we introduce a diversity loss that encourages variation among attention heads, ensuring each captures distinct spatial representations. Together, these components make PSA-MIL a more data-driven and adaptive integration of spatial context, moving beyond predefined constraints. We achieve state-of-the-art performance over both contextual and non-contextual baselines, while significantly reducing computational costs.
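The diversity loss mentioned above can be sketched as a penalty on pairwise similarity between per-head attention maps. This is one plausible instantiation (mean pairwise cosine similarity of flattened maps), assumed for illustration; the paper's exact loss may differ.

```python
import numpy as np

def head_diversity_loss(attn):
    """Illustrative multi-head diversity loss (hypothetical form).

    attn : (h, n, n) row-stochastic attention maps, one per head.
    Returns the mean pairwise cosine similarity between flattened head
    maps; minimizing it pushes heads toward distinct spatial patterns.
    """
    h = attn.shape[0]
    flat = attn.reshape(h, -1)
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = flat @ flat.T                       # (h, h) cosine similarities
    return sim[~np.eye(h, dtype=bool)].mean() # average off-diagonal entry
```

Identical heads yield a loss of 1, while heads attending to disjoint tile sets yield 0, so adding this term to the classification objective discourages redundant heads without constraining what any single head attends to.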