Mixed Magnification Aggregation for Generalizable Region-Level Representations in Computational Pathology

📅 2026-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing computational pathology methods, which predominantly rely on single 20× magnification image patches and struggle to effectively model multiscale tissue features and spatial context. To overcome this, the authors propose a region-level hybrid-magnification encoder that systematically explores fusion mechanisms across different magnification levels for the first time. By aggregating information at the region level, the method constructs efficient representations while incorporating a self-supervised pretraining strategy based on masked embedding modeling to balance multiscale contextual awareness with computational efficiency. Evaluated on biomarker prediction tasks across multiple cancer types, the approach demonstrates significant performance gains, underscoring the critical importance of multiscale spatial context in pathological analysis.
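As a rough illustration of the masked embedding modeling idea described above, the sketch below masks a subset of precomputed tile embeddings and scores reconstruction only at the masked positions. The region size, embedding dimension, and mean-of-visible-tiles "predictor" are all stand-in assumptions, not the paper's actual aggregator (which would be a learned model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in setup: a region of 4x4 tiles, each already encoded by a
# (mixed-magnification) foundation model into a 768-dim embedding.
n_tiles, dim = 16, 768
tile_embeddings = rng.standard_normal((n_tiles, dim))

# Masked embedding modeling: hide a random subset of tile embeddings;
# the aggregator must reconstruct them from the visible context.
mask_ratio = 0.5
n_masked = int(n_tiles * mask_ratio)
masked_idx = rng.choice(n_tiles, size=n_masked, replace=False)

inputs = tile_embeddings.copy()
inputs[masked_idx] = 0.0  # placeholder "mask token" at hidden positions

# Toy predictor: reconstruct each masked tile as the mean of the visible
# tiles (the paper's region aggregator would be a learned network).
visible = np.delete(tile_embeddings, masked_idx, axis=0)
predictions = np.tile(visible.mean(axis=0), (n_masked, 1))

# The reconstruction loss is computed only on the masked positions.
loss = float(np.mean((predictions - tile_embeddings[masked_idx]) ** 2))
print(f"masked {n_masked}/{n_tiles} tiles, reconstruction MSE = {loss:.3f}")
```

Training an aggregator against this objective forces it to infer hidden tissue content from neighboring tiles, which is one way to instill the spatial-context awareness the summary emphasizes.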

📝 Abstract
In recent years, a standard computational pathology workflow has emerged where whole slide images are cropped into tiles, these tiles are processed using a foundation model, and task-specific models are built using the resulting representations. At least 15 different foundation models have been proposed, and the vast majority are trained exclusively with tiles at 20$\times$ magnification. However, it is well known that certain histologic features can only be discerned with larger context windows, which is why a pathologist zooms in and out when analyzing a whole slide image. Furthermore, creating 224$\times$224 pixel crops at 20$\times$ leads to a large number of tiles per slide, as slides can be gigapixels in size. To more accurately capture multi-resolution features and investigate the possibility of reducing the number of representations per slide, we propose a region-level mixing encoder. Our approach jointly fuses image tile representations of a mixed-magnification foundation model using a masked embedding modeling pretraining step. We explore a design space for pretraining the proposed mixed-magnification region aggregators and evaluate our models on transfer to biomarker prediction tasks representing various cancer types. Results demonstrate cancer-dependent improvements in predictive performance, highlighting the importance of spatial context and understanding.
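The abstract's point about tile counts can be checked with simple arithmetic. The slide dimensions below are illustrative assumptions (real whole slide images vary in size), not figures from the paper:

```python
# Illustrative slide size (assumed, not from the paper): a gigapixel-
# scale whole slide image scanned at 20x magnification.
width_px, height_px = 100_000, 80_000
tile = 224  # crop size used by most pathology foundation models

# Number of non-overlapping 224x224 tiles at full 20x resolution.
tiles_20x = (width_px // tile) * (height_px // tile)

# At 5x (4x coarser per axis), each 224-px tile covers 16x the tissue
# area, so the same slide needs far fewer tiles.
tiles_5x = (width_px // 4 // tile) * (height_px // 4 // tile)

print(tiles_20x, tiles_5x)  # → 159222 9879
```

Well over a hundred thousand representations per slide at 20$\times$ versus under ten thousand at 5$\times$ is the efficiency gap that motivates aggregating mixed-magnification tiles into region-level representations.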
Problem

Research questions and friction points this paper is trying to address.

computational pathology
multi-resolution features
region-level representations
magnification
whole slide images
Innovation

Methods, ideas, or system contributions that make the work stand out.

mixed magnification
region-level representation
computational pathology
masked embedding modeling
multi-resolution fusion
👥 Authors
Eric Zimmermann
Microsoft Research, Cambridge, MA, United States
Julian Viret
Paige, NYC, NY, United States
Michal Zelechowski
Paige, NYC, NY, United States
James Brian Hall
Microsoft Research, Cambridge, MA, United States
Neil Tenenholtz
Microsoft Research
Adam Casson
Paige, NYC, NY, United States
George Shaikovski
Paige, NYC, NY, United States
Eugene Vorontsov
Ecole Polytechnique de Montreal
Siqi Liu
Paige AI - Tempus AI, NYC, US
Kristen A. Severson
Microsoft Research, Cambridge, MA, United States