🤖 AI Summary
Current cell detection and classification methods for H&E-stained slides generalize poorly to rare or understudied cell types, which hinders comprehensive characterization of the tumor microenvironment (TME). To address this, we propose HistoPLUS, an end-to-end framework that integrates nuclear segmentation and fine-grained classification modules and is trained on a manually annotated pan-cancer dataset of 108,722 nuclei covering 13 cell types. Our key contributions are threefold: (i) it enables the study of seven understudied cell types that are rare or absent in public datasets; (ii) it improves cross-domain generalization, including robust transfer to two oncology indications unseen during training; and (iii) it is parameter-efficient, requiring only about 20% of the parameters of state-of-the-art methods (5x fewer). In external validation across four independent cohorts, HistoPLUS improves detection quality by 5.2% and overall classification F1-score by 23.7% over current state-of-the-art models, with significant improvements on 8 of the 13 cell types. HistoPLUS thus supports precise, scalable TME analysis from routine histopathological images.
📝 Abstract
Cell detection, segmentation and classification are essential for analyzing tumor microenvironments (TME) on hematoxylin and eosin (H&E) slides. Existing methods suffer from poor performance on understudied cell types (rare or not present in public datasets) and limited cross-domain generalization. To address these shortcomings, we introduce HistoPLUS, a state-of-the-art model for cell analysis, trained on a novel curated pan-cancer dataset of 108,722 nuclei covering 13 cell types. In external validation across 4 independent cohorts, HistoPLUS outperforms current state-of-the-art models in detection quality by 5.2% and overall F1 classification score by 23.7%, while using 5x fewer parameters. Notably, HistoPLUS unlocks the study of 7 understudied cell types and brings significant improvements on 8 of 13 cell types. Moreover, we show that HistoPLUS robustly transfers to two oncology indications unseen during training. To support broader TME biomarker research, we release the model weights and inference code at https://github.com/owkin/histoplus/.