🤖 AI Summary
Current digital pathology cell recognition methods rely heavily on extensive manual annotations and fixed category taxonomies, resulting in limited generalizability and adaptability. To address these limitations, we propose the first unified vision Transformer framework for general-purpose cell segmentation and classification. Our approach comprises three key innovations: (1) a lightweight adapter module integrated with a foundation model encoder, enabling zero-shot segmentation and few-shot recognition of novel cell types; (2) a novel immunofluorescence-guided synthetic annotation technique that achieves high-quality, unsupervised automatic labeling, substantially reducing annotation cost and carbon footprint; and (3) state-of-the-art zero-shot segmentation performance across seven diverse datasets spanning multiple organs and clinical scenarios, with the automated annotation approach surpassing models trained on manually labeled data. The fully open-sourced, production-ready framework is publicly available.
📝 Abstract
Digital pathology is a cornerstone in the diagnosis and treatment of diseases. A key task in this field is the identification and segmentation of cells in hematoxylin and eosin-stained images. Existing methods for cell segmentation often require extensive annotated datasets for training and are limited to a predefined cell classification scheme. To overcome these limitations, we propose $\text{CellViT}^{\scriptscriptstyle ++}$, a framework for generalized cell segmentation in digital pathology. $\text{CellViT}^{\scriptscriptstyle ++}$ utilizes Vision Transformers with foundation models as encoders to compute deep cell features and segmentation masks simultaneously. To adapt to unseen cell types, we rely on a computationally efficient approach that requires minimal training data and drastically reduces the carbon footprint. We demonstrate excellent performance on seven different datasets, covering a broad spectrum of cell types, organs, and clinical settings. The framework achieves remarkable zero-shot segmentation and data-efficient cell-type classification. Furthermore, we show that $\text{CellViT}^{\scriptscriptstyle ++}$ can leverage immunofluorescence staining to generate training datasets without the need for pathologist annotations. This automated dataset generation approach surpasses the performance of networks trained on manually labeled data, demonstrating its effectiveness in creating high-quality training datasets without expert annotations. To advance digital pathology, $\text{CellViT}^{\scriptscriptstyle ++}$ is available as an open-source framework featuring a user-friendly, web-based interface for visualization and annotation. The code is available at https://github.com/TIO-IKIM/CellViT-plus-plus.
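The data-efficient adaptation idea described above — keeping the foundation-model encoder frozen and training only a lightweight classifier on the extracted cell embeddings — can be sketched in a few lines. This is a hypothetical toy illustration, not the authors' implementation: `encode`, `W_enc`, and the synthetic two-class data are stand-ins for the frozen encoder and a few-shot labeled cell set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen foundation-model encoder: a fixed random
# projection mapping 64-dim cell crops to 32-dim embeddings.
W_enc = rng.normal(size=(64, 32))

def encode(x):
    # Frozen: W_enc is never updated during adaptation.
    return np.tanh(x @ W_enc)

# Few-shot setting: 10 labeled cells per class, 2 cell types.
X = rng.normal(size=(20, 64))
y = np.array([0] * 10 + [1] * 10)
X[y == 1] += 0.5  # make the synthetic classes separable

Z = encode(X)  # deep cell features from the frozen encoder

# Lightweight adapter: a single linear layer trained with
# softmax cross-entropy; only these 32x2 weights are learned.
W = np.zeros((32, 2))
for _ in range(200):
    logits = Z @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    onehot = np.eye(2)[y]
    W -= 0.1 * Z.T @ (p - onehot) / len(y)  # gradient step

preds = (Z @ W).argmax(axis=1)
acc = (preds == y).mean()
```

Because only the small adapter is optimized, adaptation to a new cell type costs a tiny fraction of full fine-tuning, which is the source of the reduced training data requirement and carbon footprint claimed in the abstract.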