🤖 AI Summary
Medical image segmentation relies heavily on large-scale annotated datasets and substantial computational resources, yet the impact of edge-enhancement preprocessing on cross-modality segmentation performance remains insufficiently studied. This paper systematically investigates, for the first time, the role of edge information (extracted via operators such as the Kirsch kernel) in foundation model pre-training, revealing modality-dependent performance gains. Building on this finding, we propose a meta-learning strategy that adaptively selects the optimal pre-training pathway (raw vs. edge-enhanced) based on the standard deviation and entropy of the input image. Evaluated across seven diverse modalities (dermoscopy, fundus, mammography, microscopy, OCT, ultrasound, and X-ray), our method achieves average Dice improvements of 16.42% and 19.30% over the edge-enhanced and raw single-path pre-training baselines, respectively. The approach significantly enhances model generalizability and deployment flexibility without requiring additional annotations or architectural modifications.
📝 Abstract
Medical image segmentation is crucial for disease diagnosis and treatment planning, yet developing robust segmentation models often requires substantial computational resources and large datasets. Existing research shows that pre-trained and fine-tuned foundation models can boost segmentation performance. However, questions remain about how particular image preprocessing steps influence segmentation performance across different medical imaging modalities. In particular, edges (abrupt transitions in pixel intensity) are widely acknowledged as vital cues for object boundaries, but they have not been systematically examined in the pre-training of foundation models. We address this gap by investigating to what extent pre-training on data processed with computationally efficient edge kernels, such as the Kirsch operator, can improve the cross-modality segmentation capabilities of a foundation model. Two versions of a foundation model are first trained on either raw or edge-enhanced data across multiple medical imaging modalities, then fine-tuned on selected raw subsets tailored to specific medical modalities. In a systematic investigation across the medical domains Dermoscopy, Fundus, Mammography, Microscopy, OCT, US, and XRay, we observe both increased and reduced segmentation performance under edge-focused pre-training, indicating that this approach should be applied selectively. To guide such selective application, we propose a meta-learning strategy: it uses the standard deviation and the entropy of the raw image to choose between the model pre-trained on edge-enhanced data and the model pre-trained on raw data for optimal performance. Our experiments show that integrating this meta-learning layer improves overall segmentation performance across diverse medical imaging tasks by 16.42% compared to models pre-trained on edge-enhanced data only and by 19.30% compared to models pre-trained on raw data only.
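To make the two building blocks of the abstract concrete, here is a minimal numpy sketch of (a) Kirsch edge enhancement (maximum response over the 8 compass kernels) and (b) the two meta-features, image standard deviation and histogram entropy, that drive the model selection. The function names and implementation details (edge-padding, 256-bin histogram) are illustrative assumptions, not the paper's exact code.

```python
import numpy as np


def kirsch_kernels():
    """Generate the 8 directional 3x3 Kirsch compass kernels."""
    # Border cells of a 3x3 grid, clockwise from the top-left corner.
    border = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    weights = [5, 5, 5, -3, -3, -3, -3, -3]  # "north" kernel border weights
    kernels = []
    for r in range(8):
        rotated = weights[8 - r:] + weights[:8 - r]  # one step per direction
        k = np.zeros((3, 3))
        for (i, j), w in zip(border, rotated):
            k[i, j] = w
        kernels.append(k)
    return kernels


def kirsch_edges(img):
    """Edge map as the maximum response over the 8 Kirsch kernels."""
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
    responses = [(windows * k).sum(axis=(-2, -1)) for k in kirsch_kernels()]
    return np.max(responses, axis=0)


def selector_features(img):
    """Standard deviation and Shannon entropy of a grayscale image --
    the two features the meta-learner uses to pick a pre-training path."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    entropy = -np.sum(p * np.log2(p))
    return float(np.std(img)), float(entropy)
```

A trained meta-learner would map the (standard deviation, entropy) pair to a choice between the edge-enhanced and raw pre-trained backbones; since the decision rule is learned from data in the paper, no fixed thresholds are shown here.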