🤖 AI Summary
Classifying heterogeneous thyroid carcinomas in ultrasound images, particularly rare subtypes such as follicular thyroid carcinoma (FTC) and medullary thyroid carcinoma (MTC), remains challenging due to high morphological variability and severe class imbalance. To address these issues, we propose a dual-stream collaborative learning framework: (1) a channel-spatial collaborative attention mechanism fuses multi-scale features from an EfficientNet branch with global contextual representations from a Vision Transformer (ViT) branch; (2) a residual multi-scale classifier enhances fine-grained discriminative capability; and (3) a dynamically weighted loss function mitigates label skew. Evaluated on a multicenter dataset comprising over 2,000 ultrasound cases from four institutions, our method significantly outperforms single-stream CNN and Transformer baselines. It achieves state-of-the-art performance in overall accuracy, rare-subtype recall, and F1-score, with FTC and MTC identification accuracy improved by 12.3% and 9.7%, respectively. The framework delivers a robust, clinically interpretable AI solution for computer-aided thyroid cancer diagnosis.
📝 Abstract
Heterogeneous morphological features and data imbalance pose significant challenges for rare thyroid carcinoma classification in ultrasound imaging. To address these challenges, we propose a novel multitask learning framework, the Channel-Spatial Attention Synergy Network (CSASN). It integrates a dual-branch feature extractor, which combines EfficientNet for local spatial encoding with ViT for global semantic modeling, and a cascaded channel-spatial attention refinement module. A residual multiscale classifier and a dynamically weighted loss function further enhance classification stability and accuracy. The framework is trained on a multicenter dataset comprising more than 2000 patients from four clinical institutions. Extensive ablation studies demonstrate that each module contributes significantly to model performance, particularly in recognizing rare subtypes such as follicular thyroid carcinoma (FTC) and medullary thyroid carcinoma (MTC). Experimental results show that CSASN outperforms existing single-stream CNN and Transformer-based models, achieving a superior balance between precision and recall under class-imbalanced conditions. This framework provides a promising strategy for AI-assisted thyroid cancer diagnosis.
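The abstract does not say how the dynamically weighted loss computes its weights. As an illustration only, the sketch below assumes one common choice: inverse class-frequency weights, refreshed from running label counts and applied to a cross-entropy loss. The names `inverse_frequency_weights`, `weighted_cross_entropy`, and `DynamicClassWeights` are hypothetical, and the class counts are made up; none of this is taken from CSASN itself.

```python
import math


def inverse_frequency_weights(class_counts):
    """Weights proportional to 1/count, rescaled so they average to 1.

    Assumption: inverse-frequency weighting; the paper's scheme may differ.
    """
    inv = [1.0 / max(c, 1) for c in class_counts]  # max() guards unseen classes
    scale = len(inv) / sum(inv)
    return [w * scale for w in inv]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(batch_logits, batch_labels, weights):
    """Weighted mean of per-sample negative log-likelihoods."""
    num = 0.0
    den = 0.0
    for logits, y in zip(batch_logits, batch_labels):
        p_true = softmax(logits)[y]
        num += weights[y] * -math.log(p_true)
        den += weights[y]
    return num / den


class DynamicClassWeights:
    """Hypothetical helper: tracks label counts and refreshes the weights."""

    def __init__(self, num_classes):
        self.counts = [0] * num_classes

    def update(self, labels):
        for y in labels:
            self.counts[y] += 1
        return inverse_frequency_weights(self.counts)


if __name__ == "__main__":
    # Illustrative counts only: index 0 = a common subtype, 1 = FTC, 2 = MTC.
    weights = inverse_frequency_weights([90, 8, 2])
    print(weights)  # the rarest class receives the largest weight
    logits = [[2.0, 0.1, -1.0], [0.3, 0.2, 0.9]]
    print(weighted_cross_entropy(logits, [0, 2], weights))
```

Under this assumption, a misclassified rare-subtype sample contributes more to the loss than a misclassified common-subtype sample, which is one way a loss can counteract label skew. Calling `update` once per batch is what makes the weights change during training.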