🤖 AI Summary
To address the poor generalization of medical image models caused by heterogeneous data sources and inconsistent annotation standards, this paper proposes the first deep network architecture to embed a learnable Quaternionic Wavelet Transform (QWT). The method jointly models multi-scale, multi-directional, and phase-structure information through differentiable QWT layers, complex/quaternionic convolutions, and multi-scale feature-fusion modules integrated into a CNN backbone, enabling rotation- and scale-robust representation learning. Crucially, the paper presents the first empirical evidence in medical imaging that disentangled QWT-domain representations improve downstream generalization. Combined with self-supervised contrastive pretraining, the model achieves an average 4.2% improvement in generalization accuracy across five cross-domain segmentation and classification benchmarks, and shows significantly stronger few-shot transfer than ResNet, ViT, and Wave-CNN baselines.
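The summary does not specify the layer design, but the core operation behind quaternionic convolution is the Hamilton product, which mixes the four quaternion components of inputs and weights rather than treating them as independent channels. Below is a minimal, illustrative NumPy sketch of such a layer (the function name `hamilton_conv2d` and the `(4, H, W)` component layout are assumptions for illustration, not the paper's actual implementation):

```python
import numpy as np

def hamilton_conv2d(x, w):
    """Quaternion-valued 2D convolution (valid padding, stride 1).

    x: array of shape (4, H, W)   -- quaternion components r, i, j, k
    w: array of shape (4, kH, kW) -- quaternion-valued kernel
    Returns an array of shape (4, H-kH+1, W-kW+1).

    The four components are mixed via the Hamilton product, which is
    what gives quaternion convolutions their structured weight sharing.
    """
    xr, xi, xj, xk = x
    wr, wi, wj, wk = w

    def conv(a, b):
        # Plain real-valued 'valid' cross-correlation.
        kH, kW = b.shape
        H, W = a.shape
        out = np.zeros((H - kH + 1, W - kW + 1))
        for u in range(out.shape[0]):
            for v in range(out.shape[1]):
                out[u, v] = np.sum(a[u:u + kH, v:v + kW] * b)
        return out

    # Hamilton product, applied component-wise to the convolutions:
    r = conv(xr, wr) - conv(xi, wi) - conv(xj, wj) - conv(xk, wk)
    i = conv(xr, wi) + conv(xi, wr) + conv(xj, wk) - conv(xk, wj)
    j = conv(xr, wj) - conv(xi, wk) + conv(xj, wr) + conv(xk, wi)
    k = conv(xr, wk) + conv(xi, wj) - conv(xj, wi) + conv(xk, wr)
    return np.stack([r, i, j, k])
```

In a trainable setting, `w` would be a learnable parameter and each real `conv` call would be expressed with a framework's differentiable convolution so gradients flow through all four components; the QWT layers in the paper presumably make the wavelet filter banks themselves learnable in an analogous way.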