🤖 AI Summary
Differential privacy (DP) degrades performance on small, class-imbalanced medical image datasets such as HAM10000, primarily because gradient clipping suppresses minority-class signals while majority-class gradients dominate, driving the model to suboptimal solutions early in training. To address this, we propose SAD-DPSGD, which applies a linear decay schedule to both the noise and the gradient clipping threshold. Allocating more privacy budget and using higher clipping thresholds in the initial training phases preserves informative minority-class gradients and eases the tension between privacy preservation and model convergence. Under ε = 3.0 and δ = 10⁻³, our method improves accuracy by 2.15% over Auto-DPSGD on HAM10000, a better privacy–utility trade-off in imbalanced learning scenarios.
📝 Abstract
When applying machine learning to medical image classification, leakage of private training data is a critical concern. Previous methods that add noise to gradients for differential privacy work well on large datasets such as MNIST and CIFAR-100 but perform poorly on small, imbalanced medical datasets such as HAM10000. The imbalanced distribution causes gradients from minority classes to be clipped, losing crucial information, while majority classes dominate the update. As a result, the model falls into suboptimal solutions early in training. To address this, we propose SAD-DPSGD, which uses a linearly decaying mechanism for the noise and clipping thresholds. By allocating more privacy budget and using higher clipping thresholds in the initial training phases, the model avoids suboptimal solutions and achieves better performance. Experiments show that SAD-DPSGD outperforms Auto-DPSGD on HAM10000, improving accuracy by 2.15% under $\epsilon = 3.0$, $\delta = 10^{-3}$.
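To make the idea of a linearly decaying clipping threshold concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the helper names (`linear_decay`, `clip_by_l2_norm`, `noisy_mean_gradient`), the start and end values, and the toy gradients are all assumptions chosen for illustration, and no privacy accounting is performed. The noise multiplier is held fixed here, so the noise standard deviation (`noise_multiplier * clip_norm`) decays together with the threshold; the abstract does not specify how the noise schedule maps to the privacy budget, and the same helper could be used to schedule the multiplier instead.

```python
import math
import random


def linear_decay(start, end, step, total_steps):
    """Linearly interpolate from start (at step 0) to end (at the last step)."""
    if total_steps <= 1:
        return start
    frac = min(max(step, 0), total_steps - 1) / (total_steps - 1)
    return start + (end - start) * frac


def clip_by_l2_norm(grad, clip_norm):
    """Scale one per-sample gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def noisy_mean_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Standard DP-SGD aggregation: clip each gradient, sum, add noise, average."""
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for grad in per_sample_grads:
        clipped = clip_by_l2_norm(grad, clip_norm)
        total = [t + c for t, c in zip(total, clipped)]
    std = noise_multiplier * clip_norm  # noise scale tied to the clipping bound
    n = len(per_sample_grads)
    return [(t + rng.gauss(0.0, std)) / n for t in total]


# Toy run: hypothetical values, a higher clipping threshold early in training.
rng = random.Random(0)
toy_grads = [[3.0, 4.0], [0.3, 0.4]]  # one large and one small per-sample gradient
total_steps = 5
schedule = []
for step in range(total_steps):
    clip_norm = linear_decay(2.0, 1.0, step, total_steps)  # assumed endpoints
    update = noisy_mean_gradient(toy_grads, clip_norm, 1.0, rng)
    schedule.append(clip_norm)
```

With a higher threshold in the first steps, the large gradient in `toy_grads` is scaled down less aggressively than it would be later, which is the effect the abstract attributes to using higher clipping thresholds in the initial training phases.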