🤖 AI Summary
Medical AI models are vulnerable to adversarial attacks, yet existing pixel-level methods fail to emulate realistic clinical misdiagnoses. To address this, we propose CoRPA, the first clinical-semantic adversarial attack framework tailored to chest X-ray diagnosis. CoRPA integrates interpretable clinical concepts (e.g., "consolidation", "pneumothorax") into the attack process, generating adversarial images and corresponding radiology reports that mimic real-world under-diagnosis or over-diagnosis patterns via concept-vector perturbation. It combines CLIP-guided diffusion-based image generation, cross-modal alignment, and black-box query optimization. Evaluated on MIMIC-CXR-JPG, CoRPA reduces the accuracy of state-of-the-art robust models by 38.7% on average, revealing critical vulnerabilities at the clinical-semantic level. This work establishes a new paradigm for safety evaluation of medical AI systems, shifting focus from low-level pixel perturbations to high-level, clinically meaningful adversarial reasoning.
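The concept-vector perturbation step described above can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: the concept list, helper names, and bit-flip encoding are assumptions chosen to show how dropping or adding clinical concepts would model under- and over-diagnosis.

```python
# Hypothetical illustration of concept-vector perturbation (not CoRPA's actual code).
# A radiology report is encoded as a binary vector over a fixed set of clinical concepts.
CONCEPTS = ["consolidation", "pneumothorax", "edema", "cardiomegaly"]

def report_to_concepts(findings):
    """Encode the set of findings mentioned in a report as a binary concept vector."""
    return [1 if c in findings else 0 for c in CONCEPTS]

def perturb(vector, drop=None, add=None):
    """Flip concept bits: clearing a present concept simulates under-diagnosis
    (a missed finding); setting an absent one simulates over-diagnosis."""
    v = list(vector)
    for c in drop or []:
        v[CONCEPTS.index(c)] = 0
    for c in add or []:
        v[CONCEPTS.index(c)] = 1
    return v

original = report_to_concepts({"consolidation", "edema"})
adversarial = perturb(original, drop=["consolidation"], add=["pneumothorax"])
print(original, adversarial)  # [1, 0, 1, 0] [0, 1, 1, 0]
```

In the full framework, the perturbed vector would then condition the generation of an adversarial report and a matching image, rather than being used directly.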
📝 Abstract
Deep learning models for medical image classification are becoming widely implemented in AI-assisted diagnostic tools, aiming to enhance diagnostic accuracy, reduce clinician workloads, and improve patient outcomes. However, their vulnerability to adversarial attacks poses significant risks to patient safety. Current attack methodologies rely on general techniques such as model querying or pixel-value perturbations to generate adversarial examples designed to fool a model. These approaches may not adequately capture the unique characteristics of clinical errors stemming from missed or incorrectly identified clinical features. We propose the Concept-based Report Perturbation Attack (CoRPA), a clinically focused black-box adversarial attack framework tailored to the medical imaging domain. CoRPA leverages clinical concepts to generate adversarial radiological reports and images that closely mirror realistic clinical misdiagnosis scenarios. We demonstrate the utility of CoRPA using the MIMIC-CXR-JPG dataset of chest X-rays and radiological reports. Our evaluation reveals that deep learning models exhibiting strong resilience to conventional adversarial attacks are significantly less robust when subjected to CoRPA's clinically focused perturbations. This underscores the importance of addressing domain-specific vulnerabilities in medical AI systems. By introducing a specialized adversarial attack framework, this study provides a foundation for developing robust, real-world-ready AI models in healthcare, ensuring their safe and reliable deployment in high-stakes clinical environments.