🤖 AI Summary
This paper identifies Med-Hammer, a novel hardware-model co-security threat targeting Vision Transformer (ViT)-based medical imaging systems. The attack couples Rowhammer-induced memory bit flips with neural Trojan attacks: bit flips in model weights stored in memory trigger a pre-implanted Trojan, causing targeted misclassification or suppression of critical findings (e.g., tumors or lesions).
Method: The authors use Rowhammer fault injection to flip bits in the weights of ViT models and analyze how architectural properties (model sparsity, attention weight distribution, and the number of features in a layer) affect attack effectiveness.
Contribution/Results: In experiments on the ISIC, Brain Tumor, and MedMNIST benchmarks, Med-Hammer achieves attack success rates of about 82.51% on MobileViT and 92.56% on Swin Transformer while remaining stealthy. The work underscores the urgent need for defenses that span both model architectures and the underlying hardware in clinical AI, and it frames hardware faults and deep learning security as a joint, cross-stack problem.
📝 Abstract
Vision Transformers (ViTs) have emerged as powerful architectures in medical image analysis, excelling in tasks such as disease detection, segmentation, and classification. However, their reliance on large, attention-driven models makes them vulnerable to hardware-level attacks. In this paper, we propose a novel threat model, referred to as Med-Hammer, that combines Rowhammer hardware fault injection with neural Trojan attacks to compromise the integrity of ViT-based medical imaging systems. Specifically, we demonstrate how malicious bit flips induced via Rowhammer can trigger implanted neural Trojans, leading to targeted misclassification or suppression of critical diagnoses (e.g., tumors or lesions) in medical scans. Through extensive experiments on benchmark medical imaging datasets such as ISIC, Brain Tumor, and MedMNIST, we show that such attacks can remain stealthy while achieving high attack success rates of about 82.51% and 92.56% on MobileViT and Swin Transformer, respectively. We further investigate how architectural properties, such as model sparsity, attention weight distribution, and the number of features in a layer, impact attack effectiveness. Our findings highlight a critical and underexplored intersection between hardware-level faults and deep learning security in healthcare applications, underscoring the urgent need for robust defenses spanning both model architectures and underlying hardware platforms.
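
To make the role of a single bit flip concrete, the following minimal sketch shows how flipping one bit in the IEEE 754 float32 encoding of a weight changes its value. It is an illustration only, not the authors' attack code: the helper name `flip_bit`, the example weight 0.5, and the assumption that weights are stored as float32 are all hypothetical, and the abstract does not state the weight format used in the experiments.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the float32 encoding of `value` (bit 0 is the least significant).

    Hypothetical helper for illustration; it simulates the effect of a memory
    bit flip in software and does not perform any hardware fault injection.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    # Toggle the chosen bit, then reinterpret the result as float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


weight = 0.5  # assumed example weight, exactly representable in float32

sign_flip = flip_bit(weight, 31)      # sign bit: only the sign changes
exponent_flip = flip_bit(weight, 30)  # top exponent bit: magnitude changes enormously
mantissa_flip = flip_bit(weight, 0)   # lowest mantissa bit: change is negligible

print(sign_flip, exponent_flip, mantissa_flip)
```

Under these assumptions, the position of the flipped bit matters far more than the number of flips: a flip in the top exponent bit turns 0.5 into 2^127, while a flip in the lowest mantissa bit changes the weight by only 2^-24. This is one reason the choice of which bits to corrupt is central to this kind of attack.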