🤖 AI Summary
Model inversion (MI) attacks against spiking neural networks (SNNs) in black-box settings are a critical yet unstudied privacy threat. This paper presents the first systematic investigation of SNN robustness to MI, proposing a black-box MI attack framework tailored to SNNs that integrates rate-coded input transformation, spike-output decoding, and generative adversarial inversion. Experiments reveal that the intrinsic event-driven dynamics and discrete spatiotemporal decision-making of SNNs severely impede surrogate-model fidelity, yielding degraded reconstruction quality, unstable attack convergence, and substantially lower attack success rates. Compared with artificial neural networks (ANNs), SNNs exhibit markedly stronger resistance to MI. This work thus uncovers inherent privacy-preserving properties of SNN architectures, offering both novel insights and empirical evidence to inform the design of secure AI systems.
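To make the input/output adaptation concrete, below is a minimal sketch (our illustration under assumed interfaces, not the paper's code) of the two wrappers the summary mentions: rate-coded input transformation, which turns pixel intensities into Bernoulli spike trains, and spike-output decoding, which averages output spikes over time into class scores a conventional MI pipeline can consume. The `target_snn` name in the usage comment is hypothetical.

```python
import numpy as np

def rate_encode(images, num_steps=25, rng=None):
    """Rate-code images with intensities in [0, 1] into binary spike trains.

    Returns an array of shape (num_steps, *images.shape) in which each pixel
    fires at each timestep with probability equal to its intensity.
    """
    rng = rng or np.random.default_rng()
    probs = np.clip(images, 0.0, 1.0)
    return (rng.random((num_steps, *images.shape)) < probs).astype(np.float32)

def decode_spikes(output_spikes):
    """Decode output spike trains of shape (num_steps, batch, num_classes)
    into normalized class scores by averaging firing rates over time."""
    rates = output_spikes.mean(axis=0)  # (batch, num_classes)
    return rates / np.maximum(rates.sum(axis=1, keepdims=True), 1e-8)

# Usage: the black-box attacker only ever observes the decoded scores, e.g.
# scores = decode_spikes(target_snn(rate_encode(candidate_images)))
```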
📝 Abstract
As machine learning models become integral to security-sensitive applications, concerns over data leakage through adversarial attacks continue to rise. Model Inversion (MI) attacks pose a significant privacy threat by enabling adversaries to reconstruct training data from model outputs. While MI attacks on Artificial Neural Networks (ANNs) have been widely studied, Spiking Neural Networks (SNNs) remain largely unexplored in this context. Owing to their event-driven and discrete computations, SNNs process information in fundamentally different ways that may offer inherent resistance to such attacks. A critical yet underexplored aspect of this threat lies in black-box settings, where attackers operate solely through queries, without direct access to model parameters or gradients, representing a more realistic adversarial scenario for deployed systems. This work presents the first study of black-box MI attacks on SNNs. We adapt a generative adversarial MI framework to the spiking domain by incorporating rate-based encoding for input transformation and decoding mechanisms for output interpretation. Our results show that SNNs exhibit significantly greater resistance to MI attacks than ANNs, demonstrated by degraded reconstructions, increased instability in attack convergence, and reduced overall attack effectiveness across multiple evaluation metrics. Further analysis suggests that the discrete and temporally distributed nature of SNN decision boundaries disrupts surrogate modeling, limiting the attacker's ability to approximate the target model.
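The surrogate-modeling step that the abstract identifies as the point of failure can be sketched as follows. This is our hedged illustration of a generic black-box MI component, not the authors' implementation: an ANN surrogate is fit to the decoded outputs of the spiking target obtained purely through queries; the paper's finding is that discrete, temporally distributed SNN outputs make exactly this fit unstable. The `surrogate`, `query_loader`, and `target_query_fn` names are assumptions supplied by the caller.

```python
import torch
import torch.nn.functional as F

def fit_surrogate(surrogate, query_loader, target_query_fn, epochs=10, lr=1e-3):
    """Train `surrogate` (an nn.Module) to mimic decoded target scores.

    target_query_fn(x) -> probability vectors obtained only via black-box
    queries (e.g., rate-encode -> SNN -> decode, as sketched above).
    """
    opt = torch.optim.Adam(surrogate.parameters(), lr=lr)
    for _ in range(epochs):
        for x in query_loader:
            with torch.no_grad():
                target_probs = target_query_fn(x)      # black-box "labels"
            log_probs = F.log_softmax(surrogate(x), dim=1)
            # Match the surrogate's predictive distribution to the target's.
            loss = F.kl_div(log_probs, target_probs, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
    return surrogate
```

Once fit, the surrogate stands in for the target during generative inversion, supplying the gradients the black-box target withholds; degraded surrogate fidelity therefore degrades the whole attack.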