AI Model Extraction Attacks: Bypassing Single-Client Assumptions in Defenses

📅 2026-06-02
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses a critical vulnerability in current AI model defense mechanisms: they predominantly rely on the single-client assumption (SCA) and therefore fail to counter distributed model extraction attacks orchestrated by coordinated adversaries such as advanced persistent threats (APTs). The authors propose CerberusAI, a modular open-source framework that systematically exposes the limitations of the SCA under multi-client collusion, and they advocate a paradigm shift toward identity-agnostic, state-aware defenses. Using distributed query scheduling (e.g., round-robin) and adaptive traffic blending, they evaluate prominent defenses such as PRADA and show that even simple round-robin query distribution substantially degrades detection efficacy, while adaptive blending can evade defenses based on global aggregation.
📝 Abstract
Ensuring the protection of Artificial Intelligence (AI) models deployed in military Command and Control (C2) systems and critical infrastructure is essential for maintaining information superiority. Model Extraction Attacks (MEAs) pose a significant threat, as they enable adversaries to replicate proprietary models, compromise protected information, and prepare offline adversarial attacks. However, current defense strategies predominantly rely on the Single Client Assumption (SCA), which is the implicit assumption that attacks originate from isolated identities. This work systematically demonstrates that the SCA is fundamentally invalid in the presence of coordinated threat actors, such as Advanced Persistent Threats (APTs). We introduce a modular, open-source framework called CerberusAI for reproducible model-stealing research, and use it to simulate distributed attack scenarios. Our empirical evaluation shows that well-established defense mechanisms, such as Protecting Against Deep Neural Network Model Stealing Attacks (PRADA), can be bypassed by basic round-robin query distribution strategies, resulting in a significant reduction in detection performance. Furthermore, we demonstrate that even global aggregation approaches can be rendered operationally useless through adaptive traffic mixing. These results highlight the need for a paradigm shift towards stateful, identity-independent defense architectures in the field of model extraction attacks. This paper was originally presented at the International Conference on Military Communication and Information Systems (ICMCIS), organized by the Information Systems Technology (IST) Scientific and Technical Committee, IST-224-RSY - the ICMCIS, held in Bath, United Kingdom, 12-13 May 2026 and won the best paper award.
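The round-robin bypass described in the abstract can be illustrated with a toy example. The sketch below is not PRADA and not CerberusAI code: `PerClientDetector`, `round_robin`, `flagged_clients`, the threshold of 300, and the client names are all hypothetical, chosen only to show why a detector that keeps its state per client identity loses signal once the same query stream is spread over several identities.

```python
from collections import defaultdict
from itertools import cycle


class PerClientDetector:
    """Toy detector that keeps state per client identity (hypothetical, not PRADA)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = defaultdict(int)

    def observe(self, client_id):
        # Flag a client once its own query count exceeds the threshold.
        self.counts[client_id] += 1
        return self.counts[client_id] > self.threshold


def round_robin(queries, client_ids):
    """Assign each query to the next client identity in turn."""
    return list(zip(cycle(client_ids), queries))


def flagged_clients(assignments, detector):
    """Replay the assignments through the detector and collect flagged identities."""
    flagged = set()
    for client_id, _query in assignments:
        if detector.observe(client_id):
            flagged.add(client_id)
    return flagged


# The same 1000 extraction queries, sent by one identity or by five colluding ones.
queries = list(range(1000))
single = flagged_clients(
    round_robin(queries, ["c0"]), PerClientDetector(threshold=300)
)
distributed = flagged_clients(
    round_robin(queries, [f"c{i}" for i in range(5)]), PerClientDetector(threshold=300)
)

print("single client flagged:", single)
print("five clients flagged:", distributed)
```

With one identity, all 1000 queries accumulate in a single counter and the client is flagged. With five identities, each counter reaches only 200 and nothing is flagged, even though the model has answered exactly the same queries. A real detector tracks richer per-client statistics than a count, but the dilution effect under the single-client assumption is the same in spirit.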
Problem

Research questions and friction points this paper is trying to address.

Model Extraction Attacks
Single Client Assumption
Advanced Persistent Threats
AI Model Protection
Defense Bypass
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model Extraction Attacks
Single-Client Assumption
CerberusAI
Distributed Adversarial Queries
Stateful Defense
Maxime Schwarzer
CortAIx Labs, Thales Deutschland, Ditzingen, Germany; Institute for Automation and Applied Informatics (IAI), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Johannes F. Loevenich
CortAIx Labs, Thales Deutschland, Ditzingen, Germany; Department of Mathematics/Computer Science, University of Osnabrück, Osnabrück, Germany
Gustavo Sánchez
Institute for Automation and Applied Informatics (IAI), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Laurin Holz
CortAIx Labs, Thales Deutschland, Ditzingen, Germany; Department of Computer Science, University of Ulm, Ulm, Germany
Thies Möhlenhof
CortAIx Labs, Thales Deutschland, Ditzingen, Germany; Department of Computer Science, University Koblenz-Landau, Koblenz, Germany
Tobias Hürten
CortAIx Labs, Thales Deutschland, Ditzingen, Germany
Roberto Rigolin F. Lopes
Scientist at Thales
Autonomous Cyber Defense, Robust AI, Tactical Networks, Distributed Systems
Veit Hagenmeyer
KIT
energy informatics, nonlinear control, smart grids