🤖 AI Summary
This work exposes a critical security vulnerability in deploying large language models (LLMs) on third-party devices, where existing protection schemes combine Trusted Execution Environments (TEEs) with precomputed noise to safeguard model intellectual property. We demonstrate for the first time that the mainstream practice of statically reusing secret bases, a shortcut introduced to accelerate protected inference, leaks key information. Leveraging cryptanalysis, linear-algebraic reconstruction, and adversarial queries, we develop two novel attacks that reverse-engineer TEE protocols and protected inference pipelines. Our methods recover the secret parameters of a single layer of the LLaMA-3 8B model within six minutes, scale to models as large as 405B parameters, and entirely bypass the integrity checks employed by systems such as Soter and TSQP.
📝 Abstract
The deployment of large language models (LLMs) on third-party devices requires new ways to protect model intellectual property. While Trusted Execution Environments (TEEs) offer a promising solution, their performance limits can lead to a critical compromise: using a precomputed, static secret basis to accelerate cryptographic operations. We demonstrate that this mainstream design pattern introduces a classic cryptographic flaw, the reuse of secret keying material, into the system's protocol. We prove its vulnerability with two distinct attacks. First, our attack on a model confidentiality system achieves a full confidentiality break by recovering its secret permutations and model weights. Second, our integrity attack completely bypasses the integrity checks of systems like Soter and TSQP. We demonstrate the practicality of our attacks against state-of-the-art LLMs, recovering a layer's secrets from a LLaMA-3 8B model in about six minutes and showing that the attack scales to compromise 405B-parameter LLMs across a variety of configurations.
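The core flaw described above can be illustrated with a toy model. The sketch below is not the paper's exact protocol: it assumes, purely for illustration, that the secret basis is a permutation matrix `P` applied to weights the attacker can compare against a public reference checkpoint (`W_public`). Because `P` is static across queries, a batch of linearly independent adversarial queries pins down the masked matrix by plain linear algebra, after which row matching undoes the permutation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden dimension

# Hypothetical protection scheme: a secret permutation P, precomputed
# once and reused for every query (the "static secret basis").
W_public = rng.normal(size=(d, d))   # reference weights, assumed known
P = np.eye(d)[rng.permutation(d)]    # the static secret
masked = P @ W_public                # what the untrusted device computes with

# Attack step 1: d linearly independent queries determine the masked
# matrix exactly, because the same P is reused across all of them.
X = rng.normal(size=(d, d))          # adversarial query batch (columns)
Y = masked @ X                       # observed outputs
recovered = Y @ np.linalg.inv(X)     # linear-algebraic reconstruction

# Attack step 2: match each recovered row to its nearest reference row,
# which reveals the permutation and hence the secret basis itself.
P_hat = np.zeros((d, d))
for i in range(d):
    j = np.argmin(np.linalg.norm(W_public - recovered[i], axis=1))
    P_hat[i, j] = 1.0

assert np.allclose(P_hat, P)         # static reuse leaks the secret
```

A fresh basis per query would make step 1 underdetermined; the attack works only because the basis is reused, which is exactly the accelerate-by-precomputation design pattern the paper targets.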