🤖 AI Summary
This work addresses the challenge of simultaneously preserving model privacy and minimizing inference latency when deploying deep neural networks (DNNs) on untrusted edge devices. The authors propose ConvShatter, a novel scheme that leverages the linearity of convolution to decompose kernels into critical and common components, enabling efficient protection within a heterogeneous trusted execution environment (TEE)-GPU architecture through a lightweight reconstruction mechanism. By integrating channel/kernel permutation, decoy-based obfuscation, and secure parameter storage, ConvShatter ensures strong confidentiality and integrity of the model while achieving 16% lower inference latency than prior approaches such as GroupCover, with negligible accuracy degradation relative to the original model.
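The channel/kernel permutation idea mentioned above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; it only shows the underlying principle that a permutation of kernel order is cheap to undo given a small secret (here, the inverse permutation, which the scheme would keep inside the TEE):

```python
import numpy as np

rng = np.random.default_rng(1)
# A conv layer's weights, laid out as (out_channels, in_channels, kH, kW).
weights = rng.standard_normal((4, 3, 3, 3))

# Obfuscation: shuffle the output-kernel order before exposing weights to the GPU.
perm = rng.permutation(4)
obfuscated = weights[perm]

# Small secret kept in the TEE: the inverse permutation.
inv_perm = np.argsort(perm)

# Reconstruction: undoing the shuffle recovers the original kernel order exactly.
restored = obfuscated[inv_perm]
assert np.array_equal(restored, weights)
```

The recovery parameter is just a short index array, which is why storing it in the TEE adds negligible memory and compute cost.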
📝 Abstract
As edge devices gain stronger computing power, deploying high-performance DNN models on untrusted hardware has become a practical way to cut inference latency and protect user data privacy. Given the high cost of model training and the demands of user experience, balancing model privacy against runtime overhead is critical. Trusted execution environments (TEEs) offer a viable defense, and prior work has proposed heterogeneous GPU-TEE inference frameworks that use parameter obfuscation to balance efficiency and confidentiality. However, recent studies find partial obfuscation defenses ineffective, while robust schemes incur unacceptable latency. To resolve this tension, we propose ConvShatter, a novel obfuscation scheme that achieves low latency and high accuracy while preserving model confidentiality and integrity. It leverages the linearity of convolution to decompose kernels into critical and common ones, inject confounding decoys, and permute channel/kernel orders. Before deployment, it performs kernel decomposition, decoy injection, and order obfuscation, storing minimal recovery parameters securely in the TEE. During inference, the TEE reconstructs the outputs of obfuscated convolutional layers. Extensive experiments show that ConvShatter substantially reduces latency overhead with strong security guarantees, cutting overhead by 16% relative to GroupCover while maintaining accuracy on par with the original model.
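The convolution linearity the abstract relies on can be demonstrated concretely: convolving an input with a sum of kernels equals the sum of the individual convolutions, so a kernel split into "critical" and "common" parts can be recombined after the fact. A minimal NumPy sketch (the function and variable names are illustrative, not from the paper):

```python
import numpy as np

def conv2d(x, k):
    # Valid-mode 2D cross-correlation (the "convolution" used in DNNs).
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))    # input feature map
k = rng.standard_normal((3, 3))    # original kernel

# Hypothetical decomposition: a random "critical" part and the remainder.
k_critical = rng.standard_normal((3, 3))
k_common = k - k_critical          # so that k == k_critical + k_common

# By linearity, summing the two partial outputs reconstructs the true output.
y_full = conv2d(x, k)
y_reconstructed = conv2d(x, k_critical) + conv2d(x, k_common)
assert np.allclose(y_full, y_reconstructed)
```

This is what makes the TEE-side reconstruction lightweight: the GPU can run the obfuscated kernels at full speed, and the TEE only needs a cheap elementwise combination to recover the correct layer output.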