🤖 AI Summary
This work addresses the dual privacy risks in cloud-based AI inference—exposure of both user inputs and model weights—and tackles the impractical computational overhead of fully homomorphic encryption (FHE). To bridge this gap, the authors propose a co-design paradigm that integrates FHE with AI inference through a novel “meet-in-the-middle” optimization framework. This approach jointly tailors cryptographic primitives and neural network architectures: on one side, it customizes an FHE scheme and compiler to align with the static structure of the inference circuit; on the other, it imposes architectural constraints on the AI model to minimize dominant homomorphic operations. The resulting synergy substantially reduces FHE inference costs, offering a practical and efficient pathway toward privacy-preserving AI inference.
📝 Abstract
Modern cloud inference creates a two-sided privacy problem: users reveal sensitive inputs to providers, while providers must execute proprietary model weights inside potentially leaky execution environments. Fully homomorphic encryption (FHE) offers cryptographic guarantees but remains prohibitively expensive for modern architectures. We argue that progress requires co-design: specializing FHE schemes and compilers for the static structure of inference circuits, while simultaneously constraining inference architectures to reduce the dominant homomorphic cost drivers. We outline a meet-in-the-middle agenda and concrete optimization targets on both axes.
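To illustrate the architecture side of this co-design, here is a minimal sketch (not from the paper, purely a hypothetical example) of one well-known FHE-motivated architectural constraint: replacing a non-polynomial activation such as ReLU, which requires expensive comparison or bootstrapping under FHE, with a low-degree polynomial activation so that the whole layer stays within cheap homomorphic additions and multiplications. The function names are illustrative, and the computation is shown in plaintext for clarity:

```python
# Hedged sketch: leveled FHE schemes (e.g. CKKS-style) evaluate additions
# and multiplications on ciphertexts relatively cheaply, while
# non-polynomial ops like ReLU need costly approximation or bootstrapping.
# Constraining the model to polynomial activations keeps the inference
# circuit directly evaluable under FHE. All names here are illustrative.

def relu(x: float) -> float:
    # Needs a comparison -- not natively supported by arithmetic FHE.
    return max(x, 0.0)

def square_act(x: float) -> float:
    # One ciphertext-ciphertext multiplication -- FHE-friendly.
    return x * x

def linear(weights: list[float], bias: float, xs: list[float]) -> float:
    # Dot product plus bias: additions and plaintext multiplications only.
    return sum(w * x for w, x in zip(weights, xs)) + bias

def fhe_friendly_layer(weights: list[float], bias: float, xs: list[float]) -> float:
    # The whole layer is a degree-2 polynomial in the inputs, so it maps
    # directly onto leveled homomorphic evaluation with no bootstrapping.
    return square_act(linear(weights, bias, xs))
```

In this toy example the dominant homomorphic cost driver (the non-polynomial comparison inside ReLU) is eliminated by construction, at the price of a constraint on the model architecture, which is the kind of trade-off the meet-in-the-middle agenda targets.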