🤖 AI Summary
To address the high inference latency, large memory overhead, and difficulty of balancing security and practicality in fully homomorphic encryption (FHE)-based private inference, this paper proposes TT-TFHE: a lightweight neural-network framework that co-designs Truth-Table Neural Networks (TTnet) with Torus-based FHE (TFHE), supporting both tabular and image data. TT-TFHE is the first TFHE-based private-inference framework to achieve 128-bit security with end-to-end inference latency of only a few seconds and memory consumption of only tens of megabytes. Leveraging automated compilation and lookup-table (LUT) optimization via the open-source Concrete library, it significantly reduces computational and memory costs. On tabular benchmarks, TT-TFHE outperforms all existing HE-based approaches; on MNIST and CIFAR-10 it surpasses state-of-the-art TFHE methods in both inference speed and accuracy, while reducing memory usage by two to three orders of magnitude compared to conventional FHE set-ups.
📝 Abstract
This paper presents TT-TFHE, a deep neural network Fully Homomorphic Encryption (FHE) framework that effectively scales Torus FHE (TFHE) usage to tabular and image datasets using a recent family of convolutional neural networks called Truth-Table Neural Networks (TTnet). The proposed framework provides an easy-to-implement, automated TTnet-based design toolbox built on the open-source, Python-based Concrete implementation (CPU-based and supporting lookup tables) for inference over encrypted data. Experimental evaluation shows that TT-TFHE greatly outperforms all Homomorphic Encryption (HE) set-ups in terms of time and accuracy on three tabular datasets, all other features being equal. On image datasets such as MNIST and CIFAR-10, we show that TT-TFHE consistently and largely outperforms other TFHE set-ups and is competitive against other HE variants such as BFV or CKKS (while maintaining the same 128-bit security guarantees). In addition, our solutions have a very low memory footprint (down to a few dozen MBs for MNIST), in sharp contrast with other HE set-ups that typically require tens to hundreds of GBs of memory per user (in addition to their communication overheads). This is the first work to present a fully practical solution for private inference (i.e., a few seconds of inference time and a few dozen MBs of memory) on both tabular datasets and MNIST, one that can easily scale to multiple threads and users on the server side.
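The key fit between TTnet and TFHE is that a trained truth-table filter is just a small Boolean function frozen into a lookup table, and TFHE can evaluate a small LUT homomorphically (via programmable bootstrapping, which is what Concrete's table-lookup primitive compiles to). The plaintext version of that idea can be sketched in a few lines of Python; the function names and the 4-input majority example below are illustrative assumptions, not the authors' code or the actual learned filters:

```python
# Plaintext sketch of a TTnet-style truth-table filter (illustrative only).
# A small patch of binary inputs is packed into an integer index, and the
# filter's entire behavior is a precomputed table -- so encrypted inference
# reduces to LUT evaluations, the operation TFHE handles natively.

def bits_to_index(bits):
    """Pack a tuple of {0,1} bits into an integer LUT index (MSB first)."""
    idx = 0
    for b in bits:
        idx = (idx << 1) | b
    return idx

def make_lut(fn, n_inputs):
    """Precompute the full truth table of an n-input Boolean function."""
    table = []
    for i in range(2 ** n_inputs):
        bits = tuple((i >> (n_inputs - 1 - k)) & 1 for k in range(n_inputs))
        table.append(fn(bits))
    return table

def tt_filter(patch, lut):
    """Evaluate one truth-table filter: a single table access per patch."""
    return lut[bits_to_index(patch)]

# Hypothetical example: freeze a 4-input majority function into a table.
# In TTnet the Boolean function would come from training, not be hand-picked.
lut = make_lut(lambda b: int(sum(b) >= 2), 4)
print(tt_filter((1, 0, 1, 1), lut))  # three 1-bits >= 2, so the output is 1
```

Under Concrete, each such table access becomes one homomorphic table lookup on the encrypted patch index, which is what keeps both latency and memory small compared to schemes that must approximate non-linear layers with polynomial arithmetic.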