Sentry: Authenticating Machine Learning Artifacts on the Fly

📅 2025-10-01
🤖 AI Summary
Machine learning systems increasingly rely on external open-source datasets and pre-trained models, which exposes them to supply-chain poisoning attacks, yet existing systems lack any authenticity verification for these artifacts. Cryptographic hashing and signing can mitigate the risk, but existing integrity-verification frameworks incur significant overhead at the scale of state-of-the-art artifacts and are not compatible with GPU platforms. This paper introduces Sentry, a novel GPU-based framework that ties developer identities to signatures and authenticates datasets and models on the fly as they are loaded into GPU memory, which keeps it compatible with CPU-bypassing data movement such as NVIDIA GPUDirect. Sentry accelerates Merkle tree and lattice hashing on the GPU, using memory optimizations and resource partitioning to reach high throughput. The evaluation reports orders-of-magnitude speedup over a CPU-based baseline.

📝 Abstract
Machine learning systems increasingly rely on open-source artifacts such as datasets and models that are created or hosted by other parties. This reliance on external datasets and pre-trained models exposes the system to supply chain attacks, in which an artifact is poisoned before it is delivered to the end user. Such attacks are possible because existing machine learning systems perform no authenticity verification. Incorporating cryptographic solutions such as hashing and signing can mitigate the risk of supply chain attacks. However, existing frameworks for integrity verification based on cryptographic techniques can incur significant overhead when applied to state-of-the-art machine learning artifacts due to their scale, and are not compatible with GPU platforms. In this paper, we develop Sentry, a novel GPU-based framework that verifies the authenticity of machine learning artifacts by implementing cryptographic signing and verification for datasets and models. Sentry ties developer identities to signatures and performs authentication on the fly as artifacts are loaded into GPU memory, making it compatible with GPU data movement solutions such as NVIDIA GPUDirect that bypass the CPU. Sentry incorporates GPU acceleration of cryptographic hash constructions such as Merkle trees and lattice hashing, implementing memory optimizations and resource partitioning schemes for high throughput. Our evaluations show that Sentry is a practical solution to bring authenticity to machine learning systems, achieving orders of magnitude speedup over a CPU-based baseline.
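
The abstract names Merkle tree hashing as one of the accelerated constructions but gives no details, so the following is only a minimal CPU-side sketch of the general idea, not Sentry's implementation. It assumes SHA-256 as the underlying hash, a hypothetical tiny chunk size, and a simple rule that carries an unpaired node up one level unchanged; the function names are illustrative.

```python
import hashlib
import hmac

CHUNK_SIZE = 4  # bytes; hypothetical and deliberately tiny for illustration


def leaf_hash(chunk: bytes) -> bytes:
    # Prefix 0x00 separates leaf hashes from interior-node hashes.
    return hashlib.sha256(b"\x00" + chunk).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Prefix 0x01 marks an interior node built from two children.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(data: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    # Split the artifact into fixed-size chunks; empty input gets one empty leaf.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Leaf hashes are independent of each other, so they can be computed in parallel.
    level = [leaf_hash(c) for c in chunks]
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # unpaired node is carried up unchanged
        level = nxt
    return level[0]


def verify(data: bytes, trusted_root: bytes) -> bool:
    # Constant-time comparison against a root obtained from a trusted source.
    return hmac.compare_digest(merkle_root(data), trusted_root)
```

In a signing workflow, the publisher would sign the root once, and a consumer would recompute the root while loading the artifact and compare it with the signed value. The independence of the leaf hashes is what makes the construction a natural fit for parallel hardware.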

Problem

Research questions and friction points this paper is trying to address.

Authenticating external ML artifacts to prevent supply chain attacks
Reducing cryptographic verification overhead for large-scale ML artifacts
Enabling GPU-compatible integrity verification during data loading

Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-based framework for ML artifact authentication
On-the-fly verification during GPU memory loading
Accelerated cryptographic hashing with memory optimizations
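
The page mentions lattice hashing without describing the construction. As a rough illustration of why additive, lattice-style hashes suit parallel hardware, the toy sketch below expands each indexed chunk into a vector of 16-bit words with the SHAKE-256 extendable-output function and sums the vectors coordinate-wise. The vector length, modulus, and function names are assumptions chosen for readability; these parameters are far too small to be secure, and nothing here is taken from Sentry itself.

```python
import hashlib
from typing import Iterable, List, Tuple

N_WORDS = 64        # vector length; hypothetical toy value
MODULUS = 1 << 16   # each coordinate is an integer modulo 2**16


def chunk_vector(index: int, chunk: bytes) -> List[int]:
    # Bind the chunk to its position, then expand into N_WORDS 16-bit words.
    stream = hashlib.shake_256(index.to_bytes(8, "little") + chunk).digest(2 * N_WORDS)
    return [int.from_bytes(stream[2 * i:2 * i + 2], "little") for i in range(N_WORDS)]


def combine(a: List[int], b: List[int]) -> List[int]:
    # Coordinate-wise addition; commutative and associative.
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def additive_hash(indexed_chunks: Iterable[Tuple[int, bytes]]) -> List[int]:
    # Sum the per-chunk vectors; the order of arrival does not matter.
    acc = [0] * N_WORDS
    for index, chunk in indexed_chunks:
        acc = combine(acc, chunk_vector(index, chunk))
    return acc
```

Because the combination step is plain addition, partial digests computed by different workers over disjoint chunk sets can be merged in any order, which is the property that makes this family of hashes attractive when chunks are processed concurrently.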