🤖 AI Summary
To prevent model providers from including false information in the property descriptions increasingly required by AI regulation, this paper proposes Laminator, the first framework for verifiable ML property cards, built on hardware-assisted trusted execution environments (TEEs). The framework covers properties across the entire ML lifecycle (training and inference), including properties of the training data, the training process, the resulting model's behavior, and individual inferences (via "inference cards" that bind an output to the model and its input). Because prior attestation mechanisms based purely on cryptography are narrowly focused and inefficient, Laminator instead leverages TEE-based remote attestation to make property claims efficiently and verifiably. Its core contribution is integrating TEEs into full-lifecycle ML property attestation: the resulting system is efficient in overhead, scalable to large numbers of verifiers, and versatile in the properties it can prove during training or inference.
📝 Abstract
Regulations increasingly call for various assurances from machine learning (ML) model providers about their training data, training process, and model behavior. For better transparency, industry (e.g., Hugging Face and Google) has adopted model cards and datasheets to describe various properties of training datasets and models. In the same vein, we introduce the notion of inference cards to describe the properties of a given inference (e.g., binding of the output to the model and its corresponding input). We coin the term ML property cards to collectively refer to these various types of cards. To prevent a malicious model provider from including false information in ML property cards, they need to be verifiable. We show how to construct verifiable ML property cards using property attestation: technical mechanisms by which a prover (e.g., a model provider) can attest to various ML properties to a verifier (e.g., an auditor). Since prior attestation mechanisms based purely on cryptography are often narrowly focused (lacking versatility) and inefficient, we need an efficient mechanism to attest different types of properties across the entire ML model pipeline. Emerging widespread support for confidential computing has made it possible to run and even train models inside hardware-assisted trusted execution environments (TEEs), which provide highly efficient attestation mechanisms. We propose Laminator, which uses TEEs to provide the first framework for verifiable ML property cards via hardware-assisted ML property attestations. Laminator is efficient in terms of overhead, scalable to large numbers of verifiers, and versatile with respect to the properties it can prove during training or inference.
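To make the inference-card idea concrete, here is a minimal illustrative sketch (not Laminator's actual implementation) of a prover/verifier exchange. It assumes a signing key provisioned inside the TEE and covered by remote attestation; an HMAC stands in for the TEE's attestation signature, and all names (`make_inference_card`, `verify_inference_card`) are hypothetical.

```python
import hashlib
import hmac
import json

# Illustrative only: in a real deployment this key would live inside the TEE,
# and the verifier would trust it via the hardware vendor's attestation chain.
TEE_SIGNING_KEY = b"key-provisioned-inside-the-tee"


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_inference_card(model_bytes: bytes, model_input: bytes,
                        model_output: bytes) -> dict:
    """Prover side: emit a card binding the output to the model and its input."""
    card = {
        "model_hash": digest(model_bytes),
        "input_hash": digest(model_input),
        "output_hash": digest(model_output),
    }
    payload = json.dumps(card, sort_keys=True).encode()
    card["signature"] = hmac.new(TEE_SIGNING_KEY, payload,
                                 hashlib.sha256).hexdigest()
    return card


def verify_inference_card(card: dict, model_bytes: bytes, model_input: bytes,
                          model_output: bytes) -> bool:
    """Verifier side: recompute the bindings and check the signature."""
    expected = {
        "model_hash": digest(model_bytes),
        "input_hash": digest(model_input),
        "output_hash": digest(model_output),
    }
    payload = json.dumps(expected, sort_keys=True).encode()
    sig = hmac.new(TEE_SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, card.get("signature", ""))
```

In the actual system, the signature would be produced by a TEE attestation key, so the verifier also learns *which* code ran on *which* hardware, not just that the hashes match.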