🤖 AI Summary
Publicly available Transformer-based autoencoders in safety-critical domains face risks of model theft and unauthorized decoding.
Method: We propose MoBLE, a model-binding authentication mechanism that requires neither secret injection nor adversarial training. MoBLE leverages an intrinsic decoder-binding property embedded in autoencoder weights, formalizing “Zero-Shot Decoder Non-Transferability” (ZSDN) as the authentication metric. It enables self-authentication via weight-space distance measurement and attention-divergence analysis.
Contribution/Results: Experiments show that, under identical architecture and training data, MoBLE achieves over 91% self-decoding accuracy, while cross-model decoding degrades to chance-level performance, demonstrating a robust authentication separation. This work is the first to reveal inherent model-fingerprinting characteristics in public Transformer autoencoders, establishing a lightweight, verifiable, and keyless paradigm for identity authentication and access control in secure AI systems.
📝 Abstract
We introduce Model-Bound Latent Exchange (MoBLE), a decoder-binding property in Transformer autoencoders formalized as Zero-Shot Decoder Non-Transferability (ZSDN). On identity tasks with iso-architectural models trained on identical data but from different seeds, self-decoding achieves more than 0.91 exact match and 0.98 token accuracy, while zero-shot cross-decoding collapses to chance with no exact matches. This separation arises without injected secrets or adversarial training, and is corroborated by weight-space distances and attention-divergence diagnostics. We interpret ZSDN as model binding, a latent-based authentication and access-control mechanism that holds even when the architecture and training recipe are public: the encoder's hidden-state representation deterministically reveals the plaintext, yet only the correctly keyed decoder reproduces it zero-shot. We formally define ZSDN and a decoder-binding advantage metric, and outline deployment considerations for secure artificial intelligence (AI) pipelines. Finally, we discuss learnability risks (e.g., adapter alignment) and propose mitigations. MoBLE offers a lightweight, accelerator-friendly approach to secure AI deployment in safety-critical domains, including aviation and cyber-physical systems.
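The decoder-binding advantage mentioned in the abstract can be sketched as a score gap between self-decoding and zero-shot cross-decoding. The sketch below is a minimal illustration under assumed conventions: the function names (`exact_match`, `token_accuracy`, `decoder_binding_advantage`) and the toy token sequences are our own, not the paper's implementation, and real use would feed actual decoder outputs into these metrics.

```python
def exact_match(preds, targets):
    """Fraction of sequences decoded perfectly (sequence-level accuracy)."""
    assert len(preds) == len(targets)
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)


def token_accuracy(preds, targets):
    """Fraction of individual tokens decoded correctly, position-wise."""
    correct = total = 0
    for p, t in zip(preds, targets):
        total += len(t)
        correct += sum(a == b for a, b in zip(p, t))
    return correct / total


def decoder_binding_advantage(self_preds, cross_preds, targets):
    """ZSDN separation: self-decoding exact match minus cross-decoding exact match.

    A value near 1.0 indicates strong decoder binding (only the matching
    decoder recovers the plaintext); a value near 0.0 indicates the latent
    transfers freely between decoders.
    """
    return exact_match(self_preds, targets) - exact_match(cross_preds, targets)


# Toy illustration: the self decoder reconstructs perfectly, while the
# mismatched (cross) decoder produces near-random token sequences.
targets = [[1, 2, 3], [4, 5, 6]]
self_decoded = [[1, 2, 3], [4, 5, 6]]   # hypothetical self-decoder output
cross_decoded = [[9, 9, 9], [4, 0, 0]]  # hypothetical cross-decoder output

print(decoder_binding_advantage(self_decoded, cross_decoded, targets))  # 1.0
```

An authentication check would then threshold this gap: a candidate decoder is accepted only if its exact-match score against a held-out challenge set exceeds the self-decoding band reported in the paper.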