No Certificate, No Execution: Certified Traces as a Foundation for Trustworthy AI Agents

📅 2026-05-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of ensuring trustworthy AI behavior in high-stakes, heavily regulated environments, where generative models, output safeguards, and post-hoc audits alone cannot prevent impermissible execution traces. The paper introduces the Proposal–Certification–Execution (PCE) framework, which treats trace permissibility as an explicit safety property that must be certified before execution. Central to PCE are the Permissibility Machine and a checkable certificate mechanism that together enforce a “no certificate, no execution” policy. The framework combines a policy system Π, an executable trace language, proof-carrying execution, and privacy-preserving certificates, and it proposes evaluating agents by which generated traces can be certified as permissible, shifting trustworthy AI from output monitoring toward pre-execution verification.

📝 Abstract

We argue that trustworthy AI agents, especially in high-stakes and policy-governed domains, should make execution conditional on certified traces rather than rely only on stronger generative models, output-level guardrails, or post-hoc audits. A generative agent may propose recommendations, tool calls, reports, or actions, but generation is not permission: an action may be computable yet impermissible, and individually permissible actions may compose into an impermissible trace. We formalize trustworthy agency through a Proposal–Certification–Execution (PCE) architecture: a probabilistic generating machine $M_G$ proposes candidate execution traces, a Permissibility Machine $M_\Pi$ certifies proposed traces under a policy system $\Pi$, and execution proceeds only for certified traces. The executable trace language is $L_{\mathrm{exec}} = L_G \cap L_{\mathrm{cert}}(M_\Pi)$. Before execution, a trace is a structured pre-execution record submitted for certification: it specifies intended steps, evidence, proposed tool calls, approvals, replayable computations, credentials, and execution conditions. This perspective complements chain-of-thought monitorability: visible reasoning may help detect misbehavior, but monitorability is not certifiability, and reasoning is only one component of a broader execution trace. The formal principle is simple: an agent-generated trace should execute only when it carries a checkable certificate witnessing permissibility under $\Pi$: no certificate, no execution. We develop certified traces and Permissibility Machines as foundations for trustworthy AI agents, connect trace certification to proof-carrying execution, proof memory, privacy, and zero-knowledge certificates, and propose evaluating agents by what generated traces can be safely certified for execution, not by output accuracy alone.
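
To make the gate in the abstract concrete, the following is a minimal toy sketch in Python, not the paper's construction. Everything in it is an illustrative assumption: the action names, the two policy rules, and the helpers `certify`, `check`, and `execute` are hypothetical. The certificate here is only a digest that binds it to one trace, and the checker simply re-evaluates the rules; a real certificate would carry a proof or other checkable evidence. The sketch shows two points from the abstract: individually permissible actions can compose into an impermissible trace, and execution is refused unless a certificate for that exact trace checks out.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional

Trace = tuple[str, ...]          # a proposed trace: ordered action names (toy model)
Rule = Callable[[Trace], bool]   # a policy rule judges the whole trace, not one step


def no_unknown_actions(trace: Trace) -> bool:
    # Hypothetical action vocabulary, assumed for illustration only.
    allowed = {"read_record", "summarize", "send_external_email"}
    return all(step in allowed for step in trace)


def no_exfiltration(trace: Trace) -> bool:
    # Each action is fine alone; a read followed later by an external send is not.
    if "read_record" not in trace:
        return True
    first_read = trace.index("read_record")
    return "send_external_email" not in trace[first_read + 1:]


# Stand-in for the policy system: a named set of trace-level rules.
POLICY: dict[str, Rule] = {
    "no_unknown_actions": no_unknown_actions,
    "no_exfiltration": no_exfiltration,
}


def digest(trace: Trace) -> str:
    # Binds a certificate to one specific trace.
    return hashlib.sha256("\x1f".join(trace).encode()).hexdigest()


@dataclass(frozen=True)
class Certificate:
    trace_digest: str
    rules_checked: tuple[str, ...]


def certify(trace: Trace) -> Optional[Certificate]:
    # Stand-in for the Permissibility Machine: issue a certificate or refuse.
    if all(rule(trace) for rule in POLICY.values()):
        return Certificate(digest(trace), tuple(POLICY))
    return None


def check(trace: Trace, cert: Optional[Certificate]) -> bool:
    # Independent check run by the executor before anything happens.
    return (
        cert is not None
        and cert.trace_digest == digest(trace)
        and set(cert.rules_checked) == set(POLICY)
        and all(rule(trace) for rule in POLICY.values())
    )


def execute(trace: Trace, cert: Optional[Certificate]) -> list[str]:
    # No certificate, no execution.
    if not check(trace, cert):
        raise PermissionError("no certificate, no execution")
    return [f"ran {step}" for step in trace]
```

In this sketch, `certify(("read_record", "summarize"))` returns a certificate and `execute` runs the trace, while `certify(("read_record", "send_external_email"))` returns `None` even though each action is certifiable on its own. Passing a certificate issued for a different trace also fails, because the digest no longer matches. In the abstract's terms, the traces that `execute` accepts play the role of $L_{\mathrm{exec}}$: proposed traces that also carry a valid certificate.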

Problem

Research questions and friction points this paper is trying to address.

trustworthy AI

certified traces

permissibility

execution control

policy compliance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Certified Traces

Permissibility Machine

Proposal–Certification–Execution (PCE)