🤖 AI Summary
Autonomous AI agents exhibit unpredictable emergent behaviors that invalidate conventional verification techniques. Method: This paper introduces a dynamic probabilistic assurance paradigm that constructs formal event abstractions and Markov decision process (MDP) state models from runtime I/O observations, integrating online learning with probabilistic model checking to enable real-time, continuous, quantitative verification of agent behavior. Contribution/Results: The key innovation lies in embedding the dynamic probabilistic assurance mechanism directly into the runtime monitoring layer, enabling persistent evaluation of system failure probability under operational constraints. Experimental evaluation demonstrates that the framework delivers quantitative safety guarantees across diverse scenarios while maintaining high responsiveness and robustness.
📄 Abstract
The rapid evolution toward autonomous, agentic AI systems introduces significant risks due to their inherent unpredictability and emergent behaviors. These traits render traditional verification methods inadequate and necessitate a shift toward probabilistic guarantees, where the question is no longer whether a system will fail, but the probability of its failure within given constraints. This paper presents AgentGuard, a framework for runtime verification of agentic AI systems that provides continuous, quantitative assurance through a new paradigm called Dynamic Probabilistic Assurance. AgentGuard operates as an inspection layer that observes an agent's raw I/O and abstracts it into formal events corresponding to transitions in a state model. It then uses online learning to dynamically build and update a Markov Decision Process (MDP) that formally models the agent's emergent behavior. Using probabilistic model checking, the framework then verifies quantitative properties in real time.
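The abstract's pipeline (observe I/O events, learn MDP transition probabilities online, then model-check a quantitative property) can be sketched in miniature. The class and method names below are hypothetical illustrations, not the AgentGuard API; the checked property is bounded reachability of a failure state, a standard primitive of probabilistic model checking, computed here by backward induction rather than by a full checker such as PRISM or Storm.

```python
from collections import defaultdict

class OnlineMDP:
    """Sketch: learn an MDP from observed (state, action, next_state) events
    and check P_max(reach failure state within k steps).
    All names here are illustrative assumptions, not the paper's API."""

    def __init__(self):
        # (state, action) -> {next_state: observation count}
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, s, a, s_next):
        """Ingest one abstracted runtime event (a transition)."""
        self.counts[(s, a)][s_next] += 1

    def prob(self, s, a):
        """Maximum-likelihood transition distribution for (s, a)."""
        dist = self.counts[(s, a)]
        total = sum(dist.values())
        return {s2: n / total for s2, n in dist.items()}

    def max_reach_prob(self, start, failure, horizon):
        """Worst-case probability of reaching `failure` within `horizon`
        steps, via backward induction over the learned model."""
        states = {s for (s, _) in self.counts}
        states |= {s2 for d in self.counts.values() for s2 in d}
        v = {s: 1.0 if s == failure else 0.0 for s in states}
        for _ in range(horizon):
            nv = {}
            for s in states:
                if s == failure:
                    nv[s] = 1.0
                    continue
                actions = [a for (s0, a) in self.counts if s0 == s]
                nv[s] = max(
                    (sum(p * v[s2] for s2, p in self.prob(s, a).items())
                     for a in actions),
                    default=0.0,
                )
            v = nv
        return v.get(start, 0.0)
```

As the monitor streams in new events, `observe` updates the model and `max_reach_prob` can be re-evaluated continuously, which is the essence of the "dynamic" assurance loop described above.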