🤖 AI Summary
Large language model (LLM) agents exhibit unreliable adherence to corporate policies in business process automation. To address this, we propose a deterministic, transparent, and modular policy compliance framework comprising two phases: (1) an offline phase that compiles natural-language policy documents into verifiable guard code, and (2) a runtime phase that inserts lightweight, policy-agnostic guards before tool invocation—thereby decoupling policy enforcement from agent logic. This design improves interpretability, maintainability, and agility in policy updates. Experiments on the τ-bench Airlines testbed demonstrate that the framework effectively intercepts policy-violating actions. However, the empirical evaluation also uncovers critical deployment challenges, including incomplete policy coverage and difficulty in dynamically adapting guards to contextual changes. The framework thus advances policy-aware LLM agent deployment while surfacing key open issues for future work.
📝 Abstract
Large Language Model (LLM) agents hold promise as a flexible and scalable alternative to traditional business process automation, but they struggle to reliably follow complex company policies. In this study, we introduce a deterministic, transparent, and modular framework for enforcing business policy adherence in agentic workflows. Our method operates in two phases: (1) an offline build-time stage that compiles policy documents into verifiable guard code associated with tool use, and (2) a runtime integration where these guards ensure compliance before each agent action. We demonstrate our approach on the challenging τ-bench Airlines domain, showing encouraging preliminary results in policy enforcement, and further outline key challenges for real-world deployments.
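The two-phase design described above can be sketched in miniature: guards compiled offline become deterministic predicates that run before each tool invocation. The `Guard` class, `guarded_call` helper, and the refund example below are purely illustrative assumptions, not the paper's actual implementation or the τ-bench API.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical sketch: a guard is a deterministic predicate produced by the
# offline build-time stage from a policy document. Names here are assumptions.
@dataclass
class Guard:
    policy_id: str
    check: Callable[[dict], bool]  # True iff the proposed action complies

def guarded_call(tool: Callable[..., Any], guards: list[Guard], **args: Any) -> Any:
    """Runtime integration: evaluate every guard before invoking the tool,
    keeping policy enforcement decoupled from the agent's own logic."""
    for g in guards:
        if not g.check(args):
            raise PermissionError(f"Action blocked by policy {g.policy_id}")
    return tool(**args)

# Illustrative policy: refunds above a fixed limit are forbidden.
no_large_refund = Guard("REFUND-LIMIT", lambda a: a.get("amount", 0) <= 500)

def issue_refund(amount: float) -> str:
    return f"refunded {amount}"

print(guarded_call(issue_refund, [no_large_refund], amount=120))  # → refunded 120
```

Because the guard is plain code rather than a prompt instruction, a blocked call fails deterministically and the offending policy ID is reported, which is what makes the approach transparent and auditable.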