🤖 AI Summary
Current evaluation methods for conversational AI struggle to assess whether clinical obligations are met across a full dialogue; in particular, they do not check the order in which information is gathered or produce auditable evidence. This work proposes OIP-SCE (Obligatory-Information Phase Structured Compliance Evaluation), a phase-based evaluation method that checks whether every required clinical obligation is met, in the right order, with evidence clinicians can review. Phase-level evidence turns clinical policy into shared, actionable steps, which makes complex rules practical to apply and audit. Two case studies, respiratory history taking and benefits verification, illustrate how OIP-SCE aligns AI capability with clinical workflow and supports routine, safe use.
📝 Abstract
Conversational AI is starting to support real clinical work, but most evaluation methods miss how compliance depends on the full course of a conversation. We introduce Obligatory-Information Phase Structured Compliance Evaluation (OIP-SCE), an evaluation method that checks whether every required clinical obligation is met, in the right order, with clear evidence for clinicians to review. This makes complex rules practical and auditable, helping close the gap between technical progress and what healthcare actually needs. We demonstrate the method in two case studies (respiratory history, benefits verification) and show how phase-level evidence turns policy into shared, actionable steps. By giving clinicians control over what to check and engineers a clear specification to implement, OIP-SCE provides a single, auditable evaluation surface that aligns AI capability with clinical workflow and supports routine, safe use.
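To make the idea of a phase-ordered obligation check concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Phase` and `Finding` types, the `evaluate` function, the obligation labels, and the ordering rule (an obligation counts as in order only if it is first satisfied after every earlier phase's evidence) are all assumptions made for illustration. The sketch also assumes each dialogue turn has already been annotated with the obligations it satisfies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phase:
    """Hypothetical phase: a named stage with the obligations it requires."""
    name: str
    obligations: list


@dataclass
class Finding:
    """Hypothetical per-obligation result with a pointer to the evidence turn."""
    phase: str
    obligation: str
    met: bool
    in_order: bool
    evidence_turn: Optional[int]  # index of the first turn that satisfies it


def evaluate(phases, turns):
    """Check each obligation phase by phase.

    `turns` is a list of (text, labels) pairs, where `labels` is the set of
    obligation names that turn is assumed to satisfy (pre-annotated).
    """
    findings = []
    prev_phase_end = -1  # latest evidence turn used by earlier phases
    for phase in phases:
        phase_last = prev_phase_end
        for ob in phase.obligations:
            # First turn whose labels include this obligation, if any.
            idx = next(
                (i for i, (_, labels) in enumerate(turns) if ob in labels),
                None,
            )
            met = idx is not None
            in_order = met and idx > prev_phase_end
            findings.append(Finding(phase.name, ob, met, in_order, idx))
            if met:
                phase_last = max(phase_last, idx)
        prev_phase_end = phase_last
    return findings


# Illustrative data only; the labels are invented for this sketch.
phases = [
    Phase("identification", ["confirm_identity"]),
    Phase("respiratory_history", ["ask_cough_duration", "ask_fever"]),
]
turns = [
    ("Can you confirm your name and date of birth?", {"confirm_identity"}),
    ("How long have you had the cough?", {"ask_cough_duration"}),
    ("Thanks, that is all I need.", set()),
]

findings = evaluate(phases, turns)
compliant = all(f.met and f.in_order for f in findings)
missing = [f.obligation for f in findings if not f.met]
```

In this example the dialogue is flagged as non-compliant because `ask_fever` is never satisfied, and each satisfied obligation carries the index of the turn a reviewer could inspect as evidence.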