LLM-based ambiguity detection in natural language instructions for collaborative surgical robots

📅 2025-07-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address safety risks arising from ambiguity in natural language instructions in safety-critical human-robot collaboration scenarios such as surgical assistance, this paper proposes a large language model (LLM)-based ambiguity detection framework. Methodologically, it combines multi-prompt ensembling, chain-of-thought reasoning, and conformal prediction to systematically identify linguistic, contextual, procedural, and critical ambiguities. Built on open-source LLMs, including Llama 3.2 11B and Gemma 3 12B, the framework gains robustness from prompt engineering and evaluator ensembling. On a surgical instruction ambiguity classification task, it achieves over 60% accuracy, outperforming baseline approaches. The framework enables ambiguity alerts before robot action and supports robot-triggered adaptive safety responses, contributing toward high-reliability human-robot collaboration.

📝 Abstract
Ambiguity in natural language instructions poses significant risks in safety-critical human-robot interaction, particularly in domains such as surgery. To address this, we propose a framework that uses Large Language Models (LLMs) for ambiguity detection specifically designed for collaborative surgical scenarios. Our method employs an ensemble of LLM evaluators, each configured with distinct prompting techniques to identify linguistic, contextual, procedural, and critical ambiguities. A chain-of-thought evaluator is included to systematically analyze instruction structure for potential issues. Individual evaluator assessments are synthesized through conformal prediction, which yields non-conformity scores based on comparison to a labeled calibration dataset. Evaluating Llama 3.2 11B and Gemma 3 12B, we observed classification accuracy exceeding 60% in differentiating ambiguous from unambiguous surgical instructions. Our approach improves the safety and reliability of human-robot collaboration in surgery by offering a mechanism to identify potentially ambiguous instructions before robot action.
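The conformal-prediction step described above can be sketched minimally as a split conformal (one-class) test: non-conformity scores from a labeled calibration set of unambiguous instructions define what "normal" looks like, and a new instruction is flagged when its ensemble ambiguity score is too extreme relative to that set. Everything below is illustrative: the calibration values, the significance level, and the score scale are assumptions, not the paper's actual data or implementation.

```python
def conformal_p_value(score, calibration_scores):
    """p-value: fraction of calibration non-conformity scores at
    least as extreme as the new score, with the usual +1 correction."""
    n = len(calibration_scores)
    ge = sum(1 for s in calibration_scores if s >= score)
    return (ge + 1) / (n + 1)

# Hypothetical calibration set: ensemble ambiguity scores (0 = clearly
# unambiguous, 1 = highly ambiguous) for labeled UNAMBIGUOUS instructions.
calibration = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.40]

def flag_ambiguous(score, alpha=0.2):
    """Flag an instruction as ambiguous when its score does not conform
    to the unambiguous calibration set at significance level alpha."""
    return conformal_p_value(score, calibration) <= alpha

print(flag_ambiguous(0.9))   # True: far above the calibration range
print(flag_ambiguous(0.15))  # False: typical of unambiguous examples
```

With only 8 calibration points the smallest attainable p-value is 1/9, so alpha must exceed that for any instruction to be flagged; a real system would calibrate on a much larger labeled set.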
Problem

Research questions and friction points this paper is trying to address.

Detect ambiguity in surgical robot instructions using LLMs
Improve safety in human-robot surgical collaboration via ambiguity identification
Evaluate multiple LLMs for classifying ambiguous surgical commands
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM ensemble for surgical instruction ambiguity detection
Conformal prediction synthesizes evaluator assessments
Chain-of-thought analysis identifies instruction structure issues
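The ensemble idea in the points above can be sketched as several evaluators, each prompted to probe one ambiguity type, scoring the same instruction, with the scores averaged before the conformal-prediction step. The prompt wording and the `query_llm` callable interface are assumptions for illustration, not the paper's actual prompts or API.

```python
# Hypothetical prompt templates, one per ambiguity type from the abstract.
PROMPT_TEMPLATES = {
    "linguistic": "Rate from 0 to 1 how lexically ambiguous this instruction is: {instr}",
    "contextual": "Rate from 0 to 1 how much this instruction depends on unstated context: {instr}",
    "procedural": "Rate from 0 to 1 how under-specified the surgical action is: {instr}",
    "chain_of_thought": (
        "Analyze the structure of this instruction step by step, "
        "then rate its overall ambiguity from 0 to 1: {instr}"
    ),
}

def ensemble_score(instruction, query_llm):
    """Average per-evaluator ambiguity scores. `query_llm` is a callable
    that sends one prompt to a model (e.g. Llama 3.2 11B) and returns a
    float in [0, 1] parsed from its answer."""
    scores = [
        query_llm(template.format(instr=instruction))
        for template in PROMPT_TEMPLATES.values()
    ]
    return sum(scores) / len(scores)

# Usage with a stub in place of a real model call:
stub = lambda prompt: 0.75
print(ensemble_score("retract the tissue a bit", stub))  # 0.75
```

Passing the model call in as a parameter keeps the sketch runnable without any LLM backend; a real deployment would feed the returned score into the conformal-prediction check.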