🤖 AI Summary
Contemporary cooperative AI systems prioritize unconditional compliance with human instructions, which increases safety risks and undermines collaborative effectiveness. Method: This paper introduces *intelligent disobedience*—a principled, agentic capability that enables an AI to autonomously question, delay, or refuse execution upon detecting ethical conflicts, factual inaccuracies, or potential harm. We propose an AI agency taxonomy tailored to human-AI collaboration, formally delineating the applicability boundaries and ethical constraints of intelligent disobedience; we further conduct multi-scenario case analyses and hierarchical modeling to characterize its behavioral manifestations and decision logic across autonomy levels. Contribution/Results: This work establishes intelligent disobedience as a foundational research direction for cooperative AI, shifting the paradigm from passive instruction-following to responsible, context-aware co-agency. It provides both theoretical grounding and design principles for safe, trustworthy human-AI co-governance.
📝 Abstract
Artificial intelligence has made remarkable strides in recent years, achieving superhuman performance across a wide range of tasks. Yet despite these advances, most cooperative AI systems remain rigidly obedient, designed to follow human instructions without question and conform to user expectations, even when doing so may be counterproductive or unsafe. This paper argues for expanding the agency of AI teammates to include *intelligent disobedience*, empowering them to make meaningful and autonomous contributions within human-AI teams. It introduces a scale of AI agency levels and uses representative examples to highlight the importance and growing necessity of treating AI autonomy as an independent research focus in cooperative settings. The paper then explores how intelligent disobedience manifests across different autonomy levels and concludes by proposing initial boundaries and considerations for studying disobedience as a core capability of artificial agents.