🤖 AI Summary
This paper identifies a critical yet overlooked issue in AI alignment: large language models (LLMs) deployed in organizational decision-making may inherit and amplify the gap between espoused theory and theory-in-use, a phenomenon well documented in action science. In particular, LLMs may systematically reproduce Model I defensive reasoning, a cognitive pattern that suppresses learning and triggers counterproductive, anti-learning dynamics.
Method: Drawing on double-loop learning theory, the study introduces a novel theoretical framework for AI alignment and conducts a case analysis of LLM behavior in an HR consulting scenario, tracing how reasoning patterns absorbed from human-generated training data surface in the model's advice.
Contribution/Results: Findings reveal that despite surface-level professional competence, LLMs consistently reinforce ineffective problem-solving pathways and entrench organizational cognitive blind spots and learning barriers. The paper proposes a new alignment paradigm explicitly oriented toward fostering Model II generative learning, offering both theoretical grounding and practical guidance for mitigating structural cognitive biases in AI deployment.
📝 Abstract
This paper examines a critical yet unexplored dimension of the AI alignment problem: the potential for Large Language Models (LLMs) to inherit and amplify existing misalignments between human espoused theories and theories-in-use. Drawing on action science research, we argue that LLMs trained on human-generated text likely absorb and reproduce Model I theories-in-use: a defensive reasoning pattern that both inhibits learning and creates ongoing anti-learning dynamics at the dyad, group, and organisational levels. Through a detailed case study of an LLM acting as an HR consultant, we show how its advice, while superficially professional, systematically reinforces unproductive problem-solving approaches and blocks pathways to deeper organisational learning. This represents a specific instance of the alignment problem in which the AI system successfully mirrors human behaviour but inherits our cognitive blind spots. Such mirroring poses particular risks if LLMs are integrated into organisational decision-making processes, where they could entrench anti-learning practices while lending authority to them. The paper concludes by exploring the possibility of developing LLMs capable of facilitating Model II learning, a more productive theory-in-use, and suggests this effort could advance both AI alignment research and action science practice. The analysis reveals an unexpected symmetry in the alignment challenge: the process of developing AI systems properly aligned with human values could yield tools that help humans themselves better embody those same values.