🤖 AI Summary
Computational metacognition has long suffered from theoretical fragmentation, terminological inconsistency, and architectural incomparability, impeding systematic analysis and benchmarking. To address this, we survey 35 computational metacognitive architectures (CMAs) through a unified analytical framework grounded in Flavell's tripartite model of metacognition: metacognitive knowledge, experience, and regulation. For each CMA, we examine how metacognitive experiences (e.g., introspective traces, arousal indicators) are represented, how they are processed algorithmically, and what functional benefits they yield. Across the surveyed systems, metacognitive experience is modeled along two tracks, symbolic and sub-symbolic, and reported evaluations indicate that integrating such experience can improve adaptability, explainability, and task performance. Crucially, we identify the absence of shared evaluation benchmarks as a key bottleneck hindering progress. Our work advances standardization and enables principled cross-architectural comparison in computational metacognition.
📝 Abstract
Inspired by human cognition, metacognition has gained significant attention for its potential to enhance autonomy, adaptability, and robust learning in artificial agents. Yet research on Computational Metacognitive Architectures (CMAs) remains fragmented: diverse theories, terminologies, and design choices have led to disjointed developments and limited comparability across systems. Existing overviews and surveys often remain at a broad, conceptual level, making it difficult to synthesize deeper insights into the underlying algorithms and representations, and into how successful they have been. We address this gap by performing an explorative systematic review of how CMAs model, store, remember, and process their metacognitive experiences, one of Flavell's (1979) three foundational components of metacognition. Following this organizing principle, we identify 35 CMAs that feature episodic introspective data ranging from symbolic event traces to sub-symbolic arousal metrics. We consider multiple aspects, from the underlying psychological theories to the content and structure of the collected data, the algorithms used, and the evaluation results, and derive a unifying perspective that allows us to compare in depth how different CMAs leverage metacognitive experiences for tasks such as error diagnosis, self-repair, and goal-driven learning. Our findings highlight both the promise of metacognitive experiences, in boosting adaptability, explainability, and overall system performance, and the persistent lack of shared standards or evaluation benchmarks.
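To make the two kinds of introspective content concrete, here is a minimal, illustrative sketch in Python. It is not taken from any surveyed architecture; all class and field names are hypothetical. It shows how an episodic metacognitive experience could pair a symbolic event trace with a sub-symbolic arousal signal, and how a simple episodic store might recall such episodes to support error diagnosis or self-repair.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MetacognitiveExperience:
    """One episodic record of introspective data (hypothetical schema).

    Combines a symbolic event trace (which cognitive step ran, in what
    context) with a sub-symbolic arousal value, mirroring the range of
    content the survey observes across CMAs.
    """
    timestamp: float                   # when the monitored cognitive event occurred
    event: str                         # symbolic label, e.g. "plan_step_failed"
    context: dict[str, Any] = field(default_factory=dict)  # task / goal bindings
    arousal: float = 0.0               # sub-symbolic signal, here assumed in [0, 1]


class ExperienceStore:
    """Minimal episodic store a meta-level could query for self-monitoring."""

    def __init__(self) -> None:
        self._episodes: list[MetacognitiveExperience] = []

    def record(self, exp: MetacognitiveExperience) -> None:
        self._episodes.append(exp)

    def recall(self, event: str) -> list[MetacognitiveExperience]:
        # Retrieve past experiences matching a symbolic event label,
        # e.g. to inform a re-planning or self-repair decision.
        return [e for e in self._episodes if e.event == event]


# Example: the meta-level logs a failed plan step with high arousal,
# then later recalls similar failures before deciding whether to re-plan.
store = ExperienceStore()
store.record(MetacognitiveExperience(
    timestamp=12.4, event="plan_step_failed",
    context={"goal": "fetch_object"}, arousal=0.9))
similar_failures = store.recall("plan_step_failed")
```

Real CMAs differ widely in how they realize both tracks; this sketch only fixes a vocabulary for discussing symbolic versus sub-symbolic experience content.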