🤖 AI Summary
Research on artificial consciousness is often confounded by human language priors, which make it hard to tell whether observed consciousness-like structures arise from task demands or from anthropocentric bias. This work proposes a generative research paradigm in which agents with no language and no concept of self are placed in a minimal environment, trained via multi-agent reinforcement learning, and develop communication under task pressure alone, so that any emergent structure can be causally attributed to task demands. As a proof of concept, the agents develop self-referential communication, including an echo-mismatch detection circuit that emerges from a specific environmental affordance rather than from task structure or architecture alone.
📝 Abstract
The question of whether artificial systems can be conscious remains open, in part because existing approaches either evaluate systems against theory-derived checklists (discriminative) or engineer consciousness-inspired modules directly (architectural); both leave open whether observed structures are artifacts of human language priors. We propose a generative methodology: emergent language (EL) in multi-agent reinforcement learning, where agents start from a minimal baseline (no language, no concept of self, minimal exposure to human text) and develop communication under task pressure alone, so that any emergent structure is causally attributable to task demands rather than to inherited human language priors. We position our methodology by discussing how EL serves as a generative tool for studying consciousness-relevant structure, including the role of environment complexity and the interpretation of emergent communication. As a proof of concept, we instantiate this methodology in a minimal environment and show that agents develop self-referential communication, including an echo-mismatch detection circuit that is not predicted by task structure or architecture alone but emerges from a specific environmental affordance.