Continuous Thought Machines

📅 2025-05-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Most deep learning models abstract away the temporal dynamics and synchronization of neuronal activity, leaving a fundamental gap with biological neural computation. To bridge this gap, the authors propose the Continuous Thought Machine (CTM), an architecture that treats neural dynamics unfolding over internal ticks, and synchronization between neurons, as its core differentiable representations. Its representational primitives are neuron-level temporal processing, where each neuron applies its own learned weights to a history of incoming signals, and neural synchronization used as a latent representation. The CTM also supports adaptive computation, exiting early on simple instances and computing longer on harder ones, while remaining intrinsically interpretable and end-to-end differentiable. Evaluated across diverse tasks, including ImageNet-1K classification, 2D maze solving, sorting, parity computation, question answering, and reinforcement learning, the CTM demonstrates strong performance on problems requiring complex sequential reasoning. The work presents a step toward more biologically plausible AI systems with robust temporal intelligence.

📝 Abstract
Biological brains demonstrate complex neural activity, where the timing and interplay between neurons are critical to how brains process information. Most deep learning architectures simplify neural activity by abstracting away temporal dynamics. In this paper we challenge that paradigm. By incorporating neuron-level processing and synchronization, we can effectively reintroduce neural timing as a foundational element. We present the Continuous Thought Machine (CTM), a model designed to leverage neural dynamics as its core representation. The CTM has two core innovations: (1) neuron-level temporal processing, where each neuron uses unique weight parameters to process a history of incoming signals; and (2) neural synchronization employed as a latent representation. The CTM aims to strike a balance between oversimplified neuron abstractions that improve computational efficiency, and biological realism. It operates at a level of abstraction that effectively captures essential temporal dynamics while remaining computationally tractable for deep learning. We demonstrate the CTM's strong performance and versatility across a range of challenging tasks, including ImageNet-1K classification, solving 2D mazes, sorting, parity computation, question-answering, and RL tasks. Beyond displaying rich internal representations and offering a natural avenue for interpretation owing to its internal process, the CTM is able to perform tasks that require complex sequential reasoning. The CTM can also leverage adaptive compute, where it can stop earlier for simpler tasks, or keep computing when faced with more challenging instances. The goal of this work is to share the CTM and its associated innovations, rather than pushing for new state-of-the-art results. To that end, we believe the CTM represents a significant step toward developing more biologically plausible and powerful artificial intelligence systems.
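The two core innovations in the abstract can be sketched in a few lines of pure Python. This is a toy illustration only: the per-neuron weight vectors, the tanh nonlinearity, the sine input, and the pairwise inner-product form of the synchronization latent are assumptions for illustration, not the paper's exact formulation.

```python
import math
import random

random.seed(0)

HISTORY = 4   # length of the pre-activation history each neuron sees
NEURONS = 3   # toy network size

# Innovation (1): each neuron owns *unique* weights that it applies to
# its own history of incoming signals (neuron-level temporal processing).
neuron_weights = [[random.uniform(-1, 1) for _ in range(HISTORY)]
                  for _ in range(NEURONS)]

def step(histories):
    """One internal tick: each neuron maps its history to a post-activation."""
    posts = []
    for n in range(NEURONS):
        z = sum(w * x for w, x in zip(neuron_weights[n], histories[n]))
        posts.append(math.tanh(z))
    return posts

# Unroll over several internal ticks, recording post-activations.
histories = [[0.0] * HISTORY for _ in range(NEURONS)]
trace = []
for t in range(6):
    inp = math.sin(t)                      # shared toy input signal
    for n in range(NEURONS):
        histories[n] = histories[n][1:] + [inp]
    trace.append(step(histories))

# Innovation (2): synchronization between neurons i and j as the inner
# product of their post-activation traces over time; the collection of
# pairwise values serves as the latent representation.
def sync(i, j):
    return sum(trace[t][i] * trace[t][j] for t in range(len(trace)))

latent = [sync(i, j) for i in range(NEURONS) for j in range(i, NEURONS)]
print(len(latent))  # upper-triangular pairs: 3 neurons give 6 entries
```

In this sketch the latent grows with neuron pairs rather than neuron count, which is one way a synchronization-based representation can differ from a plain activation vector.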
Problem

Research questions and friction points this paper is trying to address.

Incorporating neural timing to enhance information processing
Balancing biological realism with computational efficiency
Enabling complex sequential reasoning in AI systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neuron-level temporal processing with unique weights
Neural synchronization as latent representation
Adaptive compute for varying task complexity
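The adaptive-compute idea above can be sketched as an early-exit loop over internal ticks. This is a minimal sketch assuming an entropy-based certainty criterion; the threshold, the toy tick function, and the `run_with_adaptive_compute` helper are hypothetical and not the CTM's actual stopping rule.

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)

def run_with_adaptive_compute(tick_fn, max_ticks=10, entropy_threshold=0.3):
    """Run internal ticks; stop early once the prediction is confident enough."""
    for t in range(1, max_ticks + 1):
        probs = tick_fn(t)
        if entropy(probs) < entropy_threshold:
            return probs, t          # early exit on an easy instance
    return probs, max_ticks          # hard instance: used the full budget

# Toy tick function over 3 classes: confidence sharpens with each tick.
def toy_tick(t):
    p_top = 1 - 0.5 ** t             # mass on the winning class
    rest = (1 - p_top) / 2
    return [p_top, rest, rest]

probs, ticks_used = run_with_adaptive_compute(toy_tick)
print(ticks_used)
```

Easier inputs sharpen the distribution sooner and exit with fewer ticks; harder ones keep computing up to the budget, which is the behavior the bullet describes.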
Luke Darlow
Sakana AI
Ciaran Regan
Sakana AI, University of Tsukuba
Sebastian Risi
Professor, IT University of Copenhagen
Artificial Intelligence, Neural Networks, Neuroevolution, Artificial Life
Jeffrey Seely
Sakana AI
Llion Jones
Sakana AI