🤖 AI Summary
This work investigates whether Transformers can recall and extrapolate the states of previously observed linear dynamical systems when prompted in-context with symbolic labels. Method: The authors design a synthetic dynamical-systems task, employing trajectory pretraining, cross-distribution generalization experiments, edge-pruning ablations, and validation on OLMo checkpoints to dissect in-context learning (ICL) mechanisms. Contribution/Results: They identify two decoupled, sequentially emergent mechanisms underlying ICL: (i) symbolic-label-guided associative recall, which identifies and retrieves the target system's state, and (ii) label-agnostic, quasi-Bayesian sequential prediction, which continues the trajectory. These mechanisms exhibit distinct learning dynamics and separate phase transitions, and next-token prediction arises from their interaction. This dual-mechanism account explains the temporal gap between first- and second-token performance jumps on an ICL translation task evaluated across OLMo training checkpoints, suggesting that multi-mechanism phase transitions are not an artifact of the toy setting but also occur in large language models.
📝 Abstract
We introduce a new family of toy problems that combine features of linear-regression-style continuous in-context learning (ICL) with discrete associative recall. We pretrain transformer models on sample traces from this toy problem, specifically symbolically-labeled interleaved state observations from randomly drawn linear deterministic dynamical systems. We study whether a transformer model can recall the state of a sequence previously seen in its context when prompted to do so with the corresponding in-context label. Taking a closer look at this task, it becomes clear that the model must perform two functions: (1) identify which system's state should be recalled and apply that system to its last seen state, and (2) continue applying the correct system to predict the subsequent states. Training dynamics reveal that the first capability emerges well into a model's training. Surprisingly, the second capability, continuing the prediction of a resumed sequence, develops much earlier.
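The toy data described above can be sketched as follows. This is a minimal illustration, not the paper's exact recipe: the segment lengths, label tokens (`<L0>`, `<L1>`, ...), system dimension, and the spectral-radius normalization are all assumptions made here for concreteness.

```python
import numpy as np

def sample_system(dim=2, rng=None):
    # Random linear deterministic system x_{t+1} = A x_t, rescaled to
    # spectral radius 1 so trajectories stay bounded (an assumption,
    # not necessarily the paper's parameterization).
    rng = rng or np.random.default_rng()
    A = rng.normal(size=(dim, dim))
    A /= np.max(np.abs(np.linalg.eigvals(A)))
    x0 = rng.normal(size=dim)
    return A, x0

def make_trace(n_systems=3, seg_len=4, n_segments=6, dim=2, seed=0):
    """Interleave symbolically-labeled state observations from several
    randomly drawn systems. A repeated label implicitly asks the model
    to resume that system from its last seen state."""
    rng = np.random.default_rng(seed)
    systems = [sample_system(dim, rng) for _ in range(n_systems)]
    states = [x0 for _, x0 in systems]
    trace = []
    for _ in range(n_segments):
        i = int(rng.integers(n_systems))   # pick a system; may be a revisit
        A, _ = systems[i]
        trace.append(f"<L{i}>")            # discrete symbolic label token
        for _ in range(seg_len):
            trace.append(states[i].round(3).tolist())
            states[i] = A @ states[i]      # advance this system's state
    return trace
```

A trace is thus a sequence of label tokens, each followed by a run of continuous state observations; recalling a revisited system requires associating its label with the state it last reached.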
Via out-of-distribution experiments and a mechanistic analysis of model weights via edge pruning, we find that next-token prediction for this toy problem involves at least two separate mechanisms. One mechanism uses the discrete symbolic labels to perform the associative recall required to predict the first token when a previously seen sequence resumes. The second mechanism, which is largely agnostic to the discrete symbolic labels, performs a "Bayesian-style" prediction based on the previous token and the context. These two mechanisms have different learning dynamics.
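The label-agnostic "Bayesian-style" mechanism can be caricatured in closed form: weight each candidate dynamics by how well it explains the most recent transition, then predict with the weighted mixture. This is purely an illustrative sketch with an assumed Gaussian-style weighting and temperature `tau`; the paper's mechanism is implemented implicitly inside the transformer, not as an explicit formula.

```python
import numpy as np

def posterior_predict(candidates, x_prev, x_curr, tau=0.1):
    """Quasi-Bayesian next-state prediction: weight each candidate
    dynamics matrix A by how well it explains the latest observed
    transition x_prev -> x_curr, then return the posterior-weighted
    prediction for the next state (no labels consulted)."""
    errs = np.array([np.linalg.norm(x_curr - A @ x_prev) for A in candidates])
    w = np.exp(-errs**2 / tau)          # Gaussian-style likelihood weights
    w /= w.sum()                        # normalize to a posterior over systems
    return sum(wk * (A @ x_curr) for wk, A in zip(w, candidates))
```

Note that this predictor only needs the recent continuous observations and a hypothesis set of dynamics inferable from context, which is why it can emerge without the label-based recall circuit.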
To confirm that this multi-mechanism phenomenon (manifesting as separate phase transitions) is not just an artifact of our toy setting, we evaluated OLMo training checkpoints on an ICL translation task and observed a similar phenomenon: a decisive gap between the emergence of first-task-token performance and second-task-token performance.