🤖 AI Summary
This work proposes a training approach that combines intermediate-layer early exiting with reinforcement learning, enabling large language models to dynamically choose their inference depth per input and thereby cut redundant computation. By capping internal computation, the method pushes the model to externalize its reasoning into generated tokens; reinforcement learning, applied here for the first time to this setting, optimizes the early-exit policy, complemented by a post-training calibration stage. Experiments on small-scale models show that the strategy significantly lowers computational cost while preserving task performance, yielding efficient, adaptive inference.
📝 Abstract
We propose an architectural change and post-training pipeline that make LLMs more verbose reasoners by teaching the model to truncate its forward passes early. We augment an existing transformer architecture with an early-exit mechanism at intermediate layers and train the model to exit at shallower layers when the next token can be predicted without deep computation. After a calibration stage, we use reinforcement learning to incentivise the model to exit as early as possible while maintaining task performance. We provide preliminary results for small reasoning models, showing that they learn to adaptively reduce computation across tokens. We predict that, applied at the right scale, our approach can minimise the excess computation that reasoning models have at their disposal for non-myopic planning in their internal activations, reserving deep computation only for difficult-to-predict tokens.
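The mechanism described above can be sketched in a few lines. The following toy example is our own illustration, not the paper's implementation: the layer count, confidence threshold, and depth-penalty coefficient are all hypothetical, and a confidence-gated exit head plus a depth-penalized reward stands in for the paper's calibrated exit policy and RL objective.

```python
import numpy as np

rng = np.random.default_rng(0)

N_LAYERS, D_MODEL, VOCAB = 8, 16, 32
CONF_THRESHOLD = 0.9  # hypothetical exit-confidence threshold

# Toy per-layer weights and a shared unembedding used as the exit head.
layers = [rng.normal(scale=0.1, size=(D_MODEL, D_MODEL)) for _ in range(N_LAYERS)]
unembed = rng.normal(scale=0.1, size=(D_MODEL, VOCAB))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def forward_with_early_exit(h):
    """Run layers until the exit head is confident enough, then stop."""
    for depth, W in enumerate(layers, start=1):
        h = np.tanh(h @ W)                 # stand-in for a transformer block
        probs = softmax(h @ unembed)       # intermediate-layer prediction
        if probs.max() >= CONF_THRESHOLD:  # easy token: exit at a shallow layer
            return probs, depth
    return probs, N_LAYERS                 # hard token: use full depth

probs, depth = forward_with_early_exit(rng.normal(size=D_MODEL))

# A depth-penalized reward of this shape would incentivise early exits
# while maintaining task performance (LAMBDA is a hypothetical coefficient):
LAMBDA = 0.05
task_reward = 1.0  # e.g. 1 if the sampled answer is correct, else 0
reward = task_reward - LAMBDA * depth
```

In this sketch, per-token depth varies with prediction difficulty: confident tokens exit after few layers and pay a small depth penalty, while hard tokens run the full stack, mirroring the adaptive computation reduction the abstract describes.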