🤖 AI Summary
This study investigates the link between scaling laws in large model behavior and the emergence of internal representational structure. The authors train small Transformers on a controlled sequence modeling task (predicting the outputs of a hidden Markov model), a setting in which residual activations are known to linearly encode a belief distribution over latent states in a probability simplex. They report preliminary evidence of a correlation between how performance scales and how these internal representations change. The result offers early empirical support for the hypothesis that predictable changes in behavior stem from predictable changes in internal computational structure.
📝 Abstract
Modern large-scale deep learning exhibits two striking empirical phenomena: behavioural scaling laws (predictable performance gains with increasing scale) and emergent mechanisms (structured internal representations and circuits in deep neural networks). We hypothesise that these two phenomena are connected: that predictable changes in behaviour are the result of predictable changes in internal computational structure. In this paper, we report preliminary evidence of such a connection. We find a correlation between scaling patterns in performance and representations in small transformers trained to predict the outputs of a hidden Markov model, for which residual activations are known to linearly encode a belief distribution over latent states in a probability simplex.
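To make "a belief distribution over latent states in a probability simplex" concrete, the sketch below runs standard Bayesian filtering on a toy hidden Markov model. It is an illustration only: the 3-state, 2-symbol model, the names `update_belief` and `belief_trajectory`, and the convention that the chain transitions before emitting are assumptions made here, not details taken from the paper.

```python
def update_belief(belief, transition, emission, observation):
    """One step of Bayesian filtering for an HMM.

    belief[i]        : P(state = i | observations so far)
    transition[i][j] : P(next state = j | state = i)
    emission[j][o]   : P(observation = o | state = j)
    Assumed convention: the chain transitions first, then emits.
    """
    n = len(belief)
    # Predict: propagate the current belief through the transition matrix.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Correct: reweight each state by the likelihood of the observed symbol.
    unnormalized = [predicted[j] * emission[j][observation] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    # Normalize so the result lies on the probability simplex.
    return [u / total for u in unnormalized]


def belief_trajectory(prior, transition, emission, observations):
    """Return the belief after each observation, starting from the prior."""
    beliefs = [list(prior)]
    for obs in observations:
        beliefs.append(update_belief(beliefs[-1], transition, emission, obs))
    return beliefs


# Hypothetical toy HMM (not the one used in the paper).
TRANSITION = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
EMISSION = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.1, 0.9],
]
PRIOR = [1 / 3, 1 / 3, 1 / 3]

trajectory = belief_trajectory(PRIOR, TRANSITION, EMISSION, [0, 0, 1])
for step, belief in enumerate(trajectory):
    print(step, [round(p, 4) for p in belief])
```

Each belief vector is non-negative and sums to 1, so with three latent states every point of the trajectory lies in a triangle (the 2-simplex). The claim in the abstract is that vectors of this kind can be read out from the residual activations by a linear map.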