🤖 AI Summary
This work investigates the internal evolutionary dynamics of language models during pretraining and fine-tuning, using Kullback–Leibler (KL) divergence to characterize their trajectories in log-likelihood space. Methodologically, we build on a recently proposed coordinate mapping based on log-likelihood vectors and combine efficient KL divergence estimation with layer-wise analysis via the logit lens and comparisons across pretraining and fine-tuning checkpoints. Our key contributions are threefold: first, we discover that model evolution follows a “spiral” trajectory during pretraining and a “thread-like” progression across layers, patterns not previously observed. Second, we empirically demonstrate that the diffusion exponent in log-likelihood space is significantly lower than in parameter (weight) space, indicating stronger intrinsic constraints on model evolution in the former. Third, we validate that KL divergence remains comparable across model architectures. Collectively, these findings establish KL-based trajectories as a robust, interpretable, and architecture-agnostic paradigm for quantifying and analyzing language model behavior.
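To make the coordinate construction concrete, the sketch below shows one way to compute a per-document log-likelihood vector for a causal LM with Hugging Face transformers. The model identifiers, probe texts, and the simple mean-gap comparison at the end are illustrative assumptions; this is a minimal sketch of the general idea, not the paper's exact KL estimator.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def log_likelihood_vector(model_name, texts, device="cpu"):
    """Return one total log-likelihood per probe text (the model's coordinates)."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device).eval()
    lls = []
    with torch.no_grad():
        for text in texts:
            ids = tok(text, return_tensors="pt").input_ids.to(device)
            out = model(ids, labels=ids)
            # out.loss is the mean NLL per predicted token; scale back to a total log-prob
            lls.append(-out.loss.item() * (ids.shape[1] - 1))
    return torch.tensor(lls)

# Placeholder probe corpus and model identifiers (not from the paper).
probe_texts = ["The quick brown fox jumps over the lazy dog.", "..."]
v_p = log_likelihood_vector("gpt2", probe_texts)
v_q = log_likelihood_vector("distilgpt2", probe_texts)

# Per-text log-likelihoods sum over tokens, which keeps them roughly comparable
# even when the two models use different tokenizers. The mean gap below is only
# a simple discrepancy in this coordinate space, not the paper's KL estimator.
print((v_p - v_q).mean())
```

Because each coordinate is a whole-document log-probability rather than a per-token score, vectors computed this way can be compared across models with different tokenizers and architectures, which is the property the KL-based analysis relies on.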
📝 Abstract
A recently proposed method enables efficient estimation of the KL divergence between language models, including models with different architectures, by assigning coordinates based on log-likelihood vectors. To better understand the behavior of this metric, we systematically evaluate KL divergence across a wide range of conditions using publicly available language models. Our analysis covers comparisons between pretraining checkpoints, between fine-tuned models and their base models, and between layers via the logit lens. We find that trajectories of language models, as measured by KL divergence, exhibit a spiral structure during pretraining and thread-like progressions across layers. Furthermore, we show that, in terms of diffusion exponents, model trajectories in log-likelihood space are more constrained than those in weight space.
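As an illustration of what a diffusion exponent measures, the sketch below fits the standard anomalous-diffusion relation MSD(τ) ∝ τ^α to a trajectory of checkpoint coordinates. The function name, the lag range, and the synthetic random-walk example are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def diffusion_exponent(traj, max_lag=None):
    """Fit MSD(tau) ~ tau^alpha for a trajectory of coordinate vectors.

    traj: array of shape (T, D), one row per checkpoint.
    Returns the fitted exponent alpha.
    """
    traj = np.asarray(traj, dtype=float)
    n_steps = traj.shape[0]
    max_lag = max_lag or n_steps // 2
    lags = np.arange(1, max_lag + 1)
    # mean squared displacement at each time lag
    msd = np.array([
        np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
        for lag in lags
    ])
    # slope of the log-log fit is the diffusion exponent
    alpha, _ = np.polyfit(np.log(lags), np.log(msd), 1)
    return alpha

# Sanity check on a synthetic random walk: alpha should be close to 1;
# sub-diffusive (more constrained) trajectories give alpha < 1.
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=(200, 5)), axis=0)
print(round(diffusion_exponent(random_walk), 2))
```

An exponent near 1 corresponds to ordinary random-walk diffusion, while smaller values indicate a more constrained trajectory, which is the sense in which log-likelihood-space trajectories are reported to be more constrained than weight-space ones.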