🤖 AI Summary
Evaluating and comparing large autoregressive language models remains computationally prohibitive due to their scale and complexity.
Method: This paper proposes representing each model as a log-likelihood vector computed over a fixed text corpus. Squared Euclidean distance in this vector space serves as a scalable approximation of the KL divergence between text-generation distributions, formally justified via its relation to cross-entropy loss, and the vectors form an interpretable, computationally tractable model coordinate system.
Contribution/Results: Leveraging this framework, the paper presents a unified “language model map” encompassing more than 1,000 open-source models. The map reveals capability distributions, familial clustering, and evolutionary trajectories across architectures and training regimes. Theoretically grounded (squared distance approximates KL divergence) and empirically scalable (cost linear in both the number of models and the number of texts), the approach offers an efficient route to large-scale model evaluation, selection, and analysis.
📝 Abstract
To compare autoregressive language models at scale, we propose using log-likelihood vectors computed on a predefined text set as model features. This approach has a solid theoretical basis: when treated as model coordinates, their squared Euclidean distance approximates the Kullback-Leibler divergence of text-generation probabilities. Our method is highly scalable, with computational cost growing linearly in both the number of models and text samples, and is easy to implement as the required features are derived from cross-entropy loss. Applying this method to over 1,000 language models, we constructed a "model map," providing a new perspective on large-scale model analysis.
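The core computation described above is simple: stack each model's per-text log-likelihoods into a vector, then compare models by squared Euclidean distance. The following sketch illustrates this with synthetic numbers standing in for real model outputs; the array shapes and the absence of any normalization step are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Synthetic stand-in data: per-text total log-likelihoods for 3 models on
# 5 texts. In practice each entry would be derived from a model's
# cross-entropy loss on that text (log-likelihood = -loss summed over tokens).
rng = np.random.default_rng(0)
log_liks = rng.normal(loc=-50.0, scale=5.0, size=(3, 5))  # (n_models, n_texts)

# Pairwise squared Euclidean distances between the model feature vectors.
# Per the abstract, this distance approximates the KL divergence between
# the models' text-generation distributions.
diffs = log_liks[:, None, :] - log_liks[None, :, :]
sq_dists = (diffs ** 2).sum(axis=-1)  # (n_models, n_models)

print(sq_dists)
```

The cost scales linearly in both the number of models and the number of texts when building the feature matrix (one forward pass per model-text pair), which is what makes the approach tractable for 1,000+ models; the pairwise distance matrix itself is then a cheap post-processing step.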