Mapping 1,000+ Language Models via the Log-Likelihood Vector

📅 2025-02-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Evaluating and comparing large autoregressive language models remains computationally prohibitive due to their scale and complexity. Method: This paper proposes representing models as log-likelihood vectors over a fixed, predefined text corpus. It introduces the squared Euclidean distance in this vector space as a scalable approximation of KL divergence—formally justified via cross-entropy loss minimization—and constructs an interpretable, computationally tractable model coordinate system. Contribution/Results: Leveraging this framework, we present the first unified “language model map” encompassing 1,000+ open-source models. The map systematically reveals capability distributions, familial clustering, and evolutionary trajectories across architectures and training regimes. Theoretically grounded (distance approximates KL divergence) and empirically scalable (linear time complexity in model count), our approach establishes a new paradigm for efficient large-model evaluation, selection, and mechanistic analysis.
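In symbols, the core construction runs as follows (notation mine, reconstructed from the summary rather than copied from the paper): each model $i$ is represented by its log-likelihood vector over a fixed text set $\{t_1,\dots,t_N\}$, and the squared Euclidean distance between two such vectors is the quantity claimed to approximate KL divergence.

```latex
% Log-likelihood vector of model i over the fixed text set
v_i = \bigl(\log p_i(t_1),\, \dots,\, \log p_i(t_N)\bigr)

% Squared Euclidean distance between models i and j,
% used as a scalable proxy for KL divergence of their
% text-generation distributions
\lVert v_i - v_j \rVert^2
  = \sum_{n=1}^{N} \bigl(\log p_i(t_n) - \log p_j(t_n)\bigr)^2
```

Since $-\log p_i(t_n)$ is exactly the cross-entropy loss of model $i$ on text $t_n$, the features fall out of standard evaluation code, which is what makes the approach easy to implement.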

📝 Abstract
To compare autoregressive language models at scale, we propose using log-likelihood vectors computed on a predefined text set as model features. This approach has a solid theoretical basis: when treated as model coordinates, their squared Euclidean distance approximates the Kullback-Leibler divergence of text-generation probabilities. Our method is highly scalable, with computational cost growing linearly in both the number of models and text samples, and is easy to implement as the required features are derived from cross-entropy loss. Applying this method to over 1,000 language models, we constructed a "model map," providing a new perspective on large-scale model analysis.
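The downstream geometry is simple enough to sketch end to end. Below is a minimal illustration, assuming a precomputed matrix of per-text log-likelihoods (random data stands in for real model outputs; the column-centering step is an assumption of this sketch, not necessarily the paper's exact recipe). It computes pairwise squared Euclidean distances between model vectors and a 2-D "model map" via PCA.

```python
import numpy as np

# Hypothetical log-likelihood matrix: rows are models, columns are texts.
# In the paper's setup each entry is -1 * the cross-entropy loss of a
# model on a fixed text; here random data stands in for illustration.
rng = np.random.default_rng(0)
L = rng.normal(size=(5, 100))  # 5 models, 100 texts

# Center each text's column so distances reflect relative model behavior
# (an assumption of this sketch).
Lc = L - L.mean(axis=0, keepdims=True)

# Pairwise squared Euclidean distances between model vectors -- the
# quantity the paper argues approximates KL divergence between models.
sq_norms = (Lc ** 2).sum(axis=1)
D = sq_norms[:, None] + sq_norms[None, :] - 2 * Lc @ Lc.T
D = np.maximum(D, 0.0)  # clip tiny negatives from floating-point error

# A 2-D "model map": project model vectors onto their top 2 principal axes.
U, S, Vt = np.linalg.svd(Lc, full_matrices=False)
coords = U[:, :2] * S[:2]

print(D.shape, coords.shape)  # (5, 5) (5, 2)
```

Note the scaling behavior the abstract emphasizes: building the feature matrix costs one forward pass per (model, text) pair, i.e., linear in both the number of models and the number of texts.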
Problem

Research questions and friction points this paper is trying to address.

Comparing autoregressive language models at scale is computationally prohibitive
Model analysis needs cheap, theoretically grounded features
No unified, scalable "model map" exists for 1,000+ models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Log-likelihood vectors on a fixed text set as model features
Squared Euclidean distance approximates KL divergence
Computational cost linear in both models and texts