Memory-based Language Models: An Efficient, Explainable, and Eco-friendly Approach to Large Language Modeling

📅 2025-10-25
🤖 AI Summary
To address the high energy consumption, low interpretability, and heavy computational dependency of large language models (LLMs), this paper proposes memory-based language modeling (Memory-based LM), which replaces deep neural networks with approximate k-nearest neighbor (k-NN) classification for efficient, environmentally sustainable, and interpretable next-token prediction. The method runs entirely on CPU, using fast approximate nearest-neighbor retrieval; the lightweight OLIFANT implementation models token-sequence patterns directly in memory, yielding strong memorization capacity and fully transparent decision-making. Experimental results show that Memory-based LM achieves next-token prediction accuracy comparable to GPT-2 and GPT-Neo on standard language modeling benchmarks, while reducing inference latency by 47% and cutting carbon emissions by 92%. These improvements substantially enhance both sustainability and practical deployability, offering a viable alternative to resource-intensive LLMs.

📝 Abstract
We present memory-based language modeling as an efficient, eco-friendly alternative to deep neural network-based language modeling. It offers log-linearly scalable next-token prediction performance and strong memorization capabilities. Implementing fast approximations of k-nearest neighbor classification, memory-based language modeling leaves a relatively small ecological footprint both in training and in inference mode, as it relies fully on CPUs and attains low token latencies. Its internal workings are simple and fully transparent. We compare our implementation of memory-based language modeling, OLIFANT, with GPT-2 and GPT-Neo on next-token prediction accuracy, estimated emissions and speeds, and offer some deeper analyses of the model.
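The core idea can be illustrated with a minimal sketch: store every fixed-length token context together with the token that followed it, then predict the next token by majority vote over the k stored contexts that best match the query. This is only a toy illustration of memory-based next-token prediction, not the OLIFANT implementation, which relies on fast approximate k-NN retrieval rather than the exhaustive comparison shown here.

```python
from collections import Counter

def build_memory(tokens, n=3):
    """Store every length-n context paired with the token that followed it."""
    return [(tuple(tokens[i:i + n]), tokens[i + n])
            for i in range(len(tokens) - n)]

def knn_predict(memory, context, k=3):
    """Predict the next token by majority vote over the k stored contexts
    with the highest positional overlap with the query context."""
    scored = sorted(
        memory,
        key=lambda item: sum(a == b for a, b in zip(item[0], context)),
        reverse=True,
    )
    votes = Counter(next_tok for _, next_tok in scored[:k])
    return votes.most_common(1)[0][0]

tokens = "the cat sat on the mat the cat sat on".split()
memory = build_memory(tokens, n=3)
print(knn_predict(memory, ("the", "cat", "sat"), k=3))  # prints "on"
```

Because the model is just a table of stored contexts, every prediction can be traced back to the specific training examples that voted for it, which is the source of the transparency the abstract describes.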
Problem

Research questions and friction points this paper is trying to address.

Proposes an efficient, eco-friendly alternative to deep neural language models
Enables transparent scalable next-token prediction using CPU-based implementation
Reduces ecological footprint through fast k-nearest neighbor approximations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Memory-based language modeling replaces deep neural networks
Uses fast k-nearest neighbor approximations for predictions
Fully CPU-based operation enables ecological efficiency
Antal van den Bosch
Utrecht University
Computational Linguistics · Digital Humanities · Artificial Intelligence · Machine Learning · Text Analytics
Ainhoa Risco Patón
Utrecht University
Teun Buijse
Utrecht University
Peter Berck
Lund University
Maarten van Gompel
Royal Netherlands Academy of Arts and Sciences