On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

📅 2026-06-01

🤖 AI Summary

This work addresses the challenge of efficiently building and managing very large numbers of persistent personalized models on top of trillion-parameter foundation models. It proposes parameter-efficient fine-tuning (PEFT) as a lightweight personalization substrate: a shared large model supplies general competence, while small trainable adapters encode user preferences, skills, tool habits, and memory-like updates. The authors present MinT as one infrastructure example that manages adapter identity, revision, provenance, evaluation, and serving residency, and they organize the problem around three scaling axes: Scale Up (stronger shared priors make small local updates more useful), Scale Down (how small adapters can be while remaining reliable), and Scale Out (many persistent adapted instances coexisting). The results suggest that compact adapters on a strong shared model can serve as a substrate for persistent personal models, offering a possible path toward deploying millions of them.

📝 Abstract

Parameter-efficient fine-tuning (PEFT) is usually treated as a cheaper alternative to full fine-tuning. We study a broader role: small trainable adapters as persistent local state on top of strong shared foundation models. In this framing, the base model provides shared competence while adapters carry instance-specific behavior such as preferences, skills, tool habits, and memory-like updates. We organize the problem around three scaling axes: Scale Up, where stronger shared priors make small local updates more useful; Scale Down, where we study how small adapters can be while remaining reliable; and Scale Out, where many persistent adapted instances coexist. MinT provides one infrastructure example for managing adapter identity, revision, provenance, evaluation, and serving residency. Together, the results suggest that PEFT can be a compact substrate for persistent personal models rather than only a budget substitute for full fine-tuning.
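The split between shared competence and instance-specific state can be illustrated with a minimal sketch. It assumes a LoRA-style low-rank adapter, which is one common PEFT method; the abstract does not say which adapter form the paper uses. The class names (SharedLayer, LowRankAdapter), the user keys, and the layer sizes are all hypothetical and chosen only for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LowRankAdapter:
    """Hypothetical per-user state: a rank-r update delta_W = B @ A (LoRA-style assumption)."""

    def __init__(self, d_out, d_in, rank, seed):
        rng = random.Random(seed)
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        # B starts at zero, so a fresh adapter leaves the base model's output unchanged.
        self.B = [[0.0] * rank for _ in range(d_out)]

    def num_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def delta(self, x):
        # Apply A first, then B, so the full d_out x d_in update is never materialized.
        return matvec(self.B, matvec(self.A, x))


class SharedLayer:
    """Frozen base weights shared by every adapted instance."""

    def __init__(self, d_out, d_in, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(d_out)]

    def num_params(self):
        return len(self.W) * len(self.W[0])

    def forward(self, x, adapter=None):
        y = matvec(self.W, x)
        if adapter is not None:
            y = [base + extra for base, extra in zip(y, adapter.delta(x))]
        return y


# One shared layer, many small adapters keyed by a hypothetical user id.
base = SharedLayer(d_out=64, d_in=64)
adapters = {
    user: LowRankAdapter(64, 64, rank=2, seed=i)
    for i, user in enumerate(["user_a", "user_b", "user_c"])
}

x = [1.0] * 64
y_shared = base.forward(x)
y_user_a = base.forward(x, adapters["user_a"])

print("base params:", base.num_params())
print("params per adapter:", adapters["user_a"].num_params())
```

In this toy setting each adapter stores 2 * 64 + 64 * 2 values next to a 64 * 64 base layer, so the per-user cost is a small fraction of the shared cost, and adding a user adds only one more adapter. The sizes are illustrative and say nothing about the configurations studied in the paper.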

Problem

Research questions and friction points this paper is trying to address.

Parameter-efficient fine-tuning

Personal models

Adapter scaling

Foundation models

Persistent adaptation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-Efficient Fine-Tuning

Personalized Models

Adapter Scaling