Beyond Retrieval: Learning Compact User Representations for Scalable LLM Personalization

📅 2026-06-03

🤖 AI Summary
Personalizing large language models (LLMs) is often hindered by heavy reliance on retrieval, sensitivity to prompt design, or the storage overhead of user-specific adapters, which makes it difficult to balance effectiveness and scalability. This work proposes TAP-PER, a framework that adapts user modeling techniques from recommender systems to LLMs by encoding preferences in lightweight, learnable user-state prefix embeddings, removing the need for explicit prompt engineering or heavyweight adapters. A temporal attention mechanism captures evolving interests. TAP-PER consistently outperforms existing prompt-level and model-level baselines across six LaMP tasks. At a scale of 1,000 users, it uses 130× fewer per-user parameters than OPPU and requires only about half the total parameters of PER-PCS.
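To make the per-user scaling claim concrete, here is a small arithmetic sketch in Python. Only the 130× ratio and the 1,000-user scale come from the summary; the adapter size below is a hypothetical placeholder, not a figure reported for OPPU, and shared (non-per-user) parameters are left out.

```python
# Only REDUCTION and USERS come from the summary above.
USERS = 1_000
REDUCTION = 130

# Hypothetical placeholder, chosen only so the division is exact.
# It is NOT a reported size for OPPU or any other method.
HYPOTHETICAL_ADAPTER_PARAMS_PER_USER = 1_300_000


def per_user_storage(n_users, params_per_user):
    """Total per-user parameters stored; grows linearly with the user count."""
    return n_users * params_per_user


prefix_params_per_user = HYPOTHETICAL_ADAPTER_PARAMS_PER_USER // REDUCTION

adapter_total = per_user_storage(USERS, HYPOTHETICAL_ADAPTER_PARAMS_PER_USER)
prefix_total = per_user_storage(USERS, prefix_params_per_user)

print(adapter_total, prefix_total)  # 1300000000 10000000
```

Whatever the real per-user sizes are, both totals grow linearly with the number of users, so the 130× ratio carries over unchanged to the per-user storage at any population size.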
📝 Abstract
Personalizing large language models requires adapting model behavior to individual users while preserving robustness and deployment-scale efficiency. Existing approaches typically personalize LLMs either at the input level, by retrieving user histories or constructing profile prompts, or at the parameter level, by maintaining user-specific parameter-efficient modules. The former makes personalization sensitive to retrieval quality and prompt design, whereas the latter incurs storage and maintenance costs that grow with the user population. To address these limitations, we propose TAP-PER (Temporal Attentive Prefix for PERsonalization), a prefix-based framework that encodes user preferences as learnable representations, eliminating explicit prompt construction and replacing heavy per-user adapters with lightweight user-state prefix embeddings. Inspired by personalized recommendation systems, TAP-PER decomposes user modeling into user-state and query-conditioned components, and incorporates temporal signals to capture the evolving nature of user interests. Experiments on six LaMP tasks show that TAP-PER consistently outperforms prompt-based and model-based baselines across classification, rating, and generation settings. Moreover, TAP-PER uses 130x fewer per-user parameters than OPPU and roughly half the total parameter footprint of PER-PCS at the 1,000-user scale, demonstrating that scalable LLM personalization can be achieved without explicit prompt construction or heavy per-user adapters.
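The abstract does not spell out the architecture, so the following is only a minimal Python sketch of one possible reading of "user-state and query-conditioned components" with temporal signals. The function `build_prefix`, the linear age penalty, the `decay` parameter, and all shapes are assumptions made for illustration, not the authors' method.

```python
import math
import random


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_prefix(user_state, history, ages, query, decay=0.1):
    """Return (prefix, weights).

    user_state: P vectors of size d, the per-user learnable prefix (assumed).
    history:    n vectors of size d, embeddings of the user's past items.
    ages:       n floats, time elapsed since each item (arbitrary units).
    query:      vector of size d, embedding of the current query.
    """
    d = len(query)
    # Scaled dot-product relevance to the query, minus an age penalty so
    # that recent items receive more attention (assumed temporal signal).
    scores = [dot(h, query) / math.sqrt(d) - decay * age
              for h, age in zip(history, ages)]
    weights = softmax(scores)
    # Query-conditioned component: attention-weighted sum of the history.
    summary = [sum(w * h[j] for w, h in zip(weights, history))
               for j in range(d)]
    return user_state + [summary], weights


random.seed(0)
P, n, d = 4, 6, 8


def rand_vec():
    return [random.gauss(0, 1) for _ in range(d)]


user_state = [rand_vec() for _ in range(P)]
history = [rand_vec() for _ in range(n)]
ages = [float(n - 1 - i) for i in range(n)]  # last item is the most recent
query = rand_vec()

prefix, weights = build_prefix(user_state, history, ages, query)
token_embeddings = [rand_vec() for _ in range(5)]  # stand-in for input tokens
model_input = prefix + token_embeddings  # prefix is prepended to the input
print(len(prefix), len(model_input))  # 5 10
```

In this sketch the only per-user storage is `user_state` (P × d numbers), which is the property the abstract contrasts with maintaining a separate parameter-efficient module per user. In a real system the user-state vectors would be trained by gradient descent in a deep learning framework rather than sampled at random.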
Problem

Research questions and friction points this paper is trying to address.

LLM personalization
user representation
scalability
parameter efficiency
retrieval-based personalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

compact user representation
prefix-based personalization
temporal modeling
parameter efficiency
scalable LLM personalization