AI Summary
This work addresses a critical limitation in current large language model (LLM) alignment methods, which rely on aggregated human preferences to produce a single reward signal, thereby aligning models to an artificial “average user” while neglecting individual differences. To overcome this, the paper proposes replacing aggregated preferences with personalized ones as the alignment objective. Drawing on social choice theory and empirical analysis, the authors develop a bounded personalization framework that accommodates individual values and context-dependent preferences within universal safety constraints. The study systematically integrates preference modeling, personalized alignment algorithms, and safety mechanisms, revealing the information loss inherent in preference aggregation and demonstrating both the feasibility and necessity of personalized alignment. It further outlines a technical and policy roadmap that balances individual autonomy, collective safety, and ethical scalability.
Abstract
Current approaches to aligning large language models (LLMs) aggregate diverse human preferences into a single reward signal, effectively optimizing for a hypothetical “average user” who represents no real person particularly well. This position paper argues that LLMs should learn personalized, individual preferences rather than aggregated ones. We show that aggregation masks critical information about preference diversity, individual values, and contextual dependencies, which is a limitation both theoretically grounded in social choice theory and empirically evident across demographic groups. We analyze the rich structure that human preferences encode, survey technical approaches to personalization, and systematically address counterarguments on scalability, shared standards, and manipulation risk. While personalization introduces genuine safety challenges including filter bubbles, value lock-in, and psychological manipulation, we argue these are manageable through bounded personalization frameworks that preserve universal safety constraints while accommodating legitimate individual variation. We conclude with a concrete research and policy agenda for developing preference-aware models that respect both individual autonomy and collective safety.