PAD: Personalized Alignment of LLMs at Decoding-Time

📅 2024-10-05
📈 Citations: 4
Influential: 0
🤖 AI Summary
Existing alignment methods for large language models (LLMs) struggle to accommodate users' diverse cultural, educational, and political preferences during inference, and typically require costly post-hoc training. Method: We propose PAD (Personalized Alignment at Decoding), a training-free, decoding-time personalization framework. PAD introduces the first token-level, zero-shot, cross-model generalizable personalized reward modeling technique, which decouples preference representation from text generation, and integrates it with constrained beam search and logit correction for reward-guided decoding. It further incorporates multi-preference prompt engineering and a plug-and-play architecture. Results: PAD outperforms supervised fine-tuning (SFT) and PPO-based alignment methods across multiple preference alignment benchmarks, generalizes well to unseen preferences, and is compatible with mainstream open-source base models including Llama, Qwen, and Phi.
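The logit-correction idea in the summary can be illustrated with a minimal sketch: at each decoding step, token-level scores from a personalized reward model are added (scaled by a strength coefficient) to the base model's next-token logits before sampling. The function names, the `beta` coefficient, and the toy values below are illustrative assumptions, not PAD's actual implementation.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reward_guided_probs(base_logits, token_rewards, beta=1.0):
    """Sketch of reward-guided decoding via logit correction.

    base_logits: next-token logits from the frozen base LLM.
    token_rewards: token-level scores from a personalized reward
        model (hypothetical interface; PAD's actual reward head
        may differ).
    beta: assumed strength of the personalization signal
        (beta=0 recovers the base model's distribution).
    """
    corrected = [l + beta * r for l, r in zip(base_logits, token_rewards)]
    return softmax(corrected)

# Toy example: token 1 is favored by the personalized reward model,
# so its probability rises relative to the unguided distribution.
base = reward_guided_probs([2.0, 1.0, 0.5], [0.0, 3.0, 0.0], beta=0.0)
guided = reward_guided_probs([2.0, 1.0, 0.5], [0.0, 3.0, 0.0], beta=1.0)
```

Because the base model is never updated, the same correction can in principle be plugged into any model that exposes its logits, which matches the plug-and-play framing above.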

📝 Abstract
Aligning with personalized preferences, which vary significantly across cultural, educational, and political differences, poses a significant challenge due to the computational costs and data demands of traditional alignment methods. In response, this paper presents Personalized Alignment at Decoding-time (PAD), a novel framework designed to align LLM outputs with diverse personalized preferences during the inference phase, eliminating the need for additional training. By introducing a unique personalized reward modeling strategy, this framework decouples the text generation process from personalized preferences, facilitating the generation of generalizable token-level personalized rewards. The PAD algorithm leverages these rewards to guide the decoding process, dynamically tailoring the base model's predictions to personalized preferences. Extensive experimental results demonstrate that PAD not only outperforms existing training-based alignment methods in terms of aligning with diverse preferences but also shows significant generalizability to preferences unseen during training and scalability across different base models. This work advances the capability of LLMs to meet user needs in real-time applications, presenting a substantial step forward in personalized LLM alignment.
Problem

Research questions and friction points this paper is trying to address.

How to align LLM outputs with diverse personalized preferences at inference time.
How to eliminate the need for additional training by using personalized reward modeling.
How to achieve generalizability to unseen preferences and scalability across different base models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Personalized Alignment at Decoding-time (PAD) framework
Decouples text generation from personalized preferences
Dynamic decoding guided by personalized rewards