$Ψ$-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Most personalized language agents only react to the preferences users express, and their ability to proactively influence users has not been systematically evaluated. To bridge this gap, the authors introduce Ψ-Bench, a benchmark designed to assess proactive personalized influence in large language models. Ψ-Bench features three realistic persuasion scenarios and simulated clients whose user profiles are derived from dialogue histories. An evaluation of ten frontier models shows that access to user profiles improves persuasion performance by an average of 18.24%, underscoring the value of user-specific information, while even state-of-the-art models leave considerable room for improvement in persuasion.

📝 Abstract

Personalization is a crucial capability of modern language agents. However, current research primarily positions personalized agents as passive responders to user preferences, limiting their ability to interact with users and provide suggestions or guidance proactively. To systematically evaluate such proactive personalization in realistic interactions, we propose $Ψ$-Bench, a benchmark for assessing LLMs' ability to influence realistic users through conversation. We design three real-world interaction scenarios that involve persuasion in $Ψ$-Bench, and endow simulated clients with personal characteristics through explicit user profiles derived from dialogue histories. We evaluate 10 frontier LLMs on $Ψ$-Bench and find that while most models can produce coherent and reasonable arguments, even state-of-the-art models still leave considerable room for improvement in persuasion. We also find that providing access to client profiles yields an average performance gain of 18.24%, highlighting the importance of user-specific information for effective persuasion. Overall, our work highlights persona-sensitive influencing as a challenging yet practical direction for evaluating and developing more proactive personalized LLM agents. Code is available at: https://github.com/Hanpx20/Psi-Bench.
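The abstract reports an average performance gain from giving models access to client profiles, but it does not say whether that gain is relative or absolute, nor how it is aggregated across models. As a minimal sketch, assuming a relative gain computed per model and averaged with equal weight, the calculation could look like the following. The function names and the scores are hypothetical placeholders for illustration, not values or code from the paper.

```python
from statistics import mean


def relative_gain(baseline: float, with_profile: float) -> float:
    """Percentage change of the with-profile score over the no-profile score."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return (with_profile - baseline) * 100.0 / baseline


def average_gain(scores: dict[str, tuple[float, float]]) -> float:
    """Equal-weight mean of per-model relative gains (an assumed aggregation)."""
    return mean(relative_gain(base, prof) for base, prof in scores.values())


# Placeholder (no-profile, with-profile) scores; NOT results from the paper.
example_scores = {
    "model_a": (50.0, 60.0),
    "model_b": (40.0, 44.0),
}

print(f"average relative gain: {average_gain(example_scores):.2f}%")
```

Averaging per-model gains weights every model equally; computing one gain from pooled scores would instead favor models with larger baselines, so the two conventions can give different headline numbers.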

Problem

Research questions and friction points this paper is trying to address.

personalization

persuasive dialogue

persona-sensitive influencing

proactive interaction

LLM evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

persona-sensitive influencing

proactive personalization

persuasive dialogue

user profile

LLM evaluation benchmark

🔎 Similar Papers

How persuadee's psychological states and traits shape digital persuasion: Lessons learnt from mobile burglary prevention encounters

2024-09-14 · Citations: 0

Measuring and Benchmarking Large Language Models' Capabilities to Generate Persuasive Language

2024-06-25 · arXiv.org · Citations: 3