🤖 AI Summary
Contemporary conversational AI systems do not explicitly model differences between individual users, which limits their ability to adapt content and style dynamically. To address this, we propose a reinforcement learning–based framework for personalized dialogue management. First, we use GPT to generate diverse synthetic personas for policy pre-training. Second, we build an online user profile that fuses implicit behavioral signals (typing speed, dwell time, and affective responses) with explicit feedback, and optimize the dialogue policy in real time via multi-step reinforcement learning. Third, we introduce a fine-grained dynamic policy model that adapts along two dimensions, style and content. Evaluated on 50 synthetic personas, the system improves user satisfaction (+28.6%), personalization accuracy (+31.4%), and task completion rate (+24.9%), with large effect sizes and high statistical significance (p < 0.001).
📝 Abstract
Current conversational AI systems often provide generic, one-size-fits-all interactions that overlook individual user characteristics and lack adaptive dialogue management. To address this gap, we introduce HumAIne-chatbot, an AI-driven conversational agent that personalizes responses through a novel user profiling framework. The system is pre-trained on a diverse set of GPT-generated virtual personas to establish a broad prior over user types. During live interactions, an online reinforcement learning agent refines per-user models by combining implicit signals (e.g., typing speed, sentiment, engagement duration) with explicit feedback (e.g., likes and dislikes). This profile dynamically informs the chatbot's dialogue policy, enabling real-time adaptation of both content and style. To evaluate the system, we ran controlled experiments with 50 synthetic personas across multiple conversation domains. The results showed consistent improvements in user satisfaction, personalization accuracy, and task achievement when personalization features were enabled. Statistical analysis confirmed significant differences between the personalized and non-personalized conditions, with large effect sizes across key metrics. These findings highlight the effectiveness of AI-driven user profiling and provide a strong foundation for future real-world validation.
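
To make the profiling loop described above more concrete, the following is a minimal sketch, not the authors' implementation. It swaps in a simple epsilon-greedy bandit for the paper's reinforcement learning agent, and every name in it (`STYLES`, `fused_reward`, `UserProfile`, the 0.6 weighting, the 60-second engagement cap) is a hypothetical choice made for illustration. It uses only sentiment and engagement duration as implicit signals; typing speed could be folded in the same way.

```python
import random
from dataclasses import dataclass, field

# Hypothetical style options; the source does not enumerate its style space.
STYLES = ("concise", "detailed", "casual")


def fused_reward(sentiment, engagement_s, liked=None, w_explicit=0.6):
    """Blend implicit signals with optional explicit feedback into [0, 1].

    Assumptions: sentiment lies in [-1, 1]; engagement is capped at 60 s;
    w_explicit is an arbitrary illustrative weight.
    """
    implicit = 0.5 * ((sentiment + 1) / 2) + 0.5 * min(engagement_s, 60.0) / 60.0
    if liked is None:  # no like/dislike was given on this turn
        return implicit
    explicit = 1.0 if liked else 0.0
    return w_explicit * explicit + (1 - w_explicit) * implicit


@dataclass
class UserProfile:
    """Per-user running value estimate for each response style."""
    # 0.5 is a neutral starting value; a persona-based pre-training step
    # could supply better initial estimates (assumption).
    values: dict = field(default_factory=lambda: {s: 0.5 for s in STYLES})
    counts: dict = field(default_factory=lambda: {s: 0 for s in STYLES})

    def choose_style(self, epsilon=0.1, rng=random):
        # Explore with probability epsilon, otherwise pick the best-valued style.
        if rng.random() < epsilon:
            return rng.choice(STYLES)
        return max(STYLES, key=lambda s: self.values[s])

    def update(self, style, reward):
        # Incremental mean of the rewards observed for this style.
        self.counts[style] += 1
        self.values[style] += (reward - self.values[style]) / self.counts[style]


if __name__ == "__main__":
    profile = UserProfile()
    style = profile.choose_style()
    # Simulated turn: mildly positive sentiment, 45 s engagement, explicit like.
    profile.update(style, fused_reward(0.4, 45.0, liked=True))
    print(style, profile.values)
```

A per-style running mean is about the simplest per-user model that still adapts online. A system like the one described would presumably condition on richer state, such as conversation domain and dialogue history, and adapt content as well as style.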