🤖 AI Summary
This work addresses the challenge of jointly optimizing response naturalness and persona consistency in personalized dialogue generation. We propose MUDI (Multiple Discourse Relations Graph Learning), a graph learning framework with three components: (1) We construct a structured dialogue graph that incorporates user persona descriptions and explicitly models the discourse relations between utterances; (2) We design a Dialogue Graph Attention Network (DialogueGAT) encoder that captures these discourse relations together with the persona descriptions; (3) We use a large language model to assist in discourse-relation annotation and graph construction, and integrate coherence-aware attention into decoding. Experiments demonstrate significant improvements in the quality of personalized responses, including naturalness, persona consistency, and discourse coherence, yielding outputs closer to human dialogue. The core contribution is the integration of multiple-discourse-relation modeling with a persona-aware graph encoder, enabling joint representation of dialogue structure and individual characteristics.
📝 Abstract
In dialogue generation, the naturalness of responses is crucial for effective human-machine interaction. Personalized response generation poses even greater challenges, as the responses must remain coherent and consistent with the user's personal traits or persona descriptions. We propose MUDI ($\textbf{Mu}$ltiple $\textbf{Di}$scourse Relations Graph Learning) for personalized dialogue generation. We use a Large Language Model to assist in annotating discourse relations and to transform dialogue data into structured dialogue graphs. Our graph encoder, the proposed DialogueGAT model, then captures implicit discourse relations within this structure, along with persona descriptions. During the personalized response generation phase, novel coherence-aware attention strategies strengthen the decoder's consideration of discourse relations. Our experiments demonstrate significant improvements in the quality of personalized responses, which more closely resemble human dialogue exchanges.
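To make the "structured dialogue graph" concrete, here is a minimal Python sketch of one way such a graph could be represented: utterances as nodes, labelled discourse relations as directed edges, and persona descriptions stored alongside. The class name `DialogueGraph`, its methods, and the relation labels are all hypothetical choices for illustration; the abstract does not specify the graph schema or the relation inventory, and this is not the authors' implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DialogueGraph:
    """Utterances as nodes, labelled discourse relations as directed edges."""

    utterances: list[str]
    persona: list[str] = field(default_factory=list)
    # Each edge is (source utterance index, target utterance index, relation label).
    edges: list[tuple[int, int, str]] = field(default_factory=list)

    def add_relation(self, src: int, dst: int, relation: str) -> None:
        # Reject edges whose endpoints are not utterances in this dialogue.
        n = len(self.utterances)
        if not (0 <= src < n and 0 <= dst < n):
            raise IndexError("edge endpoint is not an utterance index")
        self.edges.append((src, dst, relation))

    def neighbours_by_relation(self, node: int) -> dict[str, list[int]]:
        """Group the incoming neighbours of `node` by relation label."""
        grouped = defaultdict(list)
        for src, dst, relation in self.edges:
            if dst == node:
                grouped[relation].append(src)
        return dict(grouped)


graph = DialogueGraph(
    utterances=["Do you have any pets?", "Yes, two dogs.", "Nice, what breed?"],
    persona=["I love dogs."],
)
# Hypothetical labels: the source does not list its relation inventory.
graph.add_relation(0, 1, "question-answer")
graph.add_relation(1, 2, "follow-up-question")
graph.add_relation(0, 2, "continuation")
```

Grouping neighbours by relation label is the kind of lookup a relation-aware graph attention encoder would need when it weights incoming messages per relation type; how DialogueGAT actually does this is not described in the abstract.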