🤖 AI Summary
Existing dialogue response generation research focuses on *what* to generate, neglecting the critical temporal decision of *when* to respond. This paper formally introduces "timely dialogue response generation" as a novel task for open-domain conversational agents. Methodologically, the authors construct TimelyChat, the first temporally enhanced evaluation benchmark for this task; build a 55K event-driven dialogue training dataset by mining unlabeled event knowledge from a temporal commonsense knowledge graph and synthesizing dialogues with a large language model; and design Timer, an end-to-end agent that jointly models response timing and response content, proactively predicting time intervals and generating responses aligned with them. Experiments demonstrate that Timer significantly outperforms prompt-based LLMs and diverse fine-tuned baselines in both turn-level and dialogue-level evaluations. All data, models, and code are publicly released.
📝 Abstract
While research on dialogue response generation has primarily focused on generating coherent responses conditioned on textual context, the critical question of when to respond, grounded in temporal context, remains underexplored. To bridge this gap, we propose a novel task called timely dialogue response generation and introduce the TimelyChat benchmark, which evaluates the capabilities of language models to predict appropriate time intervals and generate time-conditioned responses. Additionally, we construct a large-scale training dataset by leveraging unlabeled event knowledge from a temporal commonsense knowledge graph and employing a large language model (LLM) to synthesize 55K event-driven dialogues. We then train Timer, a dialogue agent designed to proactively predict time intervals and generate timely responses that align with those intervals. Experimental results show that Timer outperforms prompting-based LLMs and other fine-tuned baselines in both turn-level and dialogue-level evaluations. We publicly release our data, model, and code.
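The abstract describes a two-step inference paradigm: first predict *when* to respond (a time interval), then generate a response conditioned on that interval. The minimal sketch below illustrates this control flow only; the function names, the stubbed outputs, and the prompt-free interface are illustrative assumptions, not the paper's actual Timer implementation, which would replace both stubs with a trained model.

```python
def predict_interval(context: str) -> str:
    """Hypothetical stub for Timer's interval prediction step.

    A trained model would read the dialogue context and emit an
    appropriate time interval (e.g. "30 minutes"); here a fixed value
    is returned so the sketch runs without model weights.
    """
    return "30 minutes"


def generate_response(context: str, interval: str) -> str:
    """Hypothetical stub for time-conditioned response generation.

    The real model conditions on both the dialogue context and the
    predicted interval so the response reflects the elapsed time.
    """
    return f"It's been {interval} — the pasta should be ready by now!"


def timely_reply(context: str) -> tuple[str, str]:
    # Step 1: decide *when* to respond (time interval prediction).
    interval = predict_interval(context)
    # Step 2: decide *what* to say, conditioned on that interval.
    response = generate_response(context, interval)
    return interval, response


interval, response = timely_reply("User: I just put the pasta on the stove.")
```

The design choice worth noting is the ordering: the interval is predicted before generation so the response content can depend on it, rather than generating a response and attaching a delay afterwards.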