AI Summary
This work addresses the problem that large language models often fail to adhere strictly to a specified knowledge cutoff date, particularly when a question implicitly relies on post-cutoff information. To mitigate this, the authors propose two recall-based prompting strategies: Self-Recall (SR), which guides the model to explicitly restate its temporal constraint, and Question-Recall (QR), which guides it to retrieve relevant pre-cutoff knowledge. Presented as the first study to use recall mechanisms for enforcing temporal boundaries, the paper also introduces MHEB, a multi-year cutoff evaluation benchmark, and reports that the proposed methods outperform baselines on three existing benchmarks and on MHEB. The combination of SR and QR consistently outperforms current baselines across cutoff years, improving the model's compliance with its knowledge cutoff.
Abstract
Prompted knowledge cutoff instructs a large language model (LLM) to act as if information beyond a specified cutoff date were unavailable. However, prior work mainly relies on direct-answer generation, which struggles when post-cutoff knowledge is not explicitly queried but is only causally related to the question. To address this limitation, we propose two recall-based prompting strategies: Self-Recall (SR), which asks the model to restate its cutoff constraint, and Question-Recall (QR), which requires the model to recall question-relevant information valid under the cutoff. Across three existing benchmarks, our methods outperform both direct-answer prompting and conventional step-by-step reasoning baselines, with particularly strong improvements on counterfactual questions. To investigate robustness across different cutoff settings, we further construct the Multi-cutoff Historical Event Benchmark (MHEB), which evaluates the same question under multiple cutoff years. Results show that knowledge cutoff performance varies with cutoff distance, while combining SR and QR consistently yields the best performance.
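As a rough illustration of how SR and QR might be layered onto a cutoff instruction, the sketch below assembles a prompt in Python. The function names `build_prompt` and `multi_cutoff_prompts`, and all of the prompt wording, are hypothetical and are not taken from the paper; only the overall idea (restate the cutoff, recall question-relevant information valid under the cutoff, then answer) follows the abstract.

```python
from typing import Dict, Iterable


def build_prompt(
    question: str,
    cutoff_year: int,
    self_recall: bool = True,
    question_recall: bool = True,
) -> str:
    """Assemble a knowledge-cutoff prompt.

    All instruction wording here is a hypothetical stand-in,
    not the prompts used by the authors.
    """
    # Base cutoff instruction shared by every variant.
    parts = [f"Act as if you have no knowledge of anything after {cutoff_year}."]
    if self_recall:
        # Self-Recall (SR): the model restates its cutoff constraint.
        parts.append("First, restate your knowledge cutoff in your own words.")
    if question_recall:
        # Question-Recall (QR): the model recalls question-relevant
        # information that is valid under the cutoff.
        parts.append(
            f"Next, recall facts relevant to the question that were valid as of {cutoff_year}."
        )
    parts.append("Then answer the question, respecting the cutoff.")
    parts.append(f"Question: {question}")
    return "\n".join(parts)


def multi_cutoff_prompts(question: str, cutoff_years: Iterable[int]) -> Dict[int, str]:
    """Pose the same question under several cutoff years, in the spirit of MHEB."""
    return {year: build_prompt(question, year) for year in cutoff_years}
```

Setting both flags to `False` yields a plain direct-answer prompt, which gives a simple way to compare the variants on the same question and cutoff year.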