🤖 AI Summary
This work addresses a gap in scientific paper recommendation: interest drift and feedback accumulation are insufficiently modeled in dynamic, day-to-day reading scenarios. It proposes PaperFlow, a framework that models each user's daily paper stream through three tightly coupled stages: Profiling, Recommending, and Adapting. The core contributions are the first longitudinal paper recommendation benchmark at user-day granularity; an interpretable scholarly profiling method that integrates heterogeneous cold-start evidence; a multi-signal aggregation mechanism that ranks each day's stream under a fixed display budget; and a feedback-driven model of interest evolution that distinguishes semantically different feedback signals. On a benchmark comprising 24 simulated users and 1,200 user-day episodes, PaperFlow outperforms five baselines, achieving the strongest oracle-based ranking, the highest behavioral alignment with simulated reading selections, and the best score in blind expert evaluation.
📝 Abstract
Scientific paper recommendation is typically evaluated as static ranking over a fixed candidate set, yet real scientific reading unfolds as a daily, longitudinal process in which interests shift and feedback accumulates. We introduce PaperFlow, a framework that organizes this process into three coupled stages: Profiling, which constructs and maintains a structured, inspectable scholarly profile from heterogeneous cold-start evidence; Recommending, which ranks each date-specific paper stream through multi-signal aggregation under a fixed display budget; and Adapting, which updates user state from semantically distinct feedback signals and models interest drift across days. We further define a longitudinal user-day benchmark that fixes users, dates, candidate pools, visible inputs, and hidden simulated relevance labels under a shared temporal information boundary. The benchmark contains 24 simulated research users, 50 daily paper streams, 1,200 user-day episodes (24 users × 50 streams), 20,727 unique papers, and 497,448 episode-paper records. We additionally specify a blind human-evaluation protocol to validate alignment between automatic metrics and expert judgments. Experiments against five scientific recommendation baselines show that PaperFlow achieves the strongest oracle-based ranking, the highest behavioral alignment with simulated reading selections, and the best blind human-evaluation score.
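To make the three-stage loop concrete, the following is a minimal Python sketch of one user-day episode. It is not PaperFlow's implementation: the abstract does not specify the profile structure, the signals being aggregated, or the update rule, so everything here is an illustrative assumption, including the topic-weight profile, the two signals (topic match and recency), the aggregation weights, the feedback types `save`/`click`/`skip`, and the decay factor.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Profile:
    # Hypothetical stand-in for the structured scholarly profile:
    # a mapping from topic name to interest weight.
    interests: Dict[str, float] = field(default_factory=dict)


def build_profile(seed_topics: List[str]) -> Profile:
    """Profiling: initialise a profile from cold-start evidence (here, seed topics only)."""
    return Profile({topic: 1.0 for topic in seed_topics})


def score(profile: Profile, paper: dict, w_topic: float = 0.7, w_recency: float = 0.3) -> float:
    """Aggregate two assumed signals into one ranking score."""
    topic_match = sum(profile.interests.get(t, 0.0) for t in paper["topics"])
    return w_topic * topic_match + w_recency * paper["recency"]


def recommend(profile: Profile, stream: List[dict], budget: int) -> List[dict]:
    """Recommending: rank the day's stream and keep only `budget` papers."""
    ranked = sorted(stream, key=lambda p: score(profile, p), reverse=True)
    return ranked[:budget]


# Assumed feedback types, each with its own update strength.
FEEDBACK_DELTA = {"save": 0.5, "click": 0.2, "skip": -0.1}


def adapt(profile: Profile, feedback: List[Tuple[dict, str]], decay: float = 0.95) -> None:
    """Adapting: decay existing interests, then apply per-signal updates."""
    for topic in profile.interests:
        profile.interests[topic] *= decay
    for paper, kind in feedback:
        for topic in paper["topics"]:
            profile.interests[topic] = profile.interests.get(topic, 0.0) + FEEDBACK_DELTA[kind]


# One simulated user-day episode with toy data.
profile = build_profile(["retrieval"])
day1 = [
    {"id": "p1", "topics": ["retrieval"], "recency": 0.5},
    {"id": "p2", "topics": ["vision"], "recency": 1.0},
    {"id": "p3", "topics": ["retrieval", "agents"], "recency": 0.2},
]
shown = recommend(profile, day1, budget=2)
adapt(profile, [(shown[1], "save")])  # the user saves the second displayed paper
```

In this toy episode the display budget of 2 drops the off-topic paper, and saving `p3` introduces a new `agents` interest while the existing `retrieval` weight is first decayed and then reinforced. Keeping the feedback strengths in a separate table is one simple way to treat feedback signals as semantically distinct; the actual mechanism in PaperFlow may differ.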