🤖 AI Summary
Existing FOLTR benchmarks predominantly rely on static datasets with random splits and synchronous assumptions, failing to capture the dynamic and asynchronous nature of real-world search scenarios. To address this, we introduce AOL4FOLTR, the first large-scale, realistic search dataset tailored for Federated Online Learning to Rank (FOLTR). Built from 2.6 million queries and click logs of 10,000 users, it preserves timestamps and user identifiers, enabling fine-grained user partitioning and asynchronous federated training. We further propose an end-to-end FOLTR framework integrating realistic log preprocessing, sequential user behavior modeling, and privacy-preserving distributed model aggregation. Our approach jointly ensures modeling fidelity and data privacy. AOL4FOLTR and the accompanying framework significantly enhance experimental realism and reproducibility in FOLTR research, establishing critical infrastructure for privacy-aware search ranking.
📝 Abstract
The centralized collection of search interaction logs for training ranking models raises significant privacy concerns. Federated Online Learning to Rank (FOLTR) offers a privacy-preserving alternative by enabling collaborative model training without sharing raw user data. However, benchmarks in FOLTR are largely based on random partitioning of classical learning-to-rank datasets, simulated user clicks, and the assumption of synchronous client participation. This oversimplifies real-world dynamics and undermines the realism of experimental results. We present AOL4FOLTR, a large-scale web search dataset with 2.6 million queries from 10,000 users. Our dataset addresses key limitations of existing benchmarks by including user identifiers, real click data, and query timestamps, enabling realistic user partitioning, behavior modeling, and asynchronous federated learning scenarios.
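To make the partitioning idea concrete, the sketch below shows how a click log carrying user identifiers and timestamps can be split into per-user streams (one federated client per user) and merged into a global, timestamp-ordered arrival schedule for simulating asynchronous participation. This is a minimal illustration, not the paper's implementation; the record layout (`user_id`, `timestamp`, `query`, clicked document ids) is a hypothetical schema, not the actual AOL4FOLTR format.

```python
from collections import defaultdict

# Illustrative log records: (user_id, timestamp, query, clicked_doc_ids).
# Field layout is an assumption for this sketch, not the dataset's schema.
logs = [
    ("u1", 5, "weather", [2]),
    ("u2", 1, "news", [0]),
    ("u1", 3, "maps", [1]),
    ("u2", 9, "sports", []),
]

def partition_by_user(records):
    """Group records per user and sort each user's stream by timestamp,
    so each federated client replays its own queries in order."""
    clients = defaultdict(list)
    for rec in records:
        clients[rec[0]].append(rec)
    for uid in clients:
        clients[uid].sort(key=lambda r: r[1])
    return dict(clients)

def async_schedule(clients):
    """Merge all client streams into one global timeline: a client
    contributes an update whenever its next query fires, rather than
    in synchronized rounds."""
    events = [rec for recs in clients.values() for rec in recs]
    events.sort(key=lambda r: r[1])
    return [(r[1], r[0]) for r in events]  # (timestamp, user_id) arrivals
```

Under this scheme, random splits are replaced by identity-based partitioning, and the merged timeline drives when each client's local update reaches the aggregator.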