GPU-accelerated Multi-relational Parallel Graph Retrieval for Web-scale Recommendations

📅 2025-02-17
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
To address fragmented modeling of user–item relationships and inefficient graph-based retrieval under high GPU concurrency in Baidu’s billion-scale web recommendation system, this paper proposes: (1) a multi-relational user–item relevance metric learning method that unifies heterogeneous interaction signals (including clicks, dwell time, and conversions) through multi-objective optimization and adds a self-covariant loss to improve pathfinding during graph search; and (2) a hierarchical parallel graph-based approximate nearest neighbor search (ANNS) algorithm that combines breadth-depth-balanced navigation with adaptive aggregation on GPUs to overcome suboptimal traversal and the GPU computational bottleneck. The method is deployed across more than twenty Baidu applications, serving hundreds of millions of users at a throughput exceeding 100 million requests per second. It reports significant improvements in Recall@10 and P99 latency.

📝 Abstract
Web recommendations provide personalized items from massive catalogs for users, which rely heavily on retrieval stages to trade off the effectiveness and efficiency of selecting a small relevant set from billion-scale candidates in online digital platforms. As one of the largest Chinese search engine and news feed providers, Baidu resorts to Deep Neural Network (DNN) and graph-based Approximate Nearest Neighbor Search (ANNS) algorithms for accurate relevance estimation and efficient search for relevant items. However, current retrieval at Baidu fails in comprehensive user-item relational understanding due to dissected interaction modeling, and performs inefficiently in large-scale graph-based ANNS because of suboptimal traversal navigation and the GPU computational bottleneck under high concurrency. To this end, we propose a GPU-accelerated Multi-relational Parallel Graph Retrieval (GMP-GR) framework to achieve effective yet efficient retrieval in web-scale recommendations. First, we propose a multi-relational user-item relevance metric learning method that unifies diverse user behaviors through multi-objective optimization and employs a self-covariant loss to enhance pathfinding performance. Second, we develop a hierarchical parallel graph-based ANNS to boost graph retrieval throughput, which conducts breadth-depth-balanced searches on a large-scale item graph and cost-effectively handles irregular neural computation via adaptive aggregation on GPUs. In addition, we integrate system optimization strategies in the deployment of GMP-GR in Baidu. Extensive experiments demonstrate the superiority of GMP-GR in retrieval accuracy and efficiency. Deployed across more than twenty applications at Baidu, GMP-GR serves hundreds of millions of users with a throughput exceeding one hundred million requests per second.
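
For readers unfamiliar with graph-based ANNS, the following is a minimal single-threaded sketch of the generic technique the abstract builds on: a best-first (beam) search over a proximity graph of items. It is not the GMP-GR algorithm. The function name `graph_search`, the `beam_width` parameter, and the inner-product scorer are illustrative assumptions, and the paper's breadth-depth-balanced navigation and GPU aggregation are not reproduced here.

```python
import heapq
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def inner_product(a: Vector, b: Vector) -> float:
    """Relevance score stand-in; a learned DNN scorer could take its place."""
    return sum(x * y for x, y in zip(a, b))


def graph_search(
    query: Vector,
    items: Dict[int, Vector],
    neighbors: Dict[int, List[int]],
    entry: int,
    k: int,
    beam_width: int,
    score: Callable[[Vector, Vector], float] = inner_product,
) -> List[Tuple[float, int]]:
    """Best-first search on an item graph (hypothetical helper, higher score = better)."""
    visited = {entry}
    entry_score = score(query, items[entry])
    frontier = [(-entry_score, entry)]  # max-heap via negated scores
    best = [(entry_score, entry)]       # min-heap holding the current beam

    while frontier:
        neg_score, node = heapq.heappop(frontier)
        # Stop once the best unexplored node cannot improve a full beam.
        if len(best) >= beam_width and -neg_score < best[0][0]:
            break
        for nb in neighbors.get(node, []):
            if nb in visited:
                continue
            visited.add(nb)
            s = score(query, items[nb])
            if len(best) < beam_width or s > best[0][0]:
                heapq.heappush(frontier, (-s, nb))
                heapq.heappush(best, (s, nb))
                if len(best) > beam_width:
                    heapq.heappop(best)  # evict the weakest beam entry

    return sorted(best, reverse=True)[:k]
```

A larger `beam_width` explores more of the graph per query, which tends to raise recall at the cost of more scoring calls; that is the effectiveness/efficiency trade-off the abstract refers to.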

Problem

Research questions and friction points this paper is trying to address.

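Dissected interaction modeling prevents comprehensive user-item relational understanding
Large-scale graph-based ANNS suffers from suboptimal traversal navigation and a GPU computational bottleneck under high concurrency
Retrieval must trade off effectiveness and efficiency when selecting a small relevant set from billion-scale candidates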

Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-accelerated parallel graph retrieval
Multi-relational relevance metric learning
Hierarchical parallel ANNS for large-scale graphs
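
As a rough illustration of the multi-relational, multi-objective idea listed above, the sketch below scores one user-item pair under several relations and sums the weighted per-relation losses. Everything here is an assumption made for illustration: the relation names, the per-relation diagonal metric, the binary cross-entropy objective, and the task weights are hypothetical, and the paper's self-covariant loss is not reproduced because this page does not define it.

```python
import math
from typing import Dict, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def relation_score(user: Sequence[float], item: Sequence[float],
                   metric: Sequence[float]) -> float:
    """Relation-specific relevance: inner product reweighted per dimension."""
    return sum(w * u * v for w, u, v in zip(metric, user, item))


def binary_cross_entropy(p: float, label: int, eps: float = 1e-12) -> float:
    return -(label * math.log(p + eps) + (1 - label) * math.log(1.0 - p + eps))


def multi_relational_loss(
    user: Sequence[float],
    item: Sequence[float],
    relation_metrics: Dict[str, Sequence[float]],  # hypothetical per-relation parameters
    labels: Dict[str, int],                        # observed 0/1 outcome per relation
    task_weights: Dict[str, float],                # hypothetical objective weights
) -> float:
    """Weighted sum of per-relation losses over shared user/item embeddings."""
    total = 0.0
    for relation, metric in relation_metrics.items():
        p = sigmoid(relation_score(user, item, metric))
        total += task_weights[relation] * binary_cross_entropy(p, labels[relation])
    return total
```

Because the user and item embeddings are shared across relations while only the small per-relation parameters differ, a single embedding space can serve every behavior type at retrieval time, which is one plausible reading of "unifying" diverse behaviors.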

Zhuoning Guo
Hong Kong University of Science and Technology
Information Retrieval, Multimodal, Graph Learning
Guangxing Chen
Baidu Inc.
Qian Gao
Baidu Inc.
Xiaochao Liao
Baidu Inc.
Jianjia Zheng
Baidu Inc.
Lu Shen
Baidu Inc.
Hao Liu
The Hong Kong University of Science and Technology