Towards Practical Large-scale Dynamical Heterogeneous Graph Embedding: Cold-start Resilient Recommendation

📅 2025-12-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address scalability, data freshness, and cold-start challenges in the industrial deployment of large-scale dynamic heterogeneous graph embeddings, this paper proposes a two-stage framework that combines static global modeling with real-time incremental updates. The method introduces HetSGFormer, a linearly scalable heterogeneous graph Transformer for global structural modeling, and Incremental Locally Linear Embedding (ILLE), a lightweight CPU-based algorithm that enables millisecond-level local updates on billion-node graphs. CPU-native incremental optimization and heterogeneous graph structure encoding further improve cold-start robustness. A/B testing on a billion-scale industrial graph shows that HetSGFormer lifts Advertiser Value by up to 6.11% over previous methods, ILLE adds a further 3.22% lift, and embedding refresh timeliness improves by 83.2%.

📝 Abstract
Deploying dynamic heterogeneous graph embeddings in production faces key challenges of scalability, data freshness, and cold-start. This paper introduces a practical, two-stage solution that balances deep graph representation with low-latency incremental updates. Our framework combines HetSGFormer, a scalable graph transformer for static learning, with Incremental Locally Linear Embedding (ILLE), a lightweight, CPU-based algorithm for real-time updates. HetSGFormer captures global structure with linear scalability, while ILLE provides rapid, targeted updates to incorporate new data, thus avoiding costly full retraining. This dual approach is cold-start resilient, leveraging the graph to create meaningful embeddings from sparse data. On billion-scale graphs, A/B tests show HetSGFormer achieved up to a 6.11% lift in Advertiser Value over previous methods, while the ILLE module added another 3.22% lift and improved embedding refresh timeliness by 83.2%. Our work provides a validated framework for deploying dynamic graph learning in production environments.
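The abstract does not spell out how ILLE computes an update, so the following is only a minimal sketch of the classical locally linear embedding out-of-sample step that the name suggests, not the authors' algorithm. It assumes a new node has a feature vector and a small set of already-embedded neighbors; the function names `lle_weights` and `embed_new_node` and the `reg` parameter are hypothetical. The new node's embedding is the same weighted combination of its neighbors' embeddings that best reconstructs its features, which requires only a small linear solve on CPU and no retraining.

```python
import numpy as np

def lle_weights(x_new, neighbor_feats, reg=1e-3):
    """Weights w (summing to 1) that minimize ||x_new - sum_j w_j * neighbor_j||^2.

    Assumption: this is the textbook LLE reconstruction step, used here
    for illustration only; the paper's ILLE may differ.
    """
    diffs = neighbor_feats - x_new                 # (k, d) offsets to each neighbor
    gram = diffs @ diffs.T                         # (k, k) local Gram matrix
    k = gram.shape[0]
    # Trace-scaled ridge term keeps the solve stable when neighbors are collinear.
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(k) / k
    w = np.linalg.solve(gram, np.ones(k))
    return w / w.sum()                             # enforce the sum-to-one constraint

def embed_new_node(x_new, neighbor_feats, neighbor_embs, reg=1e-3):
    """Place a new node in embedding space using only its local neighborhood."""
    w = lle_weights(x_new, neighbor_feats, reg)
    return w @ neighbor_embs                       # (embedding_dim,)

# Toy usage: a new node halfway between two neighbors in feature space.
neighbor_feats = np.array([[0.0, 0.0], [2.0, 0.0]])
neighbor_embs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
x_new = np.array([1.0, 0.0])
new_emb = embed_new_node(x_new, neighbor_feats, neighbor_embs)
```

Because the solve involves only a k-by-k matrix for k neighbors, the cost of one update is independent of the total graph size, which is the property that makes this kind of local update attractive for refreshing embeddings between full training runs.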
Problem

Research questions and friction points this paper is trying to address.

Develops cold-start resilient recommendation using dynamic heterogeneous graph embeddings
Addresses scalability and data freshness challenges in production graph learning systems
Balances deep graph representation with low-latency incremental updates for billion-scale graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage framework balances deep representation with incremental updates
HetSGFormer, a heterogeneous graph transformer, models global structure with linear scalability
ILLE algorithm enables real-time, CPU-based updates without full retraining
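The listing does not describe how HetSGFormer reaches linear scalability. As a hedged illustration of one common route, the sketch below shows generic kernelized linear attention, which is an assumption and not the paper's architecture: by multiplying keys with values first, all-pairs attention over n nodes is computed without ever forming the n-by-n score matrix. The names `feature_map`, `quadratic_attention`, and `linear_attention` are hypothetical.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a strictly positive map, so normalizers are never zero.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def quadratic_attention(q, k, v):
    """Reference version: materializes the full (n, n) score matrix."""
    scores = feature_map(q) @ feature_map(k).T
    return (scores @ v) / scores.sum(axis=1, keepdims=True)

def linear_attention(q, k, v):
    """Same result, but cost grows linearly with the number of nodes n."""
    fq, fk = feature_map(q), feature_map(k)
    kv = fk.T @ v                    # (d, d_v) summary of all keys and values
    k_sum = fk.sum(axis=0)           # (d,) used for row normalization
    return (fq @ kv) / (fq @ k_sum)[:, None]

# Toy usage: 50 nodes with 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(50, 8)) for _ in range(3))
out = linear_attention(q, k, v)
```

The two functions agree numerically because matrix multiplication is associative; only the order of operations changes, replacing an O(n²) intermediate with a d-by-d one.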
Authors
Mabiao Long
Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China
Jiaxi Liu
Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China
Yufeng Li
East China Normal University
Artificial Intelligence
Hao Xiong
Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, China
Junchi Yan
FIAPR & ICML Board Member, SJTU (2018-), SII (2024-), AWS (2019-2022), IBM (2011-2018)
Computational Intelligence, AI4Science, Machine Learning, Autonomous Driving
Kefan Wang
Independent researcher
Yi Cao
Independent researcher
Jiandong Ding
Independent researcher