🤖 AI Summary
To address inherent limitations of cloud-centric large language models—namely high latency, excessive computational cost, weak personalization, and privacy risks—this paper proposes a novel edge-cloud collaborative learning paradigm. It jointly optimizes lightweight on-device models with powerful cloud-based large models to achieve low-latency inference, cost efficiency, strong personalization, and end-to-end privacy preservation. Methodologically, it introduces the first systematic three-tier collaboration framework—spanning data, feature, and parameter levels—and designs dual-granularity evaluation metrics (user-level and device-level), alongside a practical deployment roadmap for real-world scenarios. Technically, the paradigm integrates edge computing, federated learning, knowledge distillation, model partitioning, and hardware-aware optimization across the full stack. The paper also provides a comprehensive survey of academic and industrial advances, synthesizing mainstream algorithms, public datasets, and benchmarking standards, and identifies six key open research directions.
📝 Abstract
The conventional cloud-based large model learning framework is increasingly constrained by latency, cost, personalization, and privacy concerns. In this survey, we explore an emerging paradigm: collaborative learning between on-device small models and cloud-based large models, which promises low-latency, cost-efficient, and personalized intelligent services while preserving user privacy. We provide a comprehensive review across the hardware, system, algorithm, and application layers. At each layer, we summarize key problems and recent advances from both academia and industry. In particular, we categorize collaboration algorithms into data-based, feature-based, and parameter-based frameworks. We also review publicly available datasets and evaluation metrics, with user-level or device-level considerations tailored to collaborative learning settings. We further highlight real-world deployments, ranging from recommender systems and mobile livestreaming to personal intelligent assistants. We finally point out open research directions to guide future development in this rapidly evolving field.
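One common building block of the collaboration algorithms surveyed above is knowledge distillation, where a cloud-based "teacher" model's softened outputs guide a lightweight on-device "student". The sketch below is illustrative only (it is not an algorithm from the survey): the function names, class counts, and logits are hypothetical, and the loss follows the standard temperature-scaled formulation.

```python
# Minimal sketch of cloud-to-device knowledge distillation.
# All names, shapes, and values are illustrative assumptions.
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher T yields a softer distribution."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened class probabilities.

    Scaled by T^2 so the gradient magnitude stays comparable to a
    hard-label cross-entropy term (standard distillation practice).
    """
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return (temperature ** 2) * kl.mean()

# Example: a batch of two samples over three classes.
teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]])  # cloud model logits
student = np.array([[3.5, 1.2, 0.4], [0.3, 2.5, 0.2]])  # on-device model logits
print(distillation_loss(student, teacher))
```

In an edge-cloud deployment, only the teacher's logits (a data- or feature-level signal) need cross the network, so raw user data can stay on the device; this is one way the taxonomy's data- and feature-based frameworks reduce both bandwidth and privacy exposure.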