🤖 AI Summary
In collaborative machine learning, the dynamic interplay between memory and knowledge in knowledge distillation (KD) remains poorly understood, particularly under distributed, hierarchical, and decentralized architectures, as well as under task, model, data, and resource heterogeneity and privacy constraints. Method: This work introduces the first "memory-knowledge" dual-dimensional analytical framework, systematically decoupling their roles and interaction patterns across six major collaborative paradigms, including federated learning (FL) and multi-agent domain adaptation (MADA). Integrating KD theory, heterogeneity-aware modeling, and privacy-preserving principles, it establishes a unified cross-paradigm evaluation and taxonomy. Contributions: (1) the first systematic KD survey framework tailored to collaborative learning; (2) identification of core challenges, namely inefficient memory management and excessive knowledge coupling; and (3) novel research directions, including scalable memory management and adaptive knowledge decoupling.
📝 Abstract
Collaborative learning has emerged as a key paradigm in large-scale intelligent systems, enabling distributed agents to cooperatively train their models while preserving privacy. Central to this paradigm is knowledge distillation (KD), a technique that facilitates efficient knowledge transfer among agents. However, the underlying mechanisms by which KD leverages memory and knowledge across agents remain underexplored. This paper aims to bridge this gap by offering a comprehensive review of KD in collaborative learning, with a focus on the roles of memory and knowledge. We define and categorize memory and knowledge within the KD process and explore their interrelationships, providing a clear understanding of how knowledge is extracted, stored, and shared in collaborative settings. We examine various collaborative learning patterns, including distributed, hierarchical, and decentralized structures, and provide insights into how memory and knowledge dynamics shape the effectiveness of KD in collaborative learning. In particular, we emphasize task heterogeneity in the distributed learning pattern, covering federated learning (FL), multi-agent domain adaptation (MADA), federated multi-modal learning (FML), federated continual learning (FCL), federated multi-task learning (FMTL), and federated graph knowledge embedding (FKGE). Additionally, we highlight the model heterogeneity, data heterogeneity, resource heterogeneity, and privacy concerns of these tasks. Our analysis categorizes existing work based on how it handles memory and knowledge. Finally, we discuss existing challenges and propose future directions for advancing KD techniques in the context of collaborative learning.
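For readers new to the core technique the survey builds on: the standard temperature-scaled distillation loss (in the style of Hinton et al., 2015) combines hard-label cross-entropy with a KL term on softened teacher/student outputs. This is a minimal illustrative sketch of that generic formulation, not a method proposed in this paper; the function names and default hyperparameters (`T=2.0`, `alpha=0.5`) are our own assumptions.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax: higher T yields a softer distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label, T=2.0, alpha=0.5):
    """Generic KD loss: cross-entropy on the hard label plus KL divergence
    between temperature-softened teacher and student distributions.
    (Illustrative sketch; hyperparameters are assumptions, not from the paper.)"""
    s_soft = softmax(student_logits, T)
    t_soft = softmax(teacher_logits, T)
    # KL(teacher || student) on softened outputs: the "knowledge" being transferred
    kl = sum(t * math.log(t / s) for t, s in zip(t_soft, s_soft))
    # Standard cross-entropy against the ground-truth class (T = 1)
    ce = -math.log(softmax(student_logits)[true_label])
    # T**2 rescales the soft-term gradients so both terms stay comparable
    return alpha * ce + (1 - alpha) * (T ** 2) * kl
```

In collaborative settings, the teacher logits can come from a peer agent or an aggregated global model rather than a single large network, which is what makes KD attractive as a communication medium: agents exchange soft predictions instead of raw data or full model weights.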