🤖 AI Summary
This work addresses the challenge of efficient continual video learning on resource-constrained mobile edge devices—such as drones and dashcams—equipped with System-on-Chip (SoC) GPUs, which offer limited compute but abundant unified memory. We propose a lightweight, embedding-space-aware framework featuring: (i) a novel base-model residency mechanism that exploits embedding overlap across scenes; (ii) on-device incremental base-model updates; (iii) high-value sample selection; and (iv) time-division multiplexing of inference and training. The framework integrates SoC GPU computation optimization, embedding-space analysis, dynamic resource scheduling, and memory-aware sampling. Experiments demonstrate that our approach reduces retraining overhead by 2.8-10x and improves accuracy by 18-45% over state-of-the-art edge continual learning systems, significantly enhancing both learning efficiency and accuracy under severe resource constraints.
📝 Abstract
Continually retraining models has emerged as a primary technique to enable high-accuracy video analytics on edge devices. Yet, existing systems perform such adaptation by relying on the spare compute resources that traditional (memory-constrained) edge servers afford. In contrast, mobile edge devices such as drones and dashcams offer a fundamentally different resource profile: weak(er) compute with abundant unified memory pools. We present Legilimens, a continuous learning system for the mobile edge's System-on-Chip GPUs. Our driving insight is that visually distinct scenes that require retraining exhibit substantial overlap in model embeddings; if that overlap is captured into a base model held in device memory, specializing to each new scene becomes lightweight, requiring very few samples. To practically realize this approach, Legilimens presents new, compute-efficient techniques to (1) select high-utility data samples for retraining specialized models, (2) update the base model without complete retraining, and (3) time-share compute resources between retraining and live inference for maximal accuracy. Across diverse workloads, Legilimens lowers retraining costs by 2.8-10x compared to existing systems, resulting in 18-45% higher accuracies.
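The high-utility sample selection idea above can be illustrated with a minimal sketch: score incoming frames by how far their embeddings fall from what the base model already covers, and keep only the most novel ones for retraining. This is a hypothetical NumPy illustration of the general principle, not Legilimens's actual algorithm; the centroid representation and novelty metric here are assumptions for the example.

```python
import numpy as np

def select_high_utility(frame_embeddings: np.ndarray,
                        base_centroids: np.ndarray,
                        k: int) -> np.ndarray:
    """Pick the k frames whose embeddings are least covered by the base model.

    frame_embeddings: (n_frames, dim) embeddings of candidate frames.
    base_centroids:   (n_clusters, dim) centroids summarizing scenes the
                      base model has already absorbed (an assumed summary).
    Returns indices of the k most "novel" frames, most novel first.
    """
    # Distance from every frame embedding to every base centroid.
    dists = np.linalg.norm(
        frame_embeddings[:, None, :] - base_centroids[None, :, :], axis=-1
    )
    # Novelty = distance to the nearest region the base model covers.
    novelty = dists.min(axis=1)
    # Highest-novelty frames are the highest-utility retraining samples.
    return np.argsort(novelty)[-k:][::-1]

# Toy usage: two base centroids; frames 2 and 4 lie far from both.
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
frames = np.array([[0.0, 0.1], [10.0, 10.0], [5.0, 5.0],
                   [0.2, 0.0], [9.0, 9.0]])
chosen = select_high_utility(frames, centroids, k=2)
```

Because only a handful of high-novelty samples are kept, the subsequent specialization step stays cheap, which is what makes on-device retraining tractable on a weak SoC GPU.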