Legilimens: Performant Video Analytics on the System-on-Chip Edge

📅 2025-04-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work targets efficient continual learning for video analytics on mobile edge devices such as drones and dashcams, whose System-on-Chip (SoC) GPUs pair weak compute with abundant unified memory. The authors propose a lightweight, embedding-space-aware framework featuring: (i) a base-model residency mechanism that exploits embedding overlap across visually distinct scenes; (ii) on-device incremental base-model updates; (iii) high-utility sample selection; and (iv) time-division multiplexing of retraining and live inference. Experiments demonstrate that the approach reduces retraining overhead by 2.8–10x and improves accuracy by 18–45% over state-of-the-art edge continual-learning systems.

📝 Abstract
Continually retraining models has emerged as a primary technique to enable high-accuracy video analytics on edge devices. Yet, existing systems employ such adaptation by relying on the spare compute resources that traditional (memory-constrained) edge servers afford. In contrast, mobile edge devices such as drones and dashcams offer a fundamentally different resource profile: weak(er) compute with abundant unified memory pools. We present Legilimens, a continuous learning system for the mobile edge's System-on-Chip GPUs. Our driving insight is that visually distinct scenes that require retraining exhibit substantial overlap in model embeddings; if captured into a base model on device memory, specializing to each new scene can become lightweight, requiring very few samples. To practically realize this approach, Legilimens presents new, compute-efficient techniques to (1) select high-utility data samples for retraining specialized models, (2) update the base model without complete retraining, and (3) time-share compute resources between retraining and live inference for maximal accuracy. Across diverse workloads, Legilimens lowers retraining costs by 2.8-10x compared to existing systems, resulting in 18-45% higher accuracies.
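The abstract's driving insight is that visually distinct scenes can overlap substantially in a model's embedding space, and that this overlap determines how cheaply a resident base model can be specialized. A minimal sketch of that idea, using mean cosine similarity to a base-model embedding centroid as a stand-in metric (the function name and the metric are illustrative assumptions, not the paper's exact formulation):

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def embedding_overlap(base_embs, scene_embs):
    # Mean cosine similarity of new-scene embeddings to the base
    # model's embedding centroid -- an illustrative proxy, not the
    # paper's exact overlap metric.
    dim = len(base_embs[0])
    n = len(base_embs)
    centroid = [sum(e[i] for e in base_embs) / n for i in range(dim)]
    return sum(cosine(e, centroid) for e in scene_embs) / len(scene_embs)

# A scene with high overlap can be specialized from the resident base
# model with few samples; low overlap would instead trigger an
# incremental base-model update.
base = [[1.0, 0.0], [0.9, 0.1]]
print(embedding_overlap(base, [[1.0, 0.05]]))   # high overlap
print(embedding_overlap(base, [[-1.0, 0.0]]))   # low overlap
```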
Problem

Research questions and friction points this paper is trying to address.

How to sustain high-accuracy video analytics on compute-weak, memory-rich mobile edge devices
How to cut the cost of retraining specialized models on SoC edge GPUs
How to share scarce compute between retraining and live inference without sacrificing accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Keeps a base model resident in the abundant unified memory of mobile edge SoCs
Selects high-utility retraining samples with compute-efficient techniques
Time-shares GPU compute between retraining and live inference for maximal accuracy
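The time-sharing idea above can be sketched as a simple budget split per scheduling period: inference gets the compute its frame rate requires, and retraining absorbs the remainder. The function name and the fixed-priority policy are assumptions for illustration; the paper's scheduler tunes this split for maximal accuracy rather than using a static rule.

```python
def plan_time_slices(period_ms, infer_ms_per_frame, fps):
    # Hypothetical fixed-priority split of one scheduling period:
    # live inference is budgeted first (frames are deadline-bound),
    # and retraining runs in whatever compute time is left over.
    infer_budget = infer_ms_per_frame * fps * (period_ms / 1000.0)
    train_budget = max(period_ms - infer_budget, 0.0)
    return infer_budget, train_budget

# E.g. 10 ms/frame at 30 fps leaves 700 ms of each second for retraining.
print(plan_time_slices(1000.0, 10.0, 30))
```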