EdgeVidSum: Real-Time Personalized Video Summarization at the Edge

📅 2025-05-28
🤖 AI Summary
To address low computational efficiency, insufficient personalization, and privacy leakage in real-time personalized video summarization on edge devices, this paper proposes a lightweight, on-device fast-forward summarization method. The approach introduces a thumbnail container mechanism and a hierarchical lightweight analysis framework that combines localized thumbnail semantic extraction, user-preference-driven timestamp generation, and Jetson Nano-specific optimization. All processing stays on the device, which keeps user data private and minimizes latency. Using a compact 2D CNN, the method achieves end-to-end real-time summarization on Jetson Nano: it runs 12× faster than frame-level baselines, consumes less than 300 MB of memory, and attains >89% accuracy in matching user preferences. To the best of the authors' knowledge, this is the first work to jointly leverage thumbnail containers and hierarchical semantic analysis for edge-based video summarization, balancing high relevance, low resource overhead, and strong personalization.

📝 Abstract
EdgeVidSum is a lightweight method that generates personalized, fast-forward summaries of long-form videos directly on edge devices. Using thumbnail-based techniques and efficient neural architectures, the proposed approach enables real-time video summarization while safeguarding user privacy through local data processing. Unlike conventional methods that process entire videos frame by frame, the proposed method uses thumbnail containers to significantly reduce computational complexity without sacrificing semantic relevance. The framework employs a hierarchical analysis approach, where a lightweight 2D CNN model identifies user-preferred content from thumbnails and generates timestamps to create fast-forward summaries. Our interactive demo highlights the system's ability to create tailored video summaries for long-form videos, such as movies, sports events, and TV shows, based on individual user preferences. The entire computation runs on resource-constrained devices such as the Jetson Nano, demonstrating how EdgeVidSum addresses the critical challenges of computational efficiency, personalization, and privacy in modern video consumption environments.
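
The timestamp-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each thumbnail covers a fixed interval of the video and that some classifier (standing in for the lightweight 2D CNN) has already produced one preference score per thumbnail. The function name `scores_to_segments` and the parameters `interval_s`, `threshold`, and `max_gap_s` are hypothetical, introduced only for illustration.

```python
from typing import List, Sequence, Tuple


def scores_to_segments(
    scores: Sequence[float],
    interval_s: float,
    threshold: float = 0.5,
    max_gap_s: float = 0.0,
) -> List[Tuple[float, float]]:
    """Convert per-thumbnail preference scores into (start, end) segments.

    Assumption: thumbnail i covers [i * interval_s, (i + 1) * interval_s).
    Thumbnails scoring at or above `threshold` are kept, and kept
    thumbnails separated by at most `max_gap_s` seconds are merged.
    """
    segments: List[Tuple[float, float]] = []
    for i, score in enumerate(scores):
        if score < threshold:
            continue  # not preferred content: skipped in the summary
        start, end = i * interval_s, (i + 1) * interval_s
        if segments and start - segments[-1][1] <= max_gap_s:
            # Close enough to the previous kept segment: extend it.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


# Hypothetical scores for five thumbnails sampled 10 seconds apart.
example = scores_to_segments([0.1, 0.9, 0.8, 0.2, 0.7], interval_s=10)
print(example)  # [(10, 30), (40, 50)]
```

A player could then show the returned segments at normal speed and fast-forward through everything else. Raising `max_gap_s` trades a slightly longer summary for fewer abrupt jumps.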
Problem

Research questions and friction points this paper is trying to address.

Real-time personalized video summarization on edge devices
Reducing computational complexity with thumbnail-based techniques
Ensuring privacy through local data processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight thumbnail-based video summarization
Real-time processing on edge devices
Hierarchical 2D CNN for personalization