Polymorph: Energy-Efficient Multi-Label Classification for Video Streams on Embedded Devices

📅 2025-07-20
🤖 AI Summary
To address the compute and energy constraints of real-time multi-label video classification on embedded devices, this paper proposes Polymorph, a context-aware modular inference framework. The method exploits label sparsity, temporal continuity, and label co-occurrence in video streams to build a pool of lightweight Low Rank Adapters (LoRA), each specialized in a subset of co-occurring classes. For each frame, it activates only the adapters needed to cover the active labels, which avoids full-model switching and weight merging. The backbone network is shared by all adapters, and adapters are composed on demand, which improves scalability while reducing latency and energy overhead. On the TAO dataset, the framework consumes 40% less energy than strong baselines while improving mean average precision (mAP) by 9 points.
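As an illustration of how classes might be grouped into adapters from co-occurrence patterns, the sketch below counts label pairs that appear in the same frame and groups labels linked by frequently co-occurring pairs. The grouping rule (connected components over a thresholded co-occurrence graph), the function names, and the `min_count` parameter are all assumptions made for this example; the paper's actual procedure may differ.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Set


def cooccurrence_counts(frames: Iterable[Set[str]]) -> Counter:
    """Count how often each unordered label pair appears in the same frame."""
    counts: Counter = Counter()
    for labels in frames:
        for a, b in combinations(sorted(labels), 2):
            counts[(a, b)] += 1
    return counts


def group_labels(frames: List[Set[str]], min_count: int) -> List[Set[str]]:
    """Group labels linked by pairs that co-occur at least min_count times.

    Assumption: connected components over a thresholded co-occurrence
    graph stand in for the paper's (unspecified here) grouping method.
    """
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for labels in frames:
        for label in labels:
            find(label)  # register every label, even isolated ones
    for (a, b), n in cooccurrence_counts(frames).items():
        if n >= min_count:
            parent[find(a)] = find(b)  # merge the two groups

    groups: Dict[str, Set[str]] = {}
    for label in parent:
        groups.setdefault(find(label), set()).add(label)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical per-frame label sets, not taken from TAO.
frames = [{"car", "person"}, {"car", "person"},
          {"dog", "cat"}, {"dog", "cat"}, {"person", "dog"}]
adapter_groups = group_labels(frames, min_count=2)
```

With these toy frames, each resulting group would be a candidate class subset for one adapter; raising `min_count` yields more, smaller groups.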

📝 Abstract
Real-time multi-label video classification on embedded devices is constrained by limited compute and energy budgets. Yet, video streams exhibit structural properties such as label sparsity, temporal continuity, and label co-occurrence that can be leveraged for more efficient inference. We introduce Polymorph, a context-aware framework that activates a minimal set of lightweight Low Rank Adapters (LoRA) per frame. Each adapter specializes in a subset of classes derived from co-occurrence patterns and is implemented as a LoRA weight over a shared backbone. At runtime, Polymorph dynamically selects and composes only the adapters needed to cover the active labels, avoiding full-model switching and weight merging. This modular strategy improves scalability while reducing latency and energy overhead. Polymorph achieves 40% lower energy consumption and improves mAP by 9 points over strong baselines on the TAO dataset. Polymorph is open source at https://github.com/inference-serving/polymorph/.
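The runtime step described in the abstract, selecting only the adapters needed to cover the active labels, can be read as a set-cover problem. The sketch below uses a greedy set-cover heuristic as one plausible way to do this; the heuristic, the function name `select_adapters`, and the adapter pool contents are assumptions for illustration and are not taken from the Polymorph code base.

```python
from typing import Dict, FrozenSet, List, Set


def select_adapters(active_labels: Set[str],
                    adapter_classes: Dict[str, FrozenSet[str]]) -> List[str]:
    """Greedily pick adapters until every coverable active label is covered.

    Assumption: greedy set cover stands in for whatever selection
    policy Polymorph actually uses.
    """
    uncovered = set(active_labels)
    chosen: List[str] = []
    while uncovered:
        # Pick the adapter covering the most still-uncovered labels;
        # iterating in sorted order makes ties break on adapter name.
        best = max(sorted(adapter_classes),
                   key=lambda name: len(adapter_classes[name] & uncovered))
        gained = adapter_classes[best] & uncovered
        if not gained:
            break  # the remaining labels are covered by no adapter
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical adapter pool: adapter name -> classes it specializes in.
ADAPTERS = {
    "street": frozenset({"car", "person", "bicycle"}),
    "indoor": frozenset({"person", "chair", "cup"}),
    "animals": frozenset({"dog", "cat", "bird"}),
}

one_adapter = select_adapters({"car", "person"}, ADAPTERS)
two_adapters = select_adapters({"person", "dog"}, ADAPTERS)
```

Because labels are sparse and change slowly between frames, the selected set is typically small and stable, so a sparse frame touches one adapter rather than the whole pool.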
Problem

Research questions and friction points this paper is trying to address.

Real-time multi-label video classification on resource-limited embedded devices
Leveraging video structural properties for efficient inference
Reducing energy consumption and latency in dynamic label prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

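Uses lightweight Low Rank Adapters (LoRA) over a shared backbone, each specialized in a subset of co-occurring classes
Dynamically selects and composes, per frame, only the adapters needed to cover the active labels, avoiding full-model switching and weight merging
Reduces energy consumption by 40% and improves mAP by 9 points over strong baselines on the TAO dataset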
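For reference, the standard LoRA formulation keeps a frozen weight matrix and adds a low-rank update next to it. The expression below sketches how several adapters could be applied to one backbone layer without merging weights. The symbol S_t (the set of adapters active at frame t) and the additive composition are assumptions for this illustration, and the usual LoRA scaling factor is omitted for brevity.

```latex
h_t = W_0 x_t + \sum_{i \in S_t} B_i A_i x_t,
\qquad W_0 \in \mathbb{R}^{d \times k},\;
B_i \in \mathbb{R}^{d \times r},\;
A_i \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)
```

Since each term B_i A_i x_t is computed alongside the backbone output, W_0 is never overwritten, so changing S_t between frames requires no merge or unmerge step. Each active adapter adds roughly r(d + k) multiply-adds per input vector, compared with dk for the backbone layer itself.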