When Smaller Wins: Dual-Stage Distillation and Pareto-Guided Compression of Liquid Neural Networks for Edge Battery Prognostics

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of achieving high-accuracy state-of-health (SOH) prediction for batteries under the stringent resource constraints of edge devices. The authors propose a lightweight approach based on two-stage knowledge distillation and Pareto-guided compression. By preserving the temporal dynamics of liquid neural networks through a tailored distillation strategy and integrating Pareto-optimal model selection that jointly optimizes prediction error and computational cost—followed by int8 quantization—the method compresses a large teacher model down to 94 kB with an inference latency of only 21 ms. When deployed to predict SOH over the next 100 cycles, the resulting model achieves an average error of 0.0066, representing a 15.4% improvement over the original teacher model and demonstrating a rare case where a compact student model outperforms its larger counterpart.
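The summary mentions preserving the temporal dynamics of liquid neural networks under an embedded-friendly reformulation. As a rough illustration of what Euler discretization of such dynamics looks like, the sketch below steps a generic leaky/liquid hidden-state ODE with a fixed step size; the function name, weight shapes, and the specific ODE form are illustrative assumptions, not the paper's actual DLNet equations.

```python
import numpy as np

def euler_liquid_step(h, x, W_in, W_rec, b, tau, dt=0.1):
    """One explicit-Euler update of a liquid-style hidden state.

    Integrates dh/dt = -h / tau + tanh(W_in @ x + W_rec @ h + b),
    a generic leaky/liquid neuron ODE; the paper's exact dynamics
    may differ.
    """
    dhdt = -h / tau + np.tanh(W_in @ x + W_rec @ h + b)
    return h + dt * dhdt

# Toy rollout over a short random input sequence.
rng = np.random.default_rng(0)
n_hidden, n_in = 8, 4
h = np.zeros(n_hidden)
W_in = rng.normal(scale=0.5, size=(n_hidden, n_in))
W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)
tau = np.full(n_hidden, 2.0)  # per-neuron time constants

for t in range(100):
    x = rng.normal(size=n_in)
    h = euler_liquid_step(h, x, W_in, W_rec, b, tau)

print(h.shape)  # (8,)
```

Replacing the continuous-time solve with a fixed-step Euler update is what makes this kind of recurrence compatible with integer-quantized inference on microcontrollers.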

📝 Abstract
Battery management systems increasingly require accurate battery health prognostics under strict on-device constraints. This paper presents DLNet, a practical framework built on dual-stage distillation of liquid neural networks that turns a high-capacity model into compact, edge-deployable models for battery health prediction. DLNet first applies Euler discretization to reformulate the liquid dynamics for embedded compatibility. It then performs dual-stage knowledge distillation to transfer the teacher model's temporal behavior and to recover it after further compression. Pareto-guided selection under joint error-cost objectives retains student models that balance accuracy and efficiency. We evaluate DLNet on a widely used dataset and validate real-device feasibility on an Arduino Nano 33 BLE Sense using int8 deployment. The final deployed student achieves a low error of 0.0066 when predicting battery health over the next 100 cycles, 15.4% lower than the teacher model. It reduces the model size from 616 kB to 94 kB (an 84.7% reduction) and takes 21 ms per inference on the device. These results support a practical "smaller wins" observation: a small model can match or exceed a large teacher for edge-based prognostics given proper supervision and selection. Beyond batteries, the DLNet framework can extend to other industrial analytics tasks with strict hardware constraints.
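The abstract's Pareto-guided selection retains student models that are non-dominated under joint error-cost objectives. A minimal sketch of that filtering step is below; the candidate names and the (error, size-in-kB) numbers other than the deployed student's are hypothetical placeholders, not values from the paper.

```python
def pareto_front(candidates):
    """Return the candidates not dominated on (error, cost).

    A model dominates another if it is no worse on both objectives
    and strictly better on at least one.
    """
    front = []
    for name, err, cost in candidates:
        dominated = any(
            (e2 <= err and c2 <= cost) and (e2 < err or c2 < cost)
            for _, e2, c2 in candidates
        )
        if not dominated:
            front.append((name, err, cost))
    return front

students = [
    ("s1", 0.0066, 94),   # error/size of the paper's deployed student
    ("s2", 0.0078, 60),   # hypothetical smaller, less accurate student
    ("s3", 0.0090, 120),  # hypothetical; dominated by s1 on both axes
]

front = pareto_front(students)  # keeps s1 and s2; s3 is dominated by s1
```

Only models on this front are worth carrying forward to int8 quantization, since any dominated candidate is beaten on both accuracy and cost by something already retained.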
Problem

Research questions and friction points this paper is trying to address.

battery prognostics
edge computing
model compression
on-device constraints
liquid neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

liquid neural networks
knowledge distillation
Pareto-guided compression
edge AI
battery prognostics
Dhivya Dharshini Kannan
Singapore Institute of Technology (SIT), Singapore 828608
Wei Li
Nanyang Technological University, Singapore
Artificial Intelligence, Multi-modal Learning, Computer Vision, Generative AI
Wei Zhang
Singapore Institute of Technology (SIT), Singapore 828608
Jianbiao Wang
Institute of Materials Research and Engineering (IMRE), Agency for Science, Technology and Research (A*STAR), 2 Fusionopolis Way, Innovis #08-03, Singapore 138634, Republic of Singapore
Zhi Wei Seh
Institute of Materials Research and Engineering, A*STAR
Energy Storage/Conversion, Sodium Batteries, Magnesium Batteries, Aluminum Batteries, Cryogenic TEM
Man-Fai Ng
Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), 1 Fusionopolis Way, #16-16 Connexis, Singapore 138632, Republic of Singapore