TinyDéjàVu: Smaller Memory Footprint & Faster Inference on Sensor Data Streams with Always-On Microcontrollers

📅 2025-12-10
🤖 AI Summary
This work addresses the dual bottlenecks of severe memory constraints and computational redundancy when deploying lightweight neural networks for continuous inference on battery-powered, always-on microcontrollers with as little as 128 KB of RAM. The authors propose the first inter-layer dataflow co-optimization framework tailored to temporal sensor streams. The approach combines operator-level memory reuse, incremental state caching, and sliding-window-aware computation pruning to eliminate redundant operations and minimize the RAM footprint under streaming sliding-window inputs. The framework supports plug-and-play deployment of multiple models without recompilation. Evaluated on real MCU hardware, it reduces peak RAM usage by more than 60% and cuts redundant computation by up to 90%. The implementation is fully open source, ensuring end-to-end reproducibility.
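The summary's "incremental state caching" for overlapping sliding windows is not spelled out here, but the underlying idea can be sketched on a 1D convolution: when the window advances by a hop of new samples, most of the previous layer outputs are still valid and only the outputs touching the new samples need recomputing. The window length, kernel weights, and hop size below are illustrative assumptions, not the paper's actual configuration:

```c
#include <assert.h>
#include <string.h>

#define WIN 8                /* sliding-window length (samples), assumed */
#define K   3                /* 1D convolution kernel size, assumed */
#define HOP 2                /* new samples arriving per window step */
#define OUT (WIN - K + 1)    /* valid conv outputs per window */

/* Hypothetical filter weights, for illustration only. */
static const float kernel[K] = {0.5f, 1.0f, 0.5f};

static float dot(const float *x) {
    float s = 0.0f;
    for (int i = 0; i < K; i++)
        s += x[i] * kernel[i];
    return s;
}

/* Naive streaming inference: recompute every output for each window. */
void conv_full(const float *win, float *out) {
    for (int i = 0; i < OUT; i++)
        out[i] = dot(win + i);
}

/* Incremental variant: the window advanced by HOP samples, so the first
 * OUT-HOP outputs are the previous step's outputs shifted left; only the
 * last HOP outputs depend on newly arrived samples and are recomputed. */
void conv_incremental(const float *win, float *out /* caches prev outputs */) {
    memmove(out, out + HOP, (size_t)(OUT - HOP) * sizeof(float));
    for (int i = OUT - HOP; i < OUT; i++)
        out[i] = dot(win + i);
}
```

In this toy setting each step recomputes only 2 of 6 dot products; applied across a stack of layers with large window overlap, this style of reuse is the principle behind the compute savings the paper reports.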

📝 Abstract
Always-on sensors are increasingly expected to host a variety of tiny neural networks and to continuously perform inference on the time series of data they sense. To meet lifetime and energy-consumption requirements when operating on battery, such hardware uses microcontrollers (MCUs) with a tiny memory budget, e.g., 128 kB of RAM. In this context, optimizing data flows across neural network layers becomes crucial. In this paper, we introduce TinyDéjàVu, a new framework and novel algorithms designed to drastically reduce the RAM footprint of inference with various tiny ML models on sensor-data time series, on typical microcontroller hardware. We publish the implementation of TinyDéjàVu as open source, and we perform reproducible benchmarks on hardware. We show that TinyDéjàVu saves more than 60% of RAM usage and eliminates up to 90% of redundant compute on overlapping sliding-window inputs.
Problem

Research questions and friction points this paper is trying to address.

High peak RAM footprint of neural network inference on memory-constrained microcontrollers
Unoptimized data flows across neural network layers when processing sensor streams
Redundant compute on overlapping sliding-window inputs
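On the RAM-footprint point above: the paper's operator-level memory reuse is not detailed in this summary, but a standard planning argument it builds on is that a layer's input activation is dead once its output has been written, so peak RAM is bounded by the largest adjacent input-plus-output pair rather than the sum of all activation buffers. A minimal sketch with made-up layer sizes:

```c
#include <assert.h>

/* Hypothetical per-layer activation sizes in bytes (illustrative only). */
static const int act_bytes[] = {4096, 2048, 1024, 256};
#define NLAYERS (int)(sizeof(act_bytes) / sizeof(act_bytes[0]))

/* Naive allocation: keep one live buffer per activation. */
int peak_ram_naive(void) {
    int total = 0;
    for (int i = 0; i < NLAYERS; i++)
        total += act_bytes[i];
    return total;
}

/* With reuse: only the current layer's input and output are live at
 * once, so peak RAM is the largest adjacent pair of activations. */
int peak_ram_reuse(void) {
    int peak = 0;
    for (int i = 0; i + 1 < NLAYERS; i++) {
        int pair = act_bytes[i] + act_bytes[i + 1];
        if (pair > peak) peak = pair;
    }
    return peak;
}
```

For these toy sizes the naive plan needs 7424 B while the reuse plan needs 6144 B; real inter-layer planners (and presumably TinyDéjàVu's co-optimization) push further by also overlapping non-adjacent buffers in a shared arena.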
Innovation

Methods, ideas, or system contributions that make the work stand out.

Framework that reduces the RAM footprint of microcontroller inference
Algorithms that eliminate redundant compute on overlapping sensor-data windows
Open-source implementation enabling efficient always-on sensor processing