AdaBet: Gradient-free Layer Selection for Efficient Training of Deep Neural Networks

📅 2025-10-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the compute, memory, and labeled-data constraints of adapting pre-trained models on edge devices, this paper proposes AdaBet, a lightweight layer selection method that requires neither gradients nor labels. The core idea is to apply Topological Data Analysis (TDA) to model adaptation: Betti numbers, computed from each layer's activation space during forward propagation, serve as a measure of the layer's structural complexity and learning capacity, and layers are ranked by importance on that basis. Because the ranking relies on forward passes alone, it avoids backpropagation and server-side meta-training; only the selected layers are then retrained. Across 16 model-dataset pairs, AdaBet achieves an average gain of 5% in classification accuracy over gradient-based baselines while reducing average peak memory consumption by 40%.

📝 Abstract

To utilize pre-trained neural networks on edge and mobile devices, we often require efficient adaptation to user-specific runtime data distributions while operating under limited compute and memory resources. On-device retraining with a target dataset can facilitate such adaptation; however, it remains impractical due to the increasing depth of modern neural networks and the computational overhead of gradient-based optimization across all layers. Current approaches reduce training cost by selecting a subset of layers for retraining, but they rely on labeled data, at least one full-model backpropagation, or server-side meta-training, which limits their suitability for constrained devices. We introduce AdaBet, a gradient-free layer selection approach that ranks important layers by analyzing topological features of their activation spaces through Betti numbers, using forward passes alone. AdaBet allows selecting layers with high learning capacity, which are important for retraining and adaptation, without requiring labels or gradients. Evaluated on sixteen pairs of benchmark models and datasets, AdaBet achieves an average gain of 5% in classification accuracy over gradient-based baselines while reducing average peak memory consumption by 40%.
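To make the term "Betti numbers" concrete, here is a minimal NumPy sketch that computes the first two Betti numbers (connected components and loops) of a point cloud. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not say which complex, scale, or homology dimension AdaBet uses, so the Vietoris-Rips construction, the scale parameter `eps`, and the function name `rips_betti` are all hypothetical choices made here.

```python
import itertools
import numpy as np

def rips_betti(points, eps):
    """Return (b0, b1) of the Vietoris-Rips complex of `points` at scale `eps`.

    Assumption: Rips complex with rational coefficients; brute force, so only
    suitable for small point clouds (e.g. a small batch of activations).
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # 1-simplices: pairs closer than eps. 2-simplices: triples whose edges all exist.
    edges = [(i, j) for i, j in itertools.combinations(range(n), 2) if dist[i, j] <= eps]
    edge_index = {e: col for col, e in enumerate(edges)}
    triangles = [
        (i, j, k) for i, j, k in itertools.combinations(range(n), 3)
        if (i, j) in edge_index and (i, k) in edge_index and (j, k) in edge_index
    ]

    # Boundary map d1: edge (i, j) -> vertex j minus vertex i.
    d1 = np.zeros((n, len(edges)))
    for col, (i, j) in enumerate(edges):
        d1[i, col], d1[j, col] = -1.0, 1.0

    # Boundary map d2: triangle (i, j, k) -> (j, k) - (i, k) + (i, j).
    d2 = np.zeros((len(edges), len(triangles)))
    for col, (i, j, k) in enumerate(triangles):
        d2[edge_index[(j, k)], col] = 1.0
        d2[edge_index[(i, k)], col] = -1.0
        d2[edge_index[(i, j)], col] = 1.0

    rank1 = np.linalg.matrix_rank(d1) if edges else 0
    rank2 = np.linalg.matrix_rank(d2) if triangles else 0
    b0 = n - rank1                   # connected components
    b1 = len(edges) - rank1 - rank2  # independent loops
    return int(b0), int(b1)

# Twelve points on a unit circle: neighbours are about 0.52 apart, next-nearest 1.0.
theta = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
print(rips_betti(circle, eps=0.6))  # (1, 1): one component, one loop
```

For real activation batches a dedicated persistent-homology library would likely be preferable to this brute-force version, since the triangle enumeration grows cubically with the number of points.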

Problem

Research questions and friction points this paper is trying to address.

Adapting pre-trained neural networks efficiently on resource-constrained edge devices

Selecting which layers to retrain without gradient computation or labels

Reducing peak memory consumption while improving classification accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Betti Numbers for layer selection

Analyzes activation spaces without gradients

Requires only forward passes for ranking
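The three points above can be sketched as a forward-only ranking loop. This is a toy illustration, not AdaBet itself: the NumPy ReLU MLP, the use of the zeroth Betti number alone, the row normalisation, the fixed scale `eps=0.5`, the "higher score ranks first" rule, and the names `betti0` and `rank_layers` are all assumptions introduced for the example, since the summary does not specify them.

```python
import numpy as np

def betti0(acts, eps):
    """Connected components of the eps-neighbourhood graph over the rows of `acts`."""
    n = len(acts)
    dist = np.linalg.norm(acts[:, None, :] - acts[None, :, :], axis=-1)
    adj = (dist <= eps) & ~np.eye(n, dtype=bool)
    laplacian = np.diag(adj.sum(axis=1)).astype(float) - adj.astype(float)
    # The rank of a graph Laplacian equals (#vertices - #connected components).
    return int(n - np.linalg.matrix_rank(laplacian))

def rank_layers(weights, x, k, eps=0.5):
    """Score every layer from one forward pass; return (top-k layer indices, scores).

    No labels, loss, or gradients are used: only the activations of each layer.
    """
    scores = []
    h = x
    for w in weights:
        h = np.maximum(h @ w, 0.0)  # forward through one ReLU layer
        # Assumption: normalise rows so a single eps is comparable across layers.
        z = h / (np.linalg.norm(h, axis=1, keepdims=True) + 1e-12)
        scores.append(betti0(z, eps))
    # Assumption: a higher topological score means the layer is ranked higher.
    order = np.argsort(scores)[::-1]
    return [int(i) for i in order[:k]], scores

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))  # 32 unlabeled inputs
weights = [rng.normal(size=(8, 16)), rng.normal(size=(16, 16)), rng.normal(size=(16, 4))]
selected, scores = rank_layers(weights, x, k=2)
print("scores per layer:", scores, "-> layers chosen for retraining:", selected)
```

Only the layers in `selected` would then be unfrozen for retraining; everything else stays fixed, which is where the memory saving in selective retraining comes from.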

🔎 Similar Papers

LaCoOT: Layer Collapse through Optimal Transport