AI Summary
Deploying CNNs on resource-constrained devices faces two challenges: high computational overhead and inflexible architectures. Method: This paper proposes an elastic CNN architecture enabling zero-shot, runtime adaptation across multiple granularities of computational complexity without fine-tuning. It introduces a pruning-growth co-design paradigm for nested subnetwork construction, integrating structured pruning with dynamic subnet reconfiguration so that a single model can switch seamlessly between compact and full configurations. Contribution/Results: Evaluated on VGG-16, AlexNet, and ResNet across CIFAR-10 and Imagenette, the approach reduces computation by 40% with under 1.2% accuracy degradation, and in some cases surpasses baseline accuracy. To our knowledge, this is the first work achieving real-time, training-free capacity adaptation, significantly enhancing the deployment flexibility and energy efficiency of edge AI models.
Abstract
Deploying deep convolutional neural networks (CNNs) on resource-constrained devices presents significant challenges due to their high computational demands and rigid, static architectures. To overcome these limitations, this thesis explores methods for enabling CNNs to dynamically adjust their computational complexity to the available hardware resources. We introduce adaptive CNN architectures capable of scaling their capacity at runtime, efficiently balancing performance against resource utilization. To achieve this adaptability, we propose a structured pruning and dynamic reconstruction approach that creates nested subnetworks within a single CNN model. The network can thus switch between compact and full-sized configurations without retraining, making it suitable for deployment across varying hardware platforms. Experiments on multiple CNN architectures, including VGG-16, AlexNet, ResNet-20, and ResNet-56, over the CIFAR-10 and Imagenette datasets demonstrate that the adaptive models maintain, and in some cases even improve, performance under varying computational constraints. Our results show that embedding adaptability directly into CNN architectures significantly improves their robustness and flexibility, paving the way for efficient real-world deployment in diverse computational environments.
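The abstract's central idea, nested subnetworks that share weights with the full model so capacity can be switched at runtime without retraining, can be illustrated with a minimal NumPy sketch. This is not the thesis's actual pruning-growth implementation; the `ElasticLinear` class, the `width_mult` parameter, and the assumption that channels are already importance-ordered (e.g. by a prior structured-pruning pass) are all illustrative simplifications.

```python
import numpy as np

class ElasticLinear:
    """Toy layer whose output width is selectable at inference time.

    Each sub-width reuses a prefix of the full weight matrix, so the
    compact and full configurations are nested inside one set of
    parameters and switching between them needs no retraining.
    Illustrative assumption: output channels are pre-sorted by
    importance, as structured pruning would arrange them.
    """

    def __init__(self, in_features, out_features, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((out_features, in_features))
        self.b = np.zeros(out_features)

    def forward(self, x, width_mult=1.0):
        # Keep only the first k output channels -- the nested subnetwork.
        k = max(1, int(round(self.W.shape[0] * width_mult)))
        return x @ self.W[:k].T + self.b[:k]

layer = ElasticLinear(in_features=8, out_features=16)
x = np.ones((1, 8))

full = layer.forward(x, width_mult=1.0)     # full-sized configuration
compact = layer.forward(x, width_mult=0.5)  # compact configuration

# The compact output is exactly a prefix of the full output: the
# subnetwork is embedded in the full model rather than a separate copy.
assert compact.shape == (1, 8)
assert np.allclose(compact, full[:, :8])
```

In a real CNN the same slicing idea applies per convolutional layer (selecting a prefix of filters), and the pruning stage determines the channel ordering that makes the nested prefixes accurate.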