🤖 AI Summary
Deep neural network ensembles suffer from training costs that grow linearly with the number of ensemble members. To address this, we propose an efficient weight-perturbation-based ensemble method: starting from a single converged parent model, we generate multiple structurally identical yet behaviorally diverse child models by injecting diversified noise—including Gaussian noise, random sign perturbations, and gradient-aligned perturbations—into the weight space, thereby eliminating the need for repeated training from scratch. This work is the first to systematically apply weight-space noise injection for rapid deep ensemble construction, employing simple averaging of predictions in CNN architectures. Experiments on CIFAR-10 and CIFAR-100 demonstrate that our method achieves test accuracy comparable to standard deep ensembles while reducing training time significantly. Moreover, it outperforms existing efficient ensemble approaches in both accuracy and efficiency, offering a practical trade-off between performance and computational cost.
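The three perturbation families named above can be illustrated on a toy weight vector. This is a minimal numpy sketch, not the paper's implementation: the variable names, the perturbation scale `sigma`, and the exact form of the sign and gradient-aligned perturbations are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a converged parent model's weights and a gradient
# estimate (hypothetical; the paper perturbs the weights of real CNNs).
w_parent = rng.standard_normal(8)
grad = rng.standard_normal(8)
sigma = 0.05  # perturbation scale (assumed hyperparameter)

# 1) Gaussian noise: add isotropic noise in weight space.
child_gauss = w_parent + sigma * rng.standard_normal(w_parent.shape)

# 2) Random sign perturbation (one plausible reading): perturb each
#    weight by a randomly signed fraction of its own magnitude.
signs = rng.choice([-1.0, 1.0], size=w_parent.shape)
child_sign = w_parent + sigma * signs * np.abs(w_parent)

# 3) Gradient-aligned perturbation: noise along the unit gradient direction.
child_grad = w_parent + sigma * grad / np.linalg.norm(grad)
```

Each child shares the parent's architecture and differs only in these perturbed weights, which is what makes constructing many members cheap.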
📝 Abstract
Neural network ensembling is a simple yet effective approach for enhancing generalization. The most common method involves independently training multiple neural networks initialized with different weights and then averaging their predictions during inference. However, this approach increases training time linearly with the number of ensemble members. To address this issue, we propose the novel "**Noisy Deep Ensemble**" method, which significantly reduces the training time required for neural network ensembles. In this method, a *parent model* is trained until convergence, and then the weights of the *parent model* are perturbed in various ways to construct multiple *child models*. This perturbation of the *parent model* weights facilitates the exploration of different local minima while greatly reducing the training time for each ensemble member. We evaluated our method with diverse CNN architectures on the CIFAR-10 and CIFAR-100 datasets, surpassing conventional efficient ensemble methods and achieving test accuracy comparable to standard ensembles. Code is available at https://github.com/TSTB-dev/NoisyDeepEnsemble
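The parent-to-children pipeline described above can be sketched end to end with a toy model. This is a hedged illustration only: the "model" here is a linear-softmax stand-in for a CNN, the number of children `M` and scale `sigma` are assumed hyperparameters, and the brief per-child fine-tuning that the method would perform after perturbation is omitted.

```python
import numpy as np

rng = np.random.default_rng(42)

def predict(w, x):
    """Toy 'model': softmax over a linear map (stand-in for a CNN)."""
    logits = x @ w
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Converged parent weights (toy: 4 input features -> 3 classes).
w_parent = rng.standard_normal((4, 3))
x = rng.standard_normal((5, 4))  # a small batch of inputs

# Construct M child models by perturbing the parent's weights.
# (In the actual method each child would also be briefly retrained.)
M, sigma = 4, 0.1
children = [w_parent + sigma * rng.standard_normal(w_parent.shape)
            for _ in range(M)]

# Ensemble inference: simple average of the children's predicted probabilities.
ensemble_probs = np.mean([predict(w, x) for w in children], axis=0)
```

Because every child starts from the same converged parent, the cost of adding a member is one cheap perturbation (plus short fine-tuning) rather than a full training run.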