🤖 AI Summary
To address overfitting and insufficient training diversity caused by fixed dropout rates in deep neural networks, this paper proposes Adaptive Tabu Dropout (ATD). ATD introduces tabu search principles into regularization: it dynamically maintains a tabu set of weights that suppresses updates to highly important parameters, adapts the dropout rate based on gradient sensitivity, and estimates parameter importance via a Hessian approximation, so that the dropout rate and parameter importance are optimized jointly. Experiments on CIFAR-10, CIFAR-100, and an ImageNet subset show that ATD consistently improves generalization accuracy by 1.2–2.7% over baselines such as standard dropout and DropBlock, while also improving training stability. The key contribution is the first integration of tabu search heuristics into stochastic regularization, enabling context-aware, importance-driven neuron deactivation during training.
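The mechanism summarized above can be sketched roughly as follows. This is an illustrative approximation, not the authors' implementation: the function name, the squared-gradient proxy for Hessian-based importance, the fixed tabu fraction, and the linear sensitivity scaling are all assumptions made for the sketch.

```python
import numpy as np

def adaptive_tabu_dropout_mask(grads, base_rate=0.5, tabu_frac=0.2, rng=None):
    """Illustrative sketch of an ATD-style mask (not the paper's code).

    - importance ~ squared gradients (a common diagonal Hessian/Fisher proxy)
    - the top `tabu_frac` most important units form the tabu set and are
      never dropped this step
    - remaining units get a drop probability scaled down by their
      normalized gradient sensitivity
    """
    rng = np.random.default_rng() if rng is None else rng
    g = np.asarray(grads, dtype=float)
    importance = g ** 2                        # Hessian-approximation proxy
    n = g.size
    k = max(1, int(tabu_frac * n))
    tabu = np.argsort(importance)[-k:]         # indices of most important units
    sens = np.abs(g) / (np.abs(g).max() + 1e-12)
    drop_p = base_rate * (1.0 - sens)          # sensitive units dropped less often
    mask = (rng.random(n) >= drop_p).astype(float)
    mask[tabu] = 1.0                           # tabu units are always kept
    return mask
```

In this sketch, the tabu set plays the role the summary attributes to it: high-importance parameters are shielded from deactivation, while the adaptive drop probability injects the training diversity that a fixed rate lacks.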