🤖 AI Summary
Traditional early stopping operates uniformly at the global epoch level, ignoring inter-sample variations in learning progress, causing already-mastered samples to persistently undergo backpropagation and incur computational redundancy. This paper proposes an instance-level adaptive early stopping method, the first to refine stopping granularity to individual training samples. Leveraging second-order loss differences, we formulate a stable mastery criterion and dynamically determine, under a unified threshold, whether each sample has been sufficiently learned. Gradient masking is then applied to selectively prune backpropagation for mastered samples. The approach requires no auxiliary models or external supervision, significantly reducing computational overhead: it decreases the number of backpropagated samples by 10%-50% across multiple benchmark datasets, yielding substantial training speedup while preserving or marginally improving test accuracy and transfer performance.
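The mastery criterion above can be made concrete with a small sketch. This is our illustrative reading of the summary, not the paper's code: the function name, argument layout, and threshold `eps` are assumptions. Given the last three recorded loss values of one sample, it checks whether the second-order difference lies within a small band around zero.

```python
def is_mastered(losses, eps=1e-3):
    """Second-order-difference mastery check (illustrative sketch).

    losses: the last three recorded loss values for one sample,
    oldest first: [l_{t-2}, l_{t-1}, l_t].
    Returns True if the second-order difference
    l_t - 2*l_{t-1} + l_{t-2} lies within [-eps, eps].
    """
    l0, l1, l2 = losses
    return abs(l2 - 2 * l1 + l0) <= eps
```

For example, a sample whose loss has flattened out (e.g. three consecutive values of 0.5) passes the check, while one whose loss still oscillates (e.g. 1.0, 0.3, 0.9) does not.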
📖 Abstract
In machine learning practice, early stopping has been widely used to regularize models and can save computational costs by halting the training process when the model's performance on a validation set stops improving. However, conventional early stopping applies the same stopping criterion to all instances without considering their individual learning statuses, which leads to redundant computations on instances that are already well-learned. To further improve efficiency, we propose an Instance-dependent Early Stopping (IES) method that adapts the early stopping mechanism from the entire training set to the instance level, based on the core principle that once the model has mastered an instance, training on it should stop. IES considers an instance mastered if the second-order differences of its loss value remain within a small range around zero. This offers a more consistent measure of an instance's learning status than directly using the loss value, and thus allows a unified threshold to determine when an instance can be excluded from further backpropagation. We show that excluding mastered instances from backpropagation can increase the gradient norms, thereby accelerating the decrease of the training loss and speeding up the training process. Extensive experiments on benchmarks demonstrate that the IES method can reduce the number of backpropagated instances by 10%-50% while maintaining or even slightly improving the test accuracy and transfer learning performance of the model.
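The per-instance mechanism described in the abstract can be sketched end to end. This is a minimal illustration under our own assumptions (the class name, `eps`, and the `patience` count of consecutive in-range checks are our choices; the paper's actual thresholding may differ): a tracker records each instance's recent losses, flags it as mastered once the second-order difference of its loss stays near zero for several consecutive epochs, and a helper then averages the loss only over non-mastered instances so that backpropagation skips the rest.

```python
from collections import deque

class InstanceEarlyStopper:
    """Per-instance mastery tracker (illustrative sketch, not the paper's code).

    An instance is flagged as mastered once the second-order difference of
    its loss, d2 = l_t - 2*l_{t-1} + l_{t-2}, stays within [-eps, eps]
    for `patience` consecutive epochs.
    """

    def __init__(self, eps=1e-3, patience=3):
        self.eps = eps
        self.patience = patience
        self.history = {}  # instance id -> deque of its last 3 loss values
        self.streak = {}   # instance id -> consecutive in-range checks

    def update(self, idx, loss):
        """Record this epoch's loss for instance `idx`; return True once mastered."""
        h = self.history.setdefault(idx, deque(maxlen=3))
        h.append(loss)
        if len(h) < 3:
            return False  # need three points for a second-order difference
        d2 = h[2] - 2 * h[1] + h[0]
        if abs(d2) <= self.eps:
            self.streak[idx] = self.streak.get(idx, 0) + 1
        else:
            self.streak[idx] = 0
        return self.streak[idx] >= self.patience

def masked_mean_loss(per_sample_losses, mastered):
    """Mean loss over instances still in training; mastered instances are
    excluded, so no gradient would flow back through them."""
    kept = [l for l, m in zip(per_sample_losses, mastered) if not m]
    return sum(kept) / len(kept) if kept else 0.0
```

In a real training loop, the same effect is typically achieved by computing per-sample losses (e.g. with an unreduced loss) and zeroing or dropping the mastered ones before the backward pass; the helper above just makes the masking explicit on plain Python numbers.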