🤖 AI Summary
Does fine-grained training inherently improve performance in hierarchical label classification? This paper systematically investigates this question, demonstrating that the efficacy of fine-grained training depends critically on the alignment between the data's geometric structure and the label hierarchy, not merely on the granularity of annotations. Using both real-world and synthetic hierarchical datasets, the study conducts extensive classification experiments and geometric analyses of embedding spaces (e.g., intra-class/inter-class distances, hierarchical consistency metrics). Results reveal that data scale and model capacity act as key moderators: under limited data or low-capacity models, fine-grained training often induces overfitting or hierarchical confusion, degrading top-level generalization. Crucially, this work provides the first formal characterization of the *effective boundary* of fine-grained training, establishing theoretical conditions under which it benefits hierarchical classification. The findings offer principled guidance for annotation strategy design and model architecture selection in hierarchical classification tasks.
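The geometric analyses mentioned above rely on quantities like intra-class and inter-class distances in the embedding space. The paper's exact metrics are not specified here, so the following is a minimal, generic sketch of one common formulation (mean distance of points to their class centroid vs. mean pairwise distance between centroids); the function name and the ratio interpretation are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def intra_inter_distances(embeddings, labels):
    """Mean intra-class spread vs. mean inter-class centroid distance.

    A small intra/inter ratio indicates classes that are geometrically
    well separated in the embedding space. This is a generic sketch,
    not the paper's exact metric.
    """
    classes = np.unique(labels)
    centroids = np.array([embeddings[labels == c].mean(axis=0) for c in classes])

    # Intra-class: average distance of points to their own class centroid.
    intra = np.mean([
        np.linalg.norm(embeddings[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])

    # Inter-class: average pairwise distance between class centroids.
    diffs = centroids[:, None, :] - centroids[None, :, :]
    pairwise = np.linalg.norm(diffs, axis=-1)
    inter = pairwise[np.triu_indices(len(classes), k=1)].mean()
    return intra, inter

# Toy example: two tight, well-separated 2-D clusters,
# so intra-class spread should be far smaller than inter-class distance.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(5.0, 0.1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
intra, inter = intra_inter_distances(X, y)
```

Under a hierarchy, the same computation can be run at both the coarse and fine label levels; comparing the two ratios gives a rough sense of how well the data geometry aligns with the hierarchy.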
📝 Abstract
In classification problems, models must predict a class label from the input data features. In many datasets, however, class labels are organized hierarchically. While a classification task is often defined at a specific level of this hierarchy, training can use labels at a finer granularity. Empirical evidence suggests that such fine-grained training can enhance performance. In this work, we investigate the generality of this observation and explore its underlying causes using both real and synthetic datasets. We show that training on fine-grained labels does not universally improve classification accuracy. Instead, the effectiveness of this strategy depends critically on the geometric structure of the data and its relation to the label hierarchy. Additionally, factors such as dataset size and model capacity significantly influence whether fine-grained labels provide a performance benefit.