🤖 AI Summary
This work addresses the challenge of accurately predicting deep, rare classes in hierarchical multi-label classification, where such categories suffer from intrinsically low frequency that is further diminished by hierarchical propagation. To this end, the authors propose a novel loss function that explicitly focuses on rare nodes rather than rare samples, integrating node-level class imbalance weighting with a focal weighting mechanism grounded in ensemble-based uncertainty quantification. This approach dynamically adjusts training emphasis based on model uncertainty and integrates seamlessly into mainstream neural architectures such as CNNs. Extensive experiments demonstrate substantial improvements, with recall gains of up to fivefold and significantly higher F₁ scores compared to baseline methods. Notably, the proposed method remains robust even under challenging conditions, such as suboptimal encoders or scarce-data regimes.
📝 Abstract
In hierarchical multi-label classification, a persistent challenge is enabling model predictions to reach deeper levels of the hierarchy for more detailed, fine-grained classifications. This difficulty partly arises from the natural rarity of certain classes (or hierarchical nodes) and from the hierarchical constraint that child nodes are almost always less frequent than their parents. To address this, we propose a weighted loss objective for neural networks that combines node-wise imbalance weighting with a focal weighting component that leverages modern ensemble-based uncertainty quantification. By emphasizing rare nodes rather than rare observations (data points), and by focusing training on the nodes about which each model's output distribution is uncertain, we observe improvements in recall by up to a factor of five on benchmark datasets, along with statistically significant gains in $F_{1}$ score. We also show that our approach aids convolutional networks on challenging tasks, such as those involving suboptimal encoders or limited data.
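The loss described above (node-wise imbalance weighting combined with an uncertainty-based focal factor) could be sketched roughly as follows. The exact functional form is not given here, so the inverse-frequency imbalance weight, the ensemble standard deviation as the uncertainty measure, and the multiplicative focal factor $u_j^{\gamma}$ are all illustrative assumptions, not the paper's definitive formulation:

```python
import math

def node_weights(node_frequencies):
    """Per-node imbalance weights; inverse frequency is an assumed choice."""
    return [1.0 / max(f, 1e-8) for f in node_frequencies]

def ensemble_uncertainty(member_probs):
    """Per-node standard deviation across ensemble members' predicted
    probabilities, used here as the uncertainty measure (an assumption)."""
    n = len(member_probs)
    num_nodes = len(member_probs[0])
    stds = []
    for j in range(num_nodes):
        vals = [m[j] for m in member_probs]
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / n
        stds.append(math.sqrt(var))
    return stds

def rare_node_focal_loss(member_probs, targets, node_frequencies,
                         gamma=2.0, eps=1e-8):
    """Node-wise binary cross-entropy, scaled by an inverse-frequency
    weight (emphasizing rare nodes) and an uncertainty-based focal
    factor u_j ** gamma (emphasizing nodes the ensemble disagrees on).

    member_probs: list of per-member probability vectors, one per node.
    targets:      binary labels per hierarchy node.
    """
    w = node_weights(node_frequencies)
    u = ensemble_uncertainty(member_probs)
    n = len(member_probs)
    num_nodes = len(targets)
    # Mean ensemble prediction per node.
    mean_p = [sum(m[j] for m in member_probs) / n for j in range(num_nodes)]
    total = 0.0
    for j in range(num_nodes):
        p, y = mean_p[j], targets[j]
        bce = -(y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps))
        total += w[j] * (u[j] ** gamma) * bce
    return total / num_nodes
```

In this sketch a rare node (low frequency, hence large weight) on which ensemble members disagree dominates the loss, while nodes the ensemble already agrees on contribute little, regardless of how confident or wrong the mean prediction is; that division of emphasis is what "focusing on uncertain nodes rather than rare observations" suggests.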