🤖 AI Summary
This work addresses the limitation of conventional ensemble methods in severely imbalanced text classification, where predictions are often biased toward majority classes at the expense of minority class performance. To mitigate this issue, the authors propose CAMO, a novel ensemble framework that uniquely integrates class-awareness and minority-class optimization into ensemble learning. CAMO employs a hierarchical pipeline that combines voting distribution analysis, confidence calibration, and inter-model uncertainty modeling to dynamically reweight predictions in favor of minority classes. The framework is compatible with both large language models (LLMs) and small language models (SLMs) and synergizes effectively with model adaptation strategies. Evaluated on the highly imbalanced DIAR-AI/Emotion and BEA 2025 datasets under fine-tuning settings, CAMO consistently achieves state-of-the-art strict macro F1 scores, significantly outperforming existing approaches and establishing a new benchmark.
📝 Abstract
Class imbalance severely hampers real-world classification: traditional ensembles favor majority classes, which lowers minority-class performance and the overall F1-score. We propose CAMO (Class-Aware Minority-Optimized), a novel ensemble technique for imbalanced problems. Through a hierarchical procedure that incorporates vote distributions, confidence calibration, and inter-model uncertainty, CAMO dynamically boosts underrepresented classes while preserving and amplifying minority predictions. We validate CAMO on two highly imbalanced, domain-specific benchmarks: the DIAR-AI/Emotion dataset and the ternary BEA 2025 dataset. We benchmark against seven established ensemble algorithms using eight different language models (three LLMs and five SLMs) under zero-shot and fine-tuned settings. With fine-tuned models, CAMO consistently achieves the highest strict macro F1-score, setting a new benchmark. Its benefit works in concert with model adaptation, showing that the best ensemble choice depends on model properties. This demonstrates that CAMO is a reliable, domain-neutral framework for imbalanced classification.
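To make the motivating claim concrete, the sketch below shows how a plain majority vote can erase minority predictions and drag down macro F1 even when accuracy looks acceptable. It is an illustration only, not CAMO itself: the toy labels, the three hypothetical model outputs, and the helper names `macro_f1` and `majority_vote` are all assumptions, and the metric is ordinary macro F1 rather than the strict variant reported in the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def majority_vote(votes):
    """Return the most frequent label among the models' votes."""
    return Counter(votes).most_common(1)[0][0]


# Toy data (hypothetical): 8 majority-class samples, 2 minority-class samples.
y_true = ["maj"] * 8 + ["min"] * 2

# Two models always predict the majority class; one model is fully correct.
model_a = ["maj"] * 10
model_b = ["maj"] * 10
model_c = list(y_true)

# Per-sample majority vote across the three models.
ensemble = [majority_vote(v) for v in zip(model_a, model_b, model_c)]

accuracy = sum(t == p for t, p in zip(y_true, ensemble)) / len(y_true)
ensemble_f1 = macro_f1(y_true, ensemble)
best_single_f1 = macro_f1(y_true, model_c)

print(f"ensemble accuracy: {accuracy:.2f}")
print(f"ensemble macro F1: {ensemble_f1:.3f}")
print(f"model_c macro F1:  {best_single_f1:.3f}")
```

In this toy setup the two majority-biased models outvote the one model that recognizes the minority class, so the ensemble never predicts `min` and its macro F1 falls well below that of the best single model. Class-aware reweighting of the kind the abstract describes targets this failure mode.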