Towards Robust Few-Shot Text Classification Using Transformer Architectures and Dual Loss Strategies

📅 2025-05-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address weak generalization in low-resource few-shot text classification, caused by ambiguous semantic boundaries and complex feature distributions, this paper proposes a unified training framework that integrates adaptive fine-tuning, contrastive learning, and multi-task regularization. For few-shot relation classification, it introduces a joint optimization mechanism that combines a contrastive loss with a regularization loss to mitigate overfitting on scarce samples and to expose why some semantic relation categories are intrinsically harder to classify than others. Evaluated on FewRel 2.0 under the 5-shot setting, the approach achieves significant accuracy gains with T5-small, DeBERTa-v3, and RoBERTa-base backbones, and shows superior robustness, generalization, and stability over existing baselines, particularly on relations with ambiguous boundaries and heterogeneous feature distributions.

📝 Abstract
Few-shot text classification has important application value in low-resource environments. This paper proposes a strategy that combines adaptive fine-tuning, contrastive learning, and regularization optimization to improve the classification performance of Transformer-based models. Experiments on the FewRel 2.0 dataset show that T5-small, DeBERTa-v3, and RoBERTa-base perform well on few-shot tasks; in the 5-shot setting in particular, they capture text features more effectively and improve classification accuracy. The experiments also reveal significant differences in classification difficulty across relation categories: some categories have fuzzy semantic boundaries or complex feature distributions, making it difficult for the standard cross-entropy loss to learn the discriminative information needed to separate them. Introducing a contrastive loss and a regularization loss enhances the model's generalization ability and effectively alleviates overfitting in few-shot settings. In addition, the results indicate that Transformer models with stronger self-attention mechanisms, or generative architectures, help improve the stability and accuracy of few-shot classification.
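The dual-loss idea described in the abstract (cross-entropy plus a contrastive term and a regularization term) can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact formulation: the SupCon-style contrastive loss, the L2 penalty, and the weights `lam_con` and `lam_reg` are assumptions for the sketch.

```python
import math

def cross_entropy(logits, label):
    # Softmax cross-entropy for a single example (numerically stabilized).
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]

def supervised_contrastive(embeddings, labels, tau=0.1):
    # SupCon-style loss: pull same-label embeddings together in cosine
    # space, push different-label embeddings apart.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    def unit(a):
        n = math.sqrt(dot(a, a)) or 1.0
        return [x / n for x in a]
    z = [unit(e) for e in embeddings]
    loss, n_terms = 0.0, 0
    for i, zi in enumerate(z):
        positives = [j for j in range(len(z)) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        denom = sum(math.exp(dot(zi, z[j]) / tau) for j in range(len(z)) if j != i)
        for j in positives:
            loss -= math.log(math.exp(dot(zi, z[j]) / tau) / denom)
            n_terms += 1
    return loss / max(n_terms, 1)

def l2_penalty(params):
    # Simple weight-decay-style regularizer.
    return sum(p * p for p in params)

def dual_loss(logits, label, embeddings, labels, params,
              lam_con=0.5, lam_reg=1e-4):
    # Combined objective: CE + weighted contrastive + weighted regularization.
    return (cross_entropy(logits, label)
            + lam_con * supervised_contrastive(embeddings, labels)
            + lam_reg * l2_penalty(params))
```

In practice each term would be computed over a batch of encoder outputs and the weights tuned per backbone; the point of the combined objective is that cross-entropy alone struggles on categories with fuzzy boundaries, while the contrastive term explicitly sharpens inter-class margins.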
Problem

Research questions and friction points this paper is trying to address.

Improving few-shot text classification with Transformer architectures and dual loss strategies
Addressing overfitting in few-shot learning via contrastive and regularization losses
Enhancing model generalization for low-resource text classification tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines adaptive fine-tuning and contrastive learning
Introduces contrastive and regularization loss enhancements
Utilizes Transformer models with self-attention mechanisms
Xu Han
Brown University
Yumeng Sun
Rochester Institute of Technology
Weiqiang Huang
Northeastern University
Hongye Zheng
The Chinese University of Hong Kong
Junliang Du
Shanghai Jiao Tong University