🤖 AI Summary
To address data scarcity and class imbalance in sentiment classification, this paper systematically evaluates three large language model (LLM)-based data augmentation paradigms—back-translation, rewriting, and zero-/few-shot generation—using GPT-series models. Experimental results demonstrate that LLM-driven back-translation and rewriting consistently outperform purely generative augmentation across most settings, matching or exceeding few-shot generation while incurring lower computational cost and offering greater controllability. The key contribution lies in showing that lightweight, prompt-guided text transformation—rather than end-to-end generation—suffices to substantially improve the robustness and generalization of supervised classifiers. This insight yields a reproducible, cost-effective augmentation strategy for low-resource sentiment analysis, balancing practical deployability and performance in resource-constrained scenarios.
📝 Abstract
Numerous domain-specific machine learning tasks struggle with data scarcity and class imbalance. This paper systematically explores data augmentation methods for NLP, particularly those based on large language models such as GPT. Its purpose is to examine whether traditional methods such as paraphrasing and back-translation, when powered by a new generation of models, can achieve performance comparable to purely generative methods. We selected augmentation methods that address data scarcity using ChatGPT, along with a representative dataset, and conducted a series of experiments comparing four approaches to data augmentation across multiple experimental setups. We then evaluated the results both in terms of the quality of the generated data and its impact on classification performance. The key findings indicate that back-translation and paraphrasing can yield results comparable to, or even better than, zero- and few-shot generation of examples.
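To make the back-translation paradigm concrete, the sketch below shows the label-preserving augmentation loop it implies: each text is translated into a pivot language and back, and the augmented copy keeps the original label. The helper `llm_translate` is a hypothetical stand-in for a prompt-guided GPT-series call (not code from the paper); here it merely echoes its input so the pipeline structure is runnable.

```python
def llm_translate(text: str, target_lang: str) -> str:
    """Placeholder for a prompt-guided LLM translation call, e.g. a chat
    completion with a 'Translate the following text to {target_lang}' prompt.
    This stub only tags the text so the sketch runs without an API key."""
    return f"[{target_lang}] {text}"


def back_translate(text: str, pivot_lang: str = "de") -> str:
    """Augment one example: source -> pivot language -> back to English."""
    pivoted = llm_translate(text, pivot_lang)
    return llm_translate(pivoted, "en")


def augment_dataset(texts: list[str], labels: list[str], pivot_lang: str = "de"):
    """Back-translation preserves meaning, so each augmented text
    inherits the label of its source example."""
    augmented = [back_translate(t, pivot_lang) for t in texts]
    return texts + augmented, labels + labels
```

In a real setup, `llm_translate` would issue the model call, and the doubled dataset would be fed to the supervised classifier; a paraphrasing variant only swaps the translation prompt for a rewriting prompt.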