Backtranslation and paraphrasing in the LLM era? Comparing data augmentation methods for emotion classification

📅 2025-07-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address data scarcity and class imbalance in emotion classification, this paper systematically evaluates three large language model (LLM)-based data augmentation paradigms (backtranslation, paraphrasing, and zero-/few-shot generation) using GPT-series models. Experiments show that LLM-driven backtranslation and paraphrasing consistently match or exceed purely generative augmentation across most settings, while incurring lower computational cost and offering greater controllability. The key contribution is the finding that lightweight, prompt-guided transformation of existing examples, rather than end-to-end generation, suffices to significantly improve the robustness and generalization of supervised classifiers. This establishes a reproducible, cost-effective augmentation strategy for low-resource emotion classification, bridging the gap between practical deployability and performance in resource-constrained scenarios.

📝 Abstract
Numerous domain-specific machine learning tasks struggle with data scarcity and class imbalance. This paper systematically explores data augmentation methods for NLP, particularly through large language models like GPT. The purpose of this paper is to examine whether traditional methods such as paraphrasing and backtranslation can leverage a new generation of models to achieve performance comparable to purely generative methods. We selected methods that address data scarcity using ChatGPT, along with a representative dataset. We conducted a series of experiments comparing four different approaches to data augmentation in multiple experimental setups. We then evaluated the results both in terms of the quality of the generated data and its impact on classification performance. The key findings indicate that backtranslation and paraphrasing can yield comparable or even better results than zero- and few-shot generation of examples.
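The backtranslation approach compared in the abstract can be sketched as below. This is a minimal illustration, not the paper's implementation: `translate_fn` is a hypothetical stand-in for any translation call, for example a prompt sent to a GPT-series model asking it to translate the text.

```python
# Back-translation augmentation sketch: translate each example into a pivot
# language and back into English to obtain a label-preserving paraphrase.
# `translate_fn` is any callable (text, source, target) -> text.

def backtranslate(text: str, translate_fn, pivot: str = "de") -> str:
    """Translate text into a pivot language and back to get a paraphrase."""
    pivoted = translate_fn(text, source="en", target=pivot)
    return translate_fn(pivoted, source=pivot, target="en")

def augment(dataset, translate_fn, pivots=("de", "fr")):
    """Return the original (text, label) pairs plus one
    back-translated copy per pivot language, with labels unchanged."""
    augmented = list(dataset)
    for text, label in dataset:
        for pivot in pivots:
            augmented.append((backtranslate(text, translate_fn, pivot), label))
    return augmented
```

With two pivot languages, each labeled example yields two additional paraphrased copies, so a dataset of N examples grows to 3N without any purely generative step.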
Problem

Research questions and friction points this paper is trying to address.

Addressing data scarcity in emotion classification tasks
Comparing traditional vs. generative data augmentation methods
Evaluating LLM-enhanced backtranslation and paraphrasing performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizing ChatGPT for data augmentation
Comparing backtranslation and paraphrasing methods
Evaluating impact on classification performance
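The zero-/few-shot generation baseline that the paper compares against can be sketched as a prompt-construction step. The prompt wording below is illustrative only, not the authors' actual prompt, and `build_fewshot_prompt` is a hypothetical helper name.

```python
# Sketch of a few-shot generation prompt for synthesizing new labeled
# examples for an emotion class (illustrative wording, not the paper's prompt).

def build_fewshot_prompt(label: str, examples: list, n_new: int = 3) -> str:
    """Assemble a prompt asking an LLM to generate n_new new texts
    expressing the emotion `label`, seeded with labeled examples."""
    shots = "\n".join(f"- {e}" for e in examples)
    return (
        f"Here are example sentences expressing the emotion '{label}':\n"
        f"{shots}\n"
        f"Write {n_new} new, distinct sentences expressing the same emotion, "
        f"one per line."
    )
```

The resulting string would be sent to the LLM, and each returned line paired with `label` as a synthetic training example; passing an empty example list degrades this to the zero-shot variant.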
Łukasz Radliński
Department of Artificial Intelligence, Wrocław University of Science and Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland
Mateusz Guściora
Department of Artificial Intelligence, Wrocław University of Science and Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland
Jan Kocoń
Department of Artificial Intelligence, Wrocław University of Science and Technology
Artificial Intelligence · Natural Language Processing · Large Language Models · Transformers · Personalized NLP