🤖 AI Summary
This work addresses the limited robustness of large language models (LLMs) in sentiment analysis, particularly toward pragmatic phenomena such as sarcasm, emojis, and fragmented language, and their poor generalization to domain-specific corpora (e.g., nuclear energy). We propose a synergistic optimization framework integrating text paraphrasing, sarcasm detection and removal, adversarial augmentation, and domain-adaptive fine-tuning. We construct a high-quality, human-labeled dataset of 5,929 tweets covering varied sarcasm contexts. We introduce a joint strategy of sarcasm removal plus fine-tuning on general-domain tweets, and empirically validate the critical contribution of general-domain corpora to sarcasm comprehension. Experiments show that sarcasm removal improves sentiment accuracy by up to 21 percentage points (from 30% to 51%) for models fine-tuned on nuclear-power content; fine-tuning on general tweet data instead reaches 60% sentiment accuracy on sarcastic tweets; adversarial augmentation raises accuracy on sarcastic tweets to approximately 85%; and paraphrasing converts about 40% of low-confidence predictions into high-confidence ones, boosting overall sentiment accuracy by 6%.
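The confidence-gated paraphrasing step can be sketched as follows. This is a minimal illustration, not the paper's implementation: `classify_sentiment` and `paraphrase` are toy stand-ins for the fine-tuned LLM classifier and the LLM paraphraser, whose actual prompts and models are not shown here, and the 0.7 threshold is a hypothetical cutoff.

```python
CONFIDENCE_THRESHOLD = 0.7  # hypothetical cutoff for "low-confidence" labels

def classify_sentiment(text):
    """Toy stand-in for the LLM sentiment classifier: returns (label, confidence).
    Here, heavily fragmented text (many very short tokens) yields low confidence."""
    tokens = text.split()
    fragmented = sum(len(t) <= 2 for t in tokens) / max(len(tokens), 1)
    confidence = 1.0 - fragmented
    label = "positive" if "good" in text.lower() else "negative"
    return label, confidence

def paraphrase(text):
    """Toy stand-in for the LLM paraphraser that rewrites a fragmented tweet
    into a fluent sentence (the paper's actual prompt is an assumption here)."""
    return "This is a rephrased, fluent version of: " + text

def classify_with_paraphrase_fallback(text):
    """Classify; if the label is low-confidence, paraphrase and reclassify."""
    label, conf = classify_sentiment(text)
    if conf < CONFIDENCE_THRESHOLD:
        label, conf = classify_sentiment(paraphrase(text))
    return label, conf
```

The design point is that paraphrasing is applied only where the classifier is unsure, which is how roughly 40% of low-confidence labels can be upgraded without touching the rest of the data.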
📝 Abstract
Large Language Models (LLMs) have demonstrated impressive performance across various tasks, including sentiment analysis. However, data quality, particularly when sourced from social media, can significantly impact their accuracy. This research explores how textual nuances, including emojis and sarcasm, affect sentiment analysis, with a particular focus on improving data quality through text paraphrasing techniques. To address the lack of labeled sarcasm data, the authors created a human-labeled dataset of 5,929 tweets, enabling the assessment of LLMs across various sarcasm contexts. The results show that when topic-specific datasets, such as those related to nuclear power, are used to fine-tune LLMs, the resulting models cannot accurately comprehend sentiment in the presence of sarcasm because of the less diverse text, requiring external interventions such as sarcasm removal to boost model accuracy. Sarcasm removal led to up to a 21-percentage-point improvement in sentiment accuracy, as LLMs trained on nuclear-power-related content struggled with sarcastic tweets, achieving only 30% accuracy. In contrast, LLMs trained on general tweet datasets, covering a broader range of topics, showed considerable improvements in predicting sentiment for sarcastic tweets (60% accuracy), indicating that incorporating general text data can enhance sarcasm comprehension. The study also utilized adversarial text augmentation, showing that creating synthetic text variants through minor edits significantly increased model robustness and accuracy on sarcastic tweets (approximately 85%). Additionally, paraphrasing tweets with fragmented language transformed around 40% of low-confidence labels into high-confidence ones, improving the LLMs' sentiment analysis accuracy by 6%.
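The adversarial augmentation described above, creating synthetic variants of each tweet via minor edits, can be sketched as below. The specific edit operations (swap, drop, duplicate a character) and the variant count are illustrative assumptions; the paper's exact perturbation recipe is not reproduced here.

```python
import random

def perturb(text, rng, n_edits=1):
    """Create a synthetic variant of `text` by applying small character-level
    edits: swap two adjacent characters, drop one, or duplicate one."""
    chars = list(text)
    for _ in range(n_edits):
        i = rng.randrange(len(chars))
        op = rng.choice(["swap", "drop", "dup"])
        if op == "swap" and i < len(chars) - 1:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        elif op == "drop" and len(chars) > 1:
            del chars[i]
        else:  # duplicate (also the fallback when swap/drop are not applicable)
            chars.insert(i, chars[i])
    return "".join(chars)

def augment(labeled_tweets, variants_per_tweet=3, seed=0):
    """Expand a labeled training set with perturbed copies.
    Each variant inherits the label of the original tweet."""
    rng = random.Random(seed)
    out = []
    for text, label in labeled_tweets:
        out.append((text, label))
        for _ in range(variants_per_tweet):
            out.append((perturb(text, rng), label))
    return out
```

Training on such variants exposes the model to the typo-like noise common in tweets, which is the mechanism behind the robustness gain on sarcastic inputs reported above.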