🤖 AI Summary
Current large language models (LLMs) underperform specialized fine-tuned models on aspect-based sentiment analysis (ABSA), and the potential of LLMs fine-tuned for this fine-grained task has not been systematically explored. This work presents the first systematic investigation of full-parameter fine-tuning of open-source LLaMA-based models, specifically Orca 2, for ABSA, evaluated across eight English benchmark datasets and four core ABSA subtasks (aspect term extraction, aspect category detection, opinion term extraction, and sentiment polarity classification). Results show that the fine-tuned Orca 2 surpasses previous state-of-the-art results on all four subtasks. However, all evaluated models perform markedly worse in zero-shot and few-shot settings than when fully fine-tuned. Error analysis identifies key bottlenecks in aspect boundary identification, sentiment polarity disambiguation, and cross-subtask consistency. The study provides empirical evidence and a reproducible technical pipeline for adapting foundation LLMs to fine-grained sentiment analysis.
📝 Abstract
While large language models (LLMs) show promise for various tasks, their performance in compound aspect-based sentiment analysis (ABSA) tasks lags behind that of fine-tuned models. However, the potential of LLMs fine-tuned for ABSA remains unexplored. This paper examines the capabilities of open-source LLMs fine-tuned for ABSA, focusing on LLaMA-based models. We evaluate the performance across four tasks and eight English datasets, finding that the fine-tuned Orca 2 model surpasses state-of-the-art results in all tasks. However, all models struggle in zero-shot and few-shot scenarios compared to fully fine-tuned ones. Additionally, we conduct error analysis to identify challenges faced by fine-tuned models.