Extending Czech Aspect-Based Sentiment Analysis with Opinion Terms: Dataset and LLM Benchmarks

📅 2026-02-26
🤖 AI Summary
This study addresses the scarcity of aspect-based sentiment analysis (ABSA) resources for low-resource languages such as Czech, in particular the lack of datasets annotated with opinion terms and of effective cross-lingual transfer methods. The authors construct a new Czech ABSA dataset in the restaurant domain that includes opinion-term annotations and supports three ABSA tasks of varying complexity. They evaluate a range of Transformer-based models, including large language models (LLMs), in monolingual, cross-lingual, and multilingual settings, and propose an LLM-based translation and label alignment method to improve cross-lingual transfer. The proposed method yields consistent improvements in the cross-lingual experiments, while the error analysis shows that current models still struggle with subtle opinion terms and nuanced sentiment expressions. The dataset serves as a new benchmark for Czech ABSA.

📝 Abstract
This paper introduces a novel Czech dataset in the restaurant domain for aspect-based sentiment analysis (ABSA), enriched with annotations of opinion terms. The dataset supports three distinct ABSA tasks involving opinion terms, accommodating varying levels of complexity. Leveraging this dataset, we conduct extensive experiments using modern Transformer-based models, including large language models (LLMs), in monolingual, cross-lingual, and multilingual settings. To address cross-lingual challenges, we propose a translation and label alignment methodology leveraging LLMs, which yields consistent improvements. Our results highlight the strengths and limitations of state-of-the-art models, especially when handling the linguistic intricacies of low-resource languages like Czech. A detailed error analysis reveals key challenges, including the detection of subtle opinion terms and nuanced sentiment expressions. The dataset establishes a new benchmark for Czech ABSA, and our proposed translation-alignment approach offers a scalable solution for adapting ABSA resources to other low-resource languages.
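
The abstract does not describe how the label alignment step works, so the following is only an illustrative sketch and not the authors' method. It assumes that, after an LLM translates a sentence together with its labels, each translated aspect term and opinion term should still occur verbatim in the translated sentence, and it discards labels that do not. The names `Triplet`, `find_span`, `is_aligned`, and `filter_aligned`, as well as the example sentence and labels, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Triplet:
    """One hypothetical ABSA label: aspect term, opinion term, polarity."""
    aspect: str
    opinion: str
    polarity: str


def find_span(text: str, term: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the first case-insensitive match, or None.

    Assumes lowercasing does not change string length, which holds for
    ordinary Czech and English text.
    """
    idx = text.lower().find(term.lower())
    return None if idx < 0 else (idx, idx + len(term))


def is_aligned(sentence: str, triplet: Triplet) -> bool:
    """A label counts as aligned if both of its terms occur in the sentence."""
    return (
        find_span(sentence, triplet.aspect) is not None
        and find_span(sentence, triplet.opinion) is not None
    )


def filter_aligned(sentence: str, triplets: List[Triplet]) -> List[Triplet]:
    """Keep only the labels whose terms can be located in the sentence."""
    return [t for t in triplets if is_aligned(sentence, t)]


# Made-up example: a translated Czech sentence with translated labels.
translated = "Polévka byla vynikající, ale obsluha byla pomalá."
labels = [
    Triplet("Polévka", "vynikající", "positive"),
    Triplet("obsluha", "pomalá", "negative"),
    # Assumed failure case: the label was translated with different wording
    # than the sentence, so its terms cannot be found in the text.
    Triplet("servis", "pomalý", "negative"),
]

kept = filter_aligned(translated, labels)
```

In this sketch, exact substring matching is the simplest possible alignment check. A real pipeline would likely need to handle inflection, which is frequent in Czech, for example by asking the LLM to re-extract the term from the translated sentence.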
Problem

Research questions and friction points this paper is trying to address.

aspect-based sentiment analysis
opinion terms
low-resource languages
Czech
cross-lingual challenges
Innovation

Methods, ideas, or system contributions that make the work stand out.

aspect-based sentiment analysis
opinion term annotation
low-resource language
large language models
translation-label alignment
Jakub Šmíd
PhD student at University of West Bohemia
Machine Learning, Natural Language Processing, Sentiment Analysis
Pavel Přibáň
Sentisquare, University of West Bohemia
NLP, machine learning
Pavel Král
Department of Computer Science and Engineering, NTIS – New Technologies for the Information Society, University of West Bohemia in Pilsen, Faculty of Applied Sciences