Multilingual Sentiment Analysis of Summarized Texts: A Cross-Language Study of Text Shortening Effects

📅 2025-03-31
🤖 AI Summary
This study identifies a systematic effect of text summarization on multilingual sentiment analysis, comparing extractive and abstractive approaches. Summarization induces substantial sentiment distortion (a 12–18% accuracy drop) in morphologically complex languages such as Finnish, Hungarian, and Arabic. The evaluation covers eight languages using mBERT, XLM-RoBERTa, T5, BART, and the language-specific models FinBERT and AraBERT, and it provides the first empirical characterization of cross-lingual sentiment distortion induced by summarization. Methodologically, the study runs controlled comparative experiments that quantify sentiment shift under different summarization strategies. The work makes two key contributions: (i) it empirically validates that extractive summarization preserves sentiment more faithfully than abstractive methods in morphologically rich languages; and (ii) it proposes a hybrid summarization approach that jointly optimizes readability and sentiment consistency. The results underscore the need for morphological adaptation in multilingual sentiment analysis and offer both theoretical grounding and practical methodology for joint summarization–sentiment modeling in low-resource and highly inflected languages.

📝 Abstract
Summarization significantly impacts sentiment analysis across languages with diverse morphologies. This study examines the effects of extractive and abstractive summarization on sentiment classification in English, German, French, Spanish, Italian, Finnish, Hungarian, and Arabic. We assess sentiment shifts after summarization using multilingual transformers (mBERT, XLM-RoBERTa, T5, and BART) and language-specific models (FinBERT, AraBERT). Results show that extractive summarization preserves sentiment better, especially in morphologically complex languages, while abstractive summarization improves readability but introduces sentiment distortion that lowers classification accuracy. Languages with rich inflectional morphology, such as Finnish, Hungarian, and Arabic, experience greater accuracy drops than English or German. The findings emphasize the need for language-specific adaptations in sentiment analysis, and we propose a hybrid summarization approach that balances readability and sentiment preservation. These insights benefit multilingual sentiment applications, including social media monitoring, market analysis, and cross-lingual opinion mining.
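
The sentiment shift described in the abstract can be illustrated with a small sketch. It assumes that a classifier has already produced one label per document for both the full text and its summary; the function names `accuracy` and `sentiment_shift`, the label strings, and the "flip rate" measure are illustrative assumptions, not the paper's own evaluation code.

```python
def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("at least one example is required")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def sentiment_shift(gold, pred_full, pred_summary):
    """Compare classifier output on full texts and on their summaries.

    Hypothetical helper: reports accuracy before and after summarization,
    the resulting accuracy drop, and the share of documents whose
    predicted label changed once the text was summarized.
    """
    acc_full = accuracy(pred_full, gold)
    acc_summary = accuracy(pred_summary, gold)
    flips = sum(a != b for a, b in zip(pred_full, pred_summary))
    return {
        "accuracy_full": acc_full,
        "accuracy_summary": acc_summary,
        "accuracy_drop": acc_full - acc_summary,
        "flip_rate": flips / len(gold),
    }


# Toy labels, invented purely for illustration (not data from the paper).
gold = ["pos", "neg", "neg", "pos"]
pred_full = ["pos", "neg", "neg", "neg"]
pred_summary = ["pos", "pos", "neg", "neg"]
report = sentiment_shift(gold, pred_full, pred_summary)
print(report)
```

Running the same comparison once per language and per summarization strategy would give the kind of per-language accuracy drops the abstract refers to.
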
Problem

Research questions and friction points this paper is trying to address.

Examining summarization effects on sentiment analysis across multiple languages
Comparing extractive and abstractive summarization impacts on sentiment accuracy
Proposing hybrid summarization for balancing readability and sentiment preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multilingual transformers (mBERT, XLM-RoBERTa, T5, BART) and language-specific models (FinBERT, AraBERT) for sentiment analysis
Compares extractive and abstractive summarization effects
Proposes hybrid summarization for sentiment preservation
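
The page does not describe how the hybrid approach is built, so the following is only one possible reading, sketched under stated assumptions: given several candidate summaries (for example one extractive and one abstractive), pick the one with the best weighted combination of sentiment agreement and readability. The function `select_summary`, the `weight` parameter, and the toy `toy_classify` and `toy_readability` stand-ins are all hypothetical and are not taken from the paper.

```python
def select_summary(candidates, original_label, classify, readability, weight=0.5):
    """Pick the candidate summary with the best combined score.

    Hypothetical scoring rule (an assumption, not the paper's method):
      score = weight * sentiment_preserved + (1 - weight) * readability
    where sentiment_preserved is 1.0 if the summary's predicted label
    matches the label of the original text, else 0.0, and readability
    is expected to return a value in [0, 1].
    """
    if not candidates:
        raise ValueError("at least one candidate summary is required")

    def score(summary):
        preserved = 1.0 if classify(summary) == original_label else 0.0
        return weight * preserved + (1.0 - weight) * readability(summary)

    return max(candidates, key=score)


# Toy stand-ins so the sketch runs without any model; a real setup would
# plug in a sentiment classifier and a readability measure instead.
def toy_classify(text):
    return "neg" if "not" in text.lower().split() else "pos"


def toy_readability(text):
    return 1.0 / len(text.split())  # shorter counts as more readable here


candidates = ["The phone is not good at all.", "Phone is good."]
balanced = select_summary(candidates, "neg", toy_classify, toy_readability)
readability_only = select_summary(
    candidates, "neg", toy_classify, toy_readability, weight=0.0
)
print(balanced)
print(readability_only)
```

With the default weight the longer candidate wins because it keeps the negative label, while a weight of 0.0 ignores sentiment and picks the shorter one, which shows the trade-off between readability and sentiment preservation that the hybrid idea is meant to balance.
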