🤖 AI Summary
This study addresses the scarcity of annotated resources for Sinhala sentiment analysis and the biases inherent in user reviews by introducing GeeSanBhava, the first high-quality, manually annotated sentiment dataset of Sinhala YouTube music comments. The comments are labeled under Russell's valence-arousal model by three independent annotators (Fleiss' κ = 0.8496). Methodologically, the study compares the emotions expressed in comments with the emotions of the songs themselves. Models pre-trained on a related corpus of Sinhala news comments are evaluated zero-shot on the new dataset; the best of them, a hyperparameter-optimized three-layer MLP (256–128–64), achieves a ROC-AUC of 0.887. Key contributions include: (1) a benchmark dataset for Sinhala music sentiment analysis; (2) an empirical comparison of comment sentiment and song emotion; and (3) a basis for zero-shot transfer learning and bias mitigation research.
📝 Abstract
This study introduces GeeSanBhava, a high-quality dataset of Sinhala song comments extracted from YouTube and manually tagged using Russell's Valence-Arousal model by three independent human annotators. The annotators achieved substantial inter-annotator agreement (Fleiss' kappa = 0.8496). The analysis revealed distinct emotional profiles for different songs, highlighting the importance of comment-based emotion mapping. The study also addressed the challenges of comparing comment-based and song-based emotions, mitigating biases inherent in user-generated content. Several machine learning and deep learning models were pre-trained on a related large dataset of Sinhala News comments in order to report zero-shot results on our Sinhala YouTube comment dataset. An optimized Multi-Layer Perceptron model, after extensive hyperparameter tuning, achieved a ROC-AUC score of 0.887. The model is a three-layer MLP with 256, 128, and 64 neurons. This research contributes a valuable annotated dataset and provides insights for future work in Sinhala Natural Language Processing and music emotion recognition.
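For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Fleiss' kappa can be computed for a fixed number of raters per item. It is not the authors' code: the function name `fleiss_kappa`, the quadrant labels `Q1` to `Q4`, and the toy ratings are all assumptions made for illustration, and the abstract does not say which label set or tool was actually used.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each labeled by the same number of raters.

    ratings: list of per-item label lists, e.g. [["Q1", "Q1", "Q2"], ...]
    categories: the full set of labels raters could choose from
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(r) for r in ratings]

    # Observed agreement: for each item, the share of rater pairs that agree.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: based on how often each category was used overall.
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(s * s for s in shares)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")

    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: four comments, three annotators, four quadrant labels.
QUADRANTS = ["Q1", "Q2", "Q3", "Q4"]
toy_ratings = [
    ["Q1", "Q1", "Q1"],
    ["Q2", "Q2", "Q1"],  # one annotator disagrees on this comment
    ["Q3", "Q3", "Q3"],
    ["Q4", "Q4", "Q4"],
]
print(round(fleiss_kappa(toy_ratings, QUADRANTS), 4))
```

Kappa is 1 when all annotators agree on every item and falls toward 0 as agreement approaches what chance alone would produce, which is why it is reported on a 0 to 1 scale rather than as a percentage.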