🤖 AI Summary
This work addresses text-based emotion detection across more than 30 predominantly low-resource languages. We introduce ML-Emo, the first multilingual fine-grained emotion benchmark supporting three tasks: monolingual classification, emotion intensity regression, and cross-lingual transfer, spanning six core emotion categories. Methodologically, we integrate multilingual pretrained models (e.g., XLM-R), adapter-based fine-tuning, cross-lingual knowledge distillation, and multi-task joint learning. Crucially, ML-Emo incorporates emotion intensity annotations in 11 languages and systematic cross-lingual generalization evaluation, establishing the broadest low-resource emotion evaluation framework to date. The task attracted over 700 participants; more than 200 teams made final submissions, yielding 93 system description papers. The complete dataset, baseline models, and top-performing solutions are publicly released under open-source licenses. This initiative significantly advances research in multilingual emotion understanding, particularly for under-resourced languages.
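The multi-label formulation described above (each of the six emotion categories is decided independently, so a text may carry zero, one, or several labels) can be sketched with per-class sigmoid thresholding. This is a minimal illustration, not the benchmark's actual model: the six emotion names and the 0.5 threshold are assumptions, since the section does not list them.

```python
import math

# Illustrative six-class label set; the actual category names are not
# specified in this section and may differ in the benchmark.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]

def sigmoid(x: float) -> float:
    """Map a raw classifier logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_emotions(logits, threshold=0.5):
    """Multi-label decoding: each emotion is thresholded independently,
    unlike single-label softmax, so any subset of labels is possible."""
    return [emo for emo, z in zip(EMOTIONS, logits) if sigmoid(z) >= threshold]

# Strong joy and surprise signals, everything else suppressed.
print(predict_emotions([-2.1, -3.0, -1.5, 2.4, -0.8, 1.1]))  # → ['joy', 'surprise']
```

In practice a multilingual encoder such as XLM-R would produce the six logits; the sigmoid-per-class head is what distinguishes this multi-label setup from ordinary single-label classification.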
📝 Abstract
We present our shared task on text-based emotion detection, covering more than 30 languages from seven distinct language families. These languages are predominantly low-resource and spoken across multiple continents. Each data instance is multi-labeled with six emotion classes, and additional datasets in 11 languages are annotated for emotion intensity. Participants were asked to predict labels in three tracks: (a) emotion labels in monolingual settings, (b) emotion intensity scores, and (c) emotion labels in cross-lingual settings. The task attracted over 700 participants. We received final submissions from more than 200 teams and 93 system description papers. We report baseline results, along with findings on the best-performing systems, the most common approaches, and the most effective methods across tracks and languages. The datasets for this task are publicly available.
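For the intensity track (b), systems output a real-valued score per emotion rather than a binary label. A common way to score such predictions against gold annotations is Pearson correlation; the abstract does not state the official metric, so treat this as an assumed, illustrative evaluation sketch.

```python
def pearson_r(pred, gold):
    """Pearson correlation between predicted and gold intensity scores.

    Assumes both lists are the same length and neither is constant
    (a constant vector would make the denominator zero).
    """
    n = len(pred)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    std_p = sum((p - mean_p) ** 2 for p in pred) ** 0.5
    std_g = sum((g - mean_g) ** 2 for g in gold) ** 0.5
    return cov / (std_p * std_g)

# Example: predictions that track the gold intensities closely
# yield a correlation near 1.0.
print(pearson_r([0.2, 0.5, 0.9, 0.1], [0.3, 0.4, 1.0, 0.0]))
```

Correlation-based metrics reward systems for ranking instances by intensity correctly even when their absolute scores are shifted or rescaled, which suits regression-style annotation.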