AI Summary
To address label scarcity, heterogeneous task difficulty, and insufficient cross-task synergy in low-resource multimodal sentiment and intent recognition, this paper proposes a task-difficulty-aware interactive joint modeling framework. It uses pseudo-labeling to select high-confidence unlabeled samples, alleviating the data bottleneck, and introduces a multi-head attention mechanism atop a shared encoder to explicitly model bidirectional guidance between the relatively easier intent recognition task and the more challenging sentiment analysis task, enabling difficulty-driven feature interaction and knowledge transfer. This work is the first to incorporate explicit task-difficulty modeling into multimodal multi-task learning, significantly enhancing collaborative learning efficacy. Evaluated on the ICASSP MEIJU@2025 Track I test set, the framework achieves a score of 0.5532, securing first place in the competition.
Abstract
This paper presents the first-place solution for ICASSP MEIJU@2025 Track I, which focuses on low-resource multimodal emotion and intention recognition. Two points are key to the competition: how to make effective use of a large amount of unlabeled data, and how to ensure that tasks of different difficulty levels promote each other during the interaction stage. In this paper, a model trained on the labeled data is used to pseudo-label the unlabeled data, and high-confidence samples are selected together with their pseudo-labels to alleviate the low-resource problem. At the same time, our experiments show that intention recognition is the easier task to represent; we exploit this property so that intention recognition and emotion recognition promote each other under different attention heads, and fusion yields higher intention recognition performance. Finally, with the refined data, we achieve a score of 0.5532 on the test set and win the championship of the track.
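The high-confidence selection step described above can be illustrated with a minimal sketch. The abstract does not give the selection criterion, so the softmax-based confidence, the `threshold=0.9` value, and the names `softmax` and `select_pseudo_labels` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(logits_per_sample, threshold=0.9):
    """Keep unlabeled samples whose top class probability meets the threshold.

    Hypothetical helper: returns (sample_index, pseudo_label, confidence)
    tuples. The threshold value is an assumption, not taken from the paper.
    """
    selected = []
    for idx, logits in enumerate(logits_per_sample):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((idx, label, probs[label]))
    return selected


# Toy logits standing in for predictions of a model trained on labeled data.
unlabeled_logits = [
    [4.0, 0.5, 0.1],  # confident prediction for class 0
    [1.0, 1.1, 0.9],  # ambiguous prediction, should be dropped
    [0.2, 0.1, 5.0],  # confident prediction for class 2
]
picked = select_pseudo_labels(unlabeled_logits, threshold=0.9)
for idx, label, conf in picked:
    print(f"sample {idx}: pseudo-label {label} (confidence {conf:.3f})")
```

In a setup like this, the selected samples and their pseudo-labels would be added to the labeled pool before retraining; a stricter threshold trades coverage for label quality.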