🤖 AI Summary
Existing approaches focus primarily on cross-modal alignment and text-image consistency, overlooking both the semantic enhancement capabilities of large multimodal models and the sentiment features of news, in particular the strong correlation between negative sentiment and falsity. To address this, we propose a multimodal fake news detection framework that jointly integrates semantic enhancement and sentiment reasoning. First, we leverage a large multimodal model to generate image-descriptive textual summaries, thereby enriching semantic representations. Second, we design an expert-guided sentiment reasoning module to explicitly model the mapping between emotional polarity (especially negativity) and veracity. Third, we introduce a cross-modal alignment mechanism coupled with sentiment-aware fusion for end-to-end optimization. Extensive experiments on two real-world benchmark datasets demonstrate that our method consistently outperforms state-of-the-art models in both accuracy and F1-score, validating the benefit of combining semantic enhancement with fine-grained sentiment modeling for robust fake news detection.
📝 Abstract
Previous studies on multimodal fake news detection focus mainly on aligning and integrating cross-modal features and on exploiting text-image consistency. However, they overlook the semantic enhancement that large multimodal models can provide and pay little attention to the emotional features of news. In addition, it has been observed that fake news is more likely to carry negative emotions than real news. We therefore propose a novel Semantic Enhancement and Emotional Reasoning (SEER) Network for multimodal fake news detection. We generate summarized captions to capture image semantics and use the outputs of large multimodal models for semantic enhancement. Inspired by the perceived relationship between news authenticity and emotional tendencies, we propose an expert emotional reasoning module that simulates real-life scenarios to optimize emotional features and infer the authenticity of news. Extensive experiments on two real-world datasets demonstrate the superiority of SEER over state-of-the-art baselines.