🤖 AI Summary
Automated ideological classification of news and political content in the U.S. two-party context remains challenging because existing approaches rely heavily on large-scale manual annotation, generalize poorly, and cannot adapt to evolving ideological spectra.
Method: This paper proposes a few-shot, metadata-augmented approach leveraging large language models (LLMs), integrating source information, descriptive metadata, and cross-platform multimodal inputs (text + video descriptions).
Contributions/Results: First, it systematically investigates how metadata calibrates LLM-based ideological inference. Second, it introduces a label-balanced demonstration selection strategy to enhance robustness and generalizability of in-context learning (ICL) under dynamic ideological spectra. Third, it establishes a unified multimodal framework for heterogeneous platforms. Evaluated on three news and YouTube datasets, the method outperforms zero-shot baselines and conventional supervised models, achieving an average accuracy gain of 12.7%. Results validate the efficacy of metadata-driven, lightweight, and transferable ideological modeling.
📝 Abstract
The rapid growth of social media platforms has led to concerns about radicalization, filter bubbles, and content bias. Existing approaches to classifying ideology are limited: they require extensive human effort and the labeling of large datasets, and they cannot adapt to evolving ideological contexts. This paper explores the potential of Large Language Models (LLMs) for classifying the political ideology of online content in the context of the two-party US political spectrum through in-context learning (ICL). Our extensive experiments, which select demonstrations in a label-balanced fashion and are conducted on three datasets comprising news articles and YouTube videos, reveal that our approach significantly outperforms zero-shot and traditional supervised methods. Additionally, we evaluate the influence of metadata (e.g., content source and descriptions) on ideological classification and discuss its implications. Finally, we show how providing the source for political and non-political content influences the LLM's classification.
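To make the idea of label-balanced demonstration selection with optional source metadata concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the label names (`left`, `right`), the prompt wording, the record fields (`text`, `source`, `label`), and the helper names `select_balanced_demos`, `format_item`, and `build_prompt` are all hypothetical, and the toy pool is made-up placeholder data.

```python
import random
from collections import Counter, defaultdict


def select_balanced_demos(pool, k_per_label, seed=0):
    """Sample the same number of demonstrations for every label in the pool."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for example in pool:
        by_label[example["label"]].append(example)

    demos = []
    for label in sorted(by_label):
        candidates = by_label[label]
        if len(candidates) < k_per_label:
            raise ValueError(f"only {len(candidates)} examples for label {label!r}")
        demos.extend(rng.sample(candidates, k_per_label))

    # Shuffle so that demonstrations are not grouped by label in the prompt.
    rng.shuffle(demos)
    return demos


def format_item(item, use_source=True):
    """Render one item; the source line is the optional metadata."""
    parts = []
    if use_source and item.get("source"):
        parts.append(f"Source: {item['source']}")
    parts.append(f"Text: {item['text']}")
    return "\n".join(parts)


def build_prompt(demos, query, labels, use_source=True):
    """Assemble a few-shot prompt that ends where the model should answer."""
    blocks = [f"Classify each item as one of: {', '.join(labels)}."]
    for demo in demos:
        blocks.append(format_item(demo, use_source) + f"\nLabel: {demo['label']}")
    blocks.append(format_item(query, use_source) + "\nLabel:")
    return "\n\n".join(blocks)


# Toy, made-up pool; outlet names and texts are placeholders (assumption).
pool = [
    {"text": "Headline one", "source": "Outlet A", "label": "left"},
    {"text": "Headline two", "source": "Outlet A", "label": "left"},
    {"text": "Headline three", "source": "Outlet B", "label": "left"},
    {"text": "Headline four", "source": "Outlet C", "label": "right"},
    {"text": "Headline five", "source": "Outlet D", "label": "right"},
]
query = {"text": "Headline six", "source": "Outlet E"}

demos = select_balanced_demos(pool, k_per_label=2)
prompt = build_prompt(demos, query, labels=["left", "right"])
print(Counter(d["label"] for d in demos))
print(prompt)
```

Calling `build_prompt(..., use_source=False)` produces the same prompt without the `Source:` lines, which is one simple way to compare classifications with and without source metadata. The resulting string would then be sent to whichever LLM is being evaluated; that call is omitted here because the abstract does not specify the model or API.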