🤖 AI Summary
This work addresses the explainable detection of perceived ideological bias in political news. We propose a dual-axis annotation scheme, rating sentiment toward the Democratic and Republican parties on two independent scales, together with fine-grained rationale labeling; the result is BiasLab, a dataset of 300 news articles. Methodologically, we pair human rationale indicators with schema-constrained GPT-4o simulation for scalable, consistent annotation, supported by crowdsourced quality control, inter-annotator agreement measured with Krippendorff's α, and perception drift modeling. Empirical analysis shows that both humans and models systematically misclassify subtly right-leaning content, revealing an asymmetry in how right-leaning bias is identified. We publicly release the dataset, annotation guidelines, and baseline code, establishing a benchmark and methodological foundation for explainable bias analysis in computational social science and NLP.
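To make the dual-axis scheme and the agreement measure concrete, here is a minimal sketch of what a per-article label record and a Krippendorff's α computation could look like, using the `krippendorff` Python package. The field names, the 5-point ordinal scale, the rationale categories, and the sample ratings are illustrative assumptions, not the released BiasLab schema.

```python
# Illustrative sketch of dual-axis bias labels plus an agreement check.
# Field names, the -2..+2 ordinal scale, and rationale categories are
# assumptions for demonstration; see the released annotation schema for
# the real definitions.
from dataclasses import dataclass, field

import numpy as np
import krippendorff  # pip install krippendorff


@dataclass
class ArticleAnnotation:
    article_id: str
    dem_sentiment: int  # sentiment toward the Democratic party, e.g. -2..+2
    rep_sentiment: int  # sentiment toward the Republican party, rated independently
    rationales: list[str] = field(default_factory=list)  # e.g. ["framing"]


example = ArticleAnnotation(
    article_id="article-042",
    dem_sentiment=-1,
    rep_sentiment=1,
    rationales=["framing", "source-selection"],
)

# Hypothetical ratings on one axis: rows are annotators, columns are
# articles; np.nan marks articles an annotator did not label.
dem_ratings = np.array([
    [-2, 0, 1, np.nan],
    [-1, 0, 2, 1],
    [-2, 1, 1, 0],
])

# An ordinal level of measurement fits a graded sentiment scale.
alpha = krippendorff.alpha(reliability_data=dem_ratings,
                           level_of_measurement="ordinal")
print(f"Krippendorff's alpha (Democratic axis): {alpha:.3f}")
```

Because the two axes are rated independently, agreement would be computed once per axis rather than on a single fused label.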
📝 Abstract
We present BiasLab, a dataset of 300 political news articles annotated for perceived ideological bias. These articles were selected from a curated 900-document pool covering diverse political events and source biases. Each article is labeled by crowdworkers along two independent scales, assessing sentiment toward the Democratic and Republican parties, and enriched with rationale indicators. The annotation pipeline incorporates targeted worker qualification and was refined through pilot-phase analysis. We quantify inter-annotator agreement, analyze misalignment with source-level outlet bias, and organize the resulting labels into interpretable subsets. Additionally, we simulate annotation using schema-constrained GPT-4o, enabling direct comparison to human labels and revealing mirrored asymmetries, especially in misclassifying subtly right-leaning content. We define two modeling tasks: perception drift prediction and rationale type classification, and report baseline performance to illustrate the challenge of explainable bias detection. BiasLab's rich rationale annotations provide actionable interpretations that facilitate explainable modeling of political bias, supporting the development of transparent, socially aware NLP systems. We release the dataset, annotation schema, and modeling code to encourage research on human-in-the-loop interpretability and the evaluation of explanation effectiveness in real-world settings.
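As a rough sketch of how schema-constrained GPT-4o annotation could be simulated, the snippet below uses the OpenAI Python SDK's structured-output helper to force each response into a fixed label schema. The prompt wording, field names, and label values are hypothetical stand-ins; the paper's released annotation schema is the authoritative reference.

```python
# Minimal sketch of schema-constrained GPT-4o annotation, assuming the
# OpenAI Python SDK (>=1.40) with structured outputs. The schema fields
# and prompt are illustrative, not the paper's exact guidelines.
from enum import Enum

from openai import OpenAI
from pydantic import BaseModel


class Polarity(str, Enum):
    negative = "negative"
    neutral = "neutral"
    positive = "positive"


class BiasAnnotation(BaseModel):
    dem_sentiment: Polarity  # sentiment toward the Democratic party
    rep_sentiment: Polarity  # sentiment toward the Republican party
    rationale_type: str      # e.g. "framing", "selective quotation"
    rationale_span: str      # evidence text copied from the article


client = OpenAI()  # reads OPENAI_API_KEY from the environment


def simulate_annotation(article_text: str) -> BiasAnnotation:
    """Ask GPT-4o for a label that must conform to BiasAnnotation."""
    completion = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Rate the article's sentiment toward each party "
                        "independently and cite a rationale span."},
            {"role": "user", "content": article_text},
        ],
        response_format=BiasAnnotation,  # constrains output to the schema
    )
    return completion.choices[0].message.parsed
```

Applying such a function to each of the 300 articles would yield model labels directly comparable to the human annotations, which is the comparison behind the mirrored-asymmetry finding described above.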