🤖 AI Summary
This study addresses enthymemes, arguments with implicit premises or conclusions, which are pervasive in politically charged discourse. Their annotation is inherently subjective, yet existing resources tend to eliminate disagreement, obscuring variation in human reasoning. To bridge this gap, the authors construct a dataset of 1,482 political tweets, each annotated by five annotators for the presence of enthymemes and their argument structure. The annotation guidelines are anchored in Walton's argumentation schemes and aim to constrain the task while leaving room for interpretation. Rather than treating inter-annotator disagreement as noise, the work treats it as a meaningful signal, combining multi-annotator labeling, argument structure analysis, an analysis of where annotation imposes high cognitive load, and disagreement-aware model training. Preliminary experiments indicate that models trained on annotator disagreement outperform models trained on hard majority-vote labels, suggesting that annotator divergence is useful for modeling subjective inference.
📝 Abstract
Enthymemes, arguments with unstated premises or conclusions, are pervasive in persuasive discourse, yet their annotation remains notoriously subjective. We present a resource of 1,482 tweets from politically controversial discourse, annotated by five annotators for the presence of enthymemes and their argument structure, designed to study label variation. We first revisit the definition of enthymemes and propose annotation guidelines anchored in Walton's argumentation schemes, offering a structured and constrained approach that nonetheless preserves room for the interpretive nature of the task. This contrasts with past resources, which tend to eliminate disagreement, obscuring its sources and preventing investigation of its potential benefits for model performance. We further propose a complexity analysis of the task, identifying where annotation imposes high cognitive load and may give rise to inconsistent annotation. Our preliminary experiments show that models trained on annotator disagreement outperform models trained on hard majority-vote labels. We close by reflecting on how structural openness in enthymeme definitions and guidelines enables the study of variation in subjective inferential processes for future resources and downstream NLP applications concerned with human inference.
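To make the contrast between hard majority-vote labels and disagreement-aware labels concrete, here is a minimal sketch in plain Python. It assumes a binary "enthymeme present" label per tweet and uses soft-label cross-entropy as one common way to train on disagreement; the abstract does not specify the actual training method, so the function names and the loss choice here are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

CLASSES = (0, 1)  # assumed binary task: 0 = no enthymeme, 1 = enthymeme present


def soft_label(votes, classes=CLASSES):
    """Turn one item's annotator votes into a distribution over classes."""
    counts = Counter(votes)
    return [counts[c] / len(votes) for c in classes]


def hard_label(votes):
    """Majority-vote label (five annotators and two classes cannot tie)."""
    return Counter(votes).most_common(1)[0][0]


def one_hot(label, classes=CLASSES):
    """Encode a hard label as a degenerate distribution."""
    return [1.0 if c == label else 0.0 for c in classes]


def cross_entropy(target_dist, predicted_dist, eps=1e-12):
    """Cross-entropy between a target distribution and a model prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_dist, predicted_dist))


# Hypothetical item: three of five annotators see an enthymeme.
votes = [1, 1, 1, 0, 0]
soft = soft_label(votes)            # keeps the 2-vs-3 split
hard = one_hot(hard_label(votes))   # discards the split

calibrated = [0.4, 0.6]    # prediction that mirrors the annotator split
confident = [0.01, 0.99]   # prediction that ignores the disagreement

# The soft target rewards the calibrated prediction; the hard target
# rewards the overconfident one.
print(cross_entropy(soft, calibrated) < cross_entropy(soft, confident))
print(cross_entropy(hard, confident) < cross_entropy(hard, calibrated))
```

Under the hard target, the 2-versus-3 split is invisible to the model, which is pushed toward full confidence on a contested item. The soft target keeps that split in the training signal.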