Joint Effects of Argumentation Theory, Audio Modality and Data Enrichment on LLM-Based Fallacy Classification

📅 2025-09-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how contextual information and audio-based affective metadata influence the performance of large language models (LLMs) on fallacy classification in political debates. Grounded in argumentation theory, we propose two chain-of-thought frameworks, Pragma-Dialectics and the Periodic Table of Arguments, to model inferential structure systematically; experiments are conducted on Qwen-3 (8B) under three input conditions: plain text, context-augmented text, and context plus affective metadata. Results show that simple prompting outperforms complex prompting; critically, incorporating context, and especially affective metadata, significantly reduces classification accuracy, inducing systematic misclassification of *appeal to emotion* fallacies and an attention-dilution effect that impairs logical reasoning. This work provides the first empirical evidence of affective metadata's disruptive effect in argument analysis, offering insights and theoretical grounding for designing trustworthy inputs for LLMs in higher-order critical reasoning tasks.

📝 Abstract
This study investigates how context and emotional tone metadata influence large language model (LLM) reasoning and performance in fallacy classification tasks, particularly within political debate settings. Using data from U.S. presidential debates, we classify six fallacy types through various prompting strategies applied to the Qwen-3 (8B) model. We introduce two theoretically grounded Chain-of-Thought frameworks, Pragma-Dialectics and the Periodic Table of Arguments, and evaluate their effectiveness against a baseline prompt under three input settings: text-only, text with context, and text with both context and audio-based emotional tone metadata. Results suggest that while theoretical prompting can improve interpretability and, in some cases, accuracy, the addition of context and especially emotional tone metadata often degrades performance. Emotional tone metadata biases the model toward labeling statements as *Appeal to Emotion*, worsening logical reasoning. Overall, basic prompts often outperformed enhanced ones, suggesting that attention dilution from added inputs may worsen rather than improve fallacy classification in LLMs.
Problem

Research questions and friction points this paper is trying to address.

Examining context and emotional tone effects on LLM fallacy classification
Evaluating argumentation theory frameworks for improving model interpretability
Assessing performance impact of multimodal inputs in debate analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Argumentation theory frameworks for LLM reasoning
Multi-modal input with emotional tone metadata
Contextual enrichment for fallacy classification
Hongxu Zhou
University of Groningen
Hylke Westerdijk
University of Groningen
Khondoker Ittehadul Islam
University of Groningen
Natural Language Processing