🤖 AI Summary
Online political discourse on social media is frequently characterized by incivility, undermining democratic deliberation. Method: This study investigates strategies to enhance the civility and rhetorical quality of political arguments generated by large language models (LLMs). Leveraging the CLAPTON dataset, which comprises political conversations from Twitter and Reddit annotated for justification, reciprocity, and incivility, we systematically compare supervised fine-tuning (SFT) against multidimensional prompt engineering that incorporates principles of legitimacy, reciprocity, and non-offensiveness to mitigate toxicity. Contribution/Results: We provide the first empirical evidence that SFT on high-quality subsets (e.g., Reddit) significantly improves discourse quality; prompt-based interventions effectively reduce ad hominem attacks but fail to comprehensively suppress toxicity; only the combination of high-quality data and carefully designed prompts yields substantial improvements in both the civility and the rhetorical coherence of generated political discourse.
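As a rough illustration of the prompting side of this comparison, the sketch below shows how a multidimensional instruction covering legitimacy, reciprocity, and non-offensiveness might be passed to GPT-3.5 Turbo through the OpenAI chat API. The prompt wording, the generate_reply helper, and the sampling parameters are illustrative assumptions, not the authors' actual configuration.

```python
# Hypothetical sketch of multidimensional prompting for civil political replies.
# The system prompt text and parameters below are assumptions for illustration;
# the study's exact instructions are not reproduced here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are replying in a political discussion. Follow three principles: "
    "(1) legitimacy: justify your claims with reasons or evidence; "
    "(2) reciprocity: engage directly with the other person's argument; "
    "(3) non-offensiveness: avoid insults, profanity, and personal attacks."
)

def generate_reply(post: str) -> str:
    """Generate a civil, well-justified reply to a political post."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": post},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content

print(generate_reply("Anyone who supports this policy clearly hasn't read it."))
```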
📝 Abstract
Incivility in social media discourse complicates the deployment of automated text generation models for politically sensitive content. Fine-tuning and prompting strategies are critical but underexplored solutions for mitigating toxicity in such contexts. This study investigates the effects of fine-tuning and prompting on GPT-3.5 Turbo using subsets of the CLAPTON dataset of political discussion posts, which comprises Twitter and Reddit data labeled for justification, reciprocity, and incivility. Models fine-tuned on Reddit data scored highest on discussion quality, while fine-tuning on the combined, noisier data led to persistent toxicity. Prompting strategies reduced specific toxic traits, such as personal attacks, but had limited broader impact. These findings emphasize that high-quality data and well-crafted prompts are essential to reduce incivility and improve rhetorical quality in automated political discourse generation.
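For the fine-tuning side, a minimal sketch of how a high-quality Reddit subset of CLAPTON might be submitted to the OpenAI fine-tuning API for GPT-3.5 Turbo is shown below. The file name, the chat-format JSONL layout, and the example texts are assumptions for illustration, not the study's actual data or configuration.

```python
# Minimal sketch of supervised fine-tuning GPT-3.5 Turbo on a CLAPTON subset.
# File name, JSONL layout, and example content are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()

# Each training example is a short chat: a political post paired with a reply
# rated high on justification, reciprocity, and civility by annotators.
examples = [
    {
        "messages": [
            {"role": "user", "content": "This bill will ruin the economy, period."},
            {"role": "assistant", "content": (
                "I understand the cost concern, but the budget estimates suggest the "
                "net effect is small. Which provision worries you most?")},
        ]
    },
    # ... remaining high-quality Reddit examples
]

with open("clapton_reddit_subset.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Upload the training file and launch a fine-tuning job.
training_file = client.files.create(
    file=open("clapton_reddit_subset.jsonl", "rb"),
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```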