Predictively Combatting Toxicity in Health-related Online Discussions through Machine Learning

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
In health-related online discussions, user toxicity frequently triggers social conflict and promotes pseudoscientific behavior, and conventional post-hoc detection-and-removal strategies often backfire. This paper introduces a "predictive intervention" paradigm: it applies collaborative filtering to model the likelihood of toxic user–subcommunity interactions, enabling prediction, before a user participates, of whether that user will post toxic content in a specific health subcommunity (e.g., COVID-19–related Reddit forums), without relying on reactive moderation. The method jointly encodes user behavioral representations and subcommunity-level semantic features to enable fine-grained modeling of toxicity compatibility. Evaluated on real-world COVID-19 Reddit data, the model achieves AUC and F1 scores exceeding 80%, substantially enhancing decision support for conflict avoidance. Core contributions include: (1) establishing the first predictive framework for toxic user–subcommunity interactions, and (2) shifting toxicity governance from reactive mitigation to proactive prevention.
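The Collaborative Filtering idea described above can be sketched as logistic matrix factorization: learn user and subcommunity embeddings whose dot product predicts the probability of a toxic interaction for any user–subcommunity pairing. The toy data, embedding dimension, hyperparameters, and variable names below are illustrative assumptions, not the paper's actual architecture or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_subs, dim = 4, 3, 8

# Toy interaction matrix: 1 = observed toxic interaction between a user
# (row) and a subcommunity (column), 0 = non-toxic interaction.
R = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 1, 0],
              [0, 0, 0]], dtype=float)

U = rng.normal(scale=0.1, size=(n_users, dim))  # user behavioural embeddings
S = rng.normal(scale=0.1, size=(n_subs, dim))   # subcommunity embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Full-batch gradient descent on binary cross-entropy; for logistic
# factorization the gradient w.r.t. the logits is simply P - R.
lr = 0.2
for _ in range(5000):
    P = sigmoid(U @ S.T)   # predicted toxicity probability per pairing
    E = P - R
    U -= lr * (E @ S)
    S -= lr * (E.T @ U)

P = sigmoid(U @ S.T)
# Pairings whose predicted toxicity exceeds a threshold can be flagged
# before the user ever participates in that subcommunity.
risky = P > 0.5
```

In this sketch, `risky` marks the user–subcommunity pairings a platform could proactively discourage, which is the "prevent the pairing of conflicting users and subcommunities" step the abstract describes.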

📝 Abstract
In health-related topics, user toxicity in online discussions frequently becomes a source of social conflict or promotion of dangerous, unscientific behaviour; common approaches for battling it include different forms of detection, flagging and/or removal of existing toxic comments, which is often counterproductive for platforms and users alike. In this work, we propose the alternative of combatting user toxicity predictively, anticipating where a user could interact toxically in health-related online discussions. Applying a Collaborative Filtering-based Machine Learning methodology, we predict the toxicity in COVID-related conversations between any user and subcommunity of Reddit, surpassing 80% predictive performance in relevant metrics, and allowing us to prevent the pairing of conflicting users and subcommunities.
Problem

Research questions and friction points this paper is trying to address.

Predict toxicity in health-related online discussions
Prevent conflicts between users and subcommunities
Improve moderation using machine learning techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Predict toxicity using Machine Learning
Collaborative Filtering-based methodology
Prevent conflicts in health discussions
Jorge Paz-Ruza
Interim Professor, Universidade da Coruña
Frugal Machine Learning · Responsible AI · Green AI
A. Alonso-Betanzos
Universidade da Coruña, CITIC - LIDIA Group, Campus de Elviña s/n 15071 A Coruña
Bertha Guijarro-Berdinas
Universidade da Coruña, CITIC - LIDIA Group, Campus de Elviña s/n 15071 A Coruña
Carlos Eiras-Franco
Postdoctoral Researcher, Universidade da Coruña
machine learning · scalability