Asking For It: Question-Answering for Predicting Rule Infractions in Online Content Moderation

📅 2025-10-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of content moderation arising from heterogeneous, dynamically evolving, and inconsistently enforced community rules on online platforms, this paper proposes ModQ—the first lightweight framework to formalize rule-sensitive content moderation as a question-answering (QA) task. ModQ explicitly identifies the community rule most likely violated by a given comment via extractive and multiple-choice QA, enabling zero-shot transfer to unseen communities and newly introduced rules. It treats community rule texts as QA contexts and is trained on large-scale Reddit and Lemmy data. Empirically, ModQ significantly outperforms state-of-the-art baselines in violation detection. Its core contributions include: (i) the first QA-based formalization of rule enforcement; (ii) strong generalization across communities and rules; (iii) high interpretability through rule attribution; and (iv) low-resource adaptability, demonstrated via effective cross-platform deployment and real-world applicability.

📝 Abstract
Online communities rely on a mix of platform policies and community-authored rules to define acceptable behavior and maintain order. However, these rules vary widely across communities, evolve over time, and are enforced inconsistently, posing challenges for transparency, governance, and automation. In this paper, we model the relationship between rules and their enforcement at scale, introducing ModQ, a novel question-answering framework for rule-sensitive content moderation. Unlike prior classification or generation-based approaches, ModQ conditions on the full set of community rules at inference time and identifies which rule best applies to a given comment. We implement two model variants - extractive and multiple-choice QA - and train them on large-scale datasets from Reddit and Lemmy, the latter of which we construct from publicly available moderation logs and rule descriptions. Both models outperform state-of-the-art baselines in identifying moderation-relevant rule violations, while remaining lightweight and interpretable. Notably, ModQ models generalize effectively to unseen communities and rules, supporting low-resource moderation settings and dynamic governance environments.
Problem

Research questions and friction points this paper is trying to address.

Predicting rule violations in online content moderation
Modeling relationship between rules and enforcement at scale
Generalizing moderation to unseen communities and dynamic rules
Innovation

Methods, ideas, or system contributions that make the work stand out.

Question-answering framework for rule-sensitive content moderation
Conditions on the full set of community rules at inference time
Generalizes effectively to unseen communities and rules
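The multiple-choice QA framing above can be sketched in a few lines: the comment plays the role of the question and each community rule is a candidate answer, with the trained model scoring every (comment, rule) pair. The paper does not publish its exact implementation, so the model call is abstracted behind a hypothetical `score_fn`, and all names here are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Community:
    name: str
    rules: list  # community-authored rule texts

def build_mc_inputs(comment: str, community: Community) -> list:
    """Pair the comment (question) with each rule (candidate answer).

    Because the rule set is supplied at inference time, new or edited
    rules need no retraining: they simply become new answer options.
    """
    return [(comment, rule) for rule in community.rules]

def predict_violated_rule(comment, community, score_fn):
    """score_fn stands in for the trained QA model (hypothetical)."""
    pairs = build_mc_inputs(comment, community)
    scores = [score_fn(c, r) for c, r in pairs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return community.rules[best], scores[best]

# Toy scorer: word overlap between comment and rule, purely illustrative.
def toy_score(comment, rule):
    cw, rw = set(comment.lower().split()), set(rule.lower().split())
    return len(cw & rw)

sub = Community("r/example", ["No personal attacks",
                              "No spam or self-promotion"])
rule, _ = predict_violated_rule("stop the spam links", sub, toy_score)
print(rule)  # "No spam or self-promotion" under the toy scorer
```

Prediction also doubles as rule attribution: the selected rule text itself is the explanation shown to moderators, which is the interpretability property the summary highlights.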