🤖 AI Summary
This study addresses the challenge posed by highly concurrent, emote-rich toxic comments on live-streaming platforms such as Twitch, which undermine conventional keyword-based and manual moderation. To tackle this issue, the authors propose a hybrid toxicity detection method that, for the first time, explicitly integrates emote semantics into the detection pipeline. Specifically, large language models, including DeepSeek-R1-Distill and Llama-3-8B-Instruct, are employed to generate joint embeddings of text and emotes, which are then fed into channel-specific classifiers based on Random Forest and Support Vector Machine (SVM) architectures. Experimental results show that the proposed approach achieves 80% accuracy and a 76% F1 score on channel-specific data, a 13% accuracy improvement over BERT. These findings substantiate the critical role of emote-aware mechanisms in toxicity detection for live-streaming contexts.
📝 Abstract
The rapid growth of live-streaming platforms such as Twitch has introduced complex challenges in moderating toxic behavior. Traditional moderation approaches, such as human annotation and keyword-based filtering, have demonstrated utility, but human moderators on Twitch struggle to keep pace with the platform's fast-moving, high-volume, and context-rich chat while also facing harassment themselves. Recent advances in large language models (LLMs), such as DeepSeek-R1-Distill and Llama-3-8B-Instruct, offer new opportunities for toxicity detection, especially in understanding nuanced, multimodal communication involving emotes. In this work, we present an exploratory comparison of toxicity detection approaches tailored to Twitch. Our analysis reveals that incorporating emotes improves the detection of toxic behavior. To this end, we introduce ToxiTwitch, a hybrid model that combines LLM-generated embeddings of text and emotes with traditional machine learning classifiers, including Random Forest and SVM. In our case study, the proposed hybrid approach reaches up to 80 percent accuracy under channel-specific training, a 13 percent improvement over BERT, with an F1 score of 76 percent. This work is an exploratory study intended to surface the challenges and limits of emote-aware toxicity detection on Twitch.
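The hybrid pipeline described in the abstract, LLM-generated embeddings of chat text plus emote tokens fed to classical Random Forest and SVM classifiers, can be sketched roughly as below. Note the heavy assumptions: `embed_message` is a hypothetical stand-in (the paper derives embeddings from DeepSeek-R1-Distill or Llama-3-8B-Instruct; this sketch uses a simple deterministic hashing embedding so it runs without model weights), and the toy messages, emote codes, and labels are illustrative only, not from the paper's dataset.

```python
# Minimal sketch of an emote-aware hybrid pipeline: embed each message
# (keeping emote codes like "Kappa" as tokens), then train channel-specific
# classical classifiers (Random Forest and SVM), as the abstract describes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC


def embed_message(text: str, dim: int = 64) -> np.ndarray:
    """Hypothetical stand-in for an LLM embedding of a chat message.

    Emote codes (e.g. "Kappa", "PogChamp") are treated as ordinary tokens,
    so they contribute to the vector, mirroring the emote-aware design.
    A real system would query an LLM's hidden states instead.
    """
    vec = np.zeros(dim)
    for token in text.split():
        # Deterministic bucket per token (a crude hashing-trick embedding).
        vec[sum(map(ord, token)) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


# Toy channel-specific training data (1 = toxic, 0 = benign); illustrative only.
messages = [
    "you are trash Kappa",
    "great play PogChamp",
    "uninstall the game LUL",
    "love this stream <3",
]
labels = [1, 0, 1, 0]

X = np.stack([embed_message(m) for m in messages])

# Two classical classifiers on top of the embeddings, per the abstract.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
svm = SVC(kernel="rbf").fit(X, labels)

query = embed_message("you are trash Kappa").reshape(1, -1)
print("RF:", rf.predict(query)[0], "SVM:", svm.predict(query)[0])
```

A channel-specific setup, as evaluated in the case study, would simply fit one such classifier pair per channel's chat history rather than one global model.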