🤖 AI Summary
This study addresses sexism detection and source-intention classification in bilingual (English–Spanish) social media. The approach combines a fine-tuned XLM-RoBERTa model with few-shot prompting of GPT-3.5, pairing a pretrained multilingual encoder with a large language model's in-context learning to improve fine-grained classification of sexist content across the two languages. Evaluated on the EXIST 2024 shared task at CLEF, the method ranks 4th in the soft-soft evaluation of Task 1 (binary sexism identification) and 2nd in the soft-soft evaluation of Task 2 (source-intention classification). Key contributions include: (i) a bilingual sexism-analysis pipeline covering both tasks; (ii) empirical evidence that fine-tuned multilingual encoders and few-shot prompting are both effective for this problem, with complementary strengths; and (iii) competitive shared-task results relevant to multilingual NLP for online safety.
📝 Abstract
Sexism in online content is a pervasive issue that necessitates effective classification techniques to mitigate its harmful impact. Online platforms often host sexist comments and posts that create a hostile environment, especially for women and minority groups. This content not only spreads harmful stereotypes but also causes emotional harm. Reliable methods are essential for finding and removing sexist content, making online spaces safer and more welcoming. The sEXism Identification in Social neTworks (EXIST) challenge addresses this issue at CLEF 2024. This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models. The tasks are to determine whether a text is sexist and to identify the intention of its source. We fine-tuned the XLM-RoBERTa model and separately used GPT-3.5 with few-shot learning prompts to classify sexist content. The XLM-RoBERTa model exhibited robust performance in handling complex linguistic structures, while GPT-3.5's few-shot learning capability allowed for rapid adaptation to new data with minimal labeled examples. Our approach using XLM-RoBERTa achieved 4th place in the soft-soft evaluation of Task 1 (sexism identification). For Task 2 (source intention), we achieved 2nd place in the soft-soft evaluation.
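The GPT-3.5 route described above relies on few-shot prompts: a handful of labeled examples followed by the text to classify. A minimal sketch of such a prompt builder is shown below; the label names, instruction wording, and example texts are illustrative placeholders, not the authors' actual prompts or data, and the call to the model itself is omitted.

```python
# Illustrative few-shot prompt builder for binary sexism identification.
# All examples and label strings here are hypothetical stand-ins.

FEW_SHOT_EXAMPLES = [
    ("Women belong in the kitchen.", "sexist"),
    ("The match starts at noon today.", "not sexist"),
]

def build_prompt(text: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Assemble a few-shot classification prompt from labeled examples."""
    lines = ["Classify each text as 'sexist' or 'not sexist'.", ""]
    for example_text, label in examples:
        lines.append(f"Text: {example_text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The model is expected to complete the final 'Label:' line.
    lines.append(f"Text: {text}")
    lines.append("Label:")
    return "\n".join(lines)

# Works the same for English or Spanish inputs, since the model,
# not the prompt scaffold, handles the language.
print(build_prompt("Ejemplo de texto en español."))
```

The completed prompt would then be sent to the model's completion API, with the returned label parsed from the model's continuation.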