🤖 AI Summary
This study addresses the underexplored NLP task of detecting anti-autistic ableist language, a phenomenon that is highly context-dependent and often expressed implicitly. We introduce AUTALIC, the first context-aware benchmark dataset for this task, comprising 2,400 autism-related Reddit sentences with surrounding context and expert annotations. We formally define anti-autistic ableist language for the first time, propose fine-grained contextual labels, and include inter-annotator disagreement metrics. Annotation followed a collaborative, expert-led paradigm grounded in neurodiversity principles, incorporating multiple rounds of quality control. Experimental results reveal that current language models, including state-of-the-art LLMs, perform substantially below human annotators. The dataset is publicly released, together with the individual annotations, to support the development of inclusive, context-sensitive NLP systems. This work establishes a foundational resource for advancing equitable natural language processing and fostering algorithmic accountability in mental health and disability-related domains.
📝 Abstract
As our understanding of autism and ableism continues to increase, so does our understanding of ableist language towards autistic people. Such language poses a significant challenge in NLP research due to its subtle and context-dependent nature. Yet detecting anti-autistic ableist language remains underexplored, with existing NLP tools often failing to capture its nuanced expressions. We present AUTALIC, the first benchmark dataset dedicated to the detection of anti-autistic ableist language in context, addressing a significant gap in the field. The dataset comprises 2,400 autism-related sentences collected from Reddit, accompanied by surrounding context, and is annotated by trained experts with backgrounds in neurodiversity. Our comprehensive evaluation reveals that current language models, including state-of-the-art LLMs, struggle to reliably identify anti-autistic ableism and to align with human judgments, underscoring their limitations in this domain. We publicly release AUTALIC along with the individual annotations, which serve as a valuable resource for researchers working on ableism and neurodiversity, as well as for those studying disagreement in annotation tasks. This dataset is a crucial step towards developing more inclusive and context-aware NLP systems that better reflect diverse perspectives.