AI Summary
This work addresses the perspective-dependent nature of social meaning in natural language, which conventional NLP systems often oversimplify by assigning a single label and disregarding interpretive differences across demographic groups. To model this diversity of perspectives systematically, the authors propose demographic-conditioned fusion embeddings that combine textual and demographic representations. They benchmark zero-shot, few-shot, and fine-tuned approaches on a dataset of 28k human annotations. The fusion models yield consistent and statistically significant improvements over text-only baselines across all fusion strategies, with a relative gain of 5.9% to 6.5% in macro-averaged PR-AUC. Shuffle ablations, in which the demographic profiles are randomized, confirm that demographic attributes carry genuine predictive signal rather than spurious correlations.
Abstract
Social meaning in language is inherently perspectival, varying across annotator backgrounds, demographics, and ideological positions. However, most NLP systems collapse this variation into a single ground-truth label, ignoring the diversity of interpretations. In this work, we model social dimensions along a perspectivist spectrum, capturing how interpretations vary across demographic groups on a dataset consisting of 28k human annotations. We benchmark multiple modeling paradigms, including zero-shot, few-shot, and fine-tuned approaches, and propose fusion embeddings that integrate textual and demographic representations. Our fusion models yield consistent and statistically significant improvements over text-only baselines across all fusion strategies (+5.9-6.5% relative macro PR-AUC), with shuffle ablations confirming that demographic profiles carry genuine predictive signal rather than spurious correlations.
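To make the two central ideas concrete, the following is a minimal sketch of one possible fusion strategy (plain concatenation of a text embedding with a one-hot demographic vector) and of a shuffle ablation that permutes demographic profiles across annotations. The abstract does not specify which fusion strategies or demographic attributes the authors use, so the attribute vocabularies, the function names, and the toy values below are all illustrative assumptions rather than the paper's implementation.

```python
import random

# Hypothetical demographic vocabularies; the actual attributes are not given in the abstract.
AGE_GROUPS = ["18-29", "30-49", "50+"]
GENDERS = ["woman", "man", "nonbinary"]


def one_hot(value, vocabulary):
    """Return a one-hot list encoding `value` against `vocabulary`."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def encode_profile(profile):
    """Encode a demographic profile dict as a flat feature vector."""
    return one_hot(profile["age"], AGE_GROUPS) + one_hot(profile["gender"], GENDERS)


def fuse(text_embedding, profile):
    """Concatenation fusion (an assumed strategy): text features, then demographic features."""
    return list(text_embedding) + encode_profile(profile)


def shuffle_profiles(profiles, seed=0):
    """Shuffle ablation: permute profiles across annotations; texts stay in place."""
    shuffled = list(profiles)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Toy inputs for illustration only; these are not values from the dataset.
text_embeddings = [[0.1, 0.3], [0.7, 0.2], [0.4, 0.9]]
profiles = [
    {"age": "18-29", "gender": "woman"},
    {"age": "50+", "gender": "man"},
    {"age": "30-49", "gender": "nonbinary"},
]

# Fused inputs for the real condition and for the shuffled-profile control.
fused = [fuse(t, p) for t, p in zip(text_embeddings, profiles)]
ablated = [fuse(t, p) for t, p in zip(text_embeddings, shuffle_profiles(profiles))]
```

If a model trained on `fused` beats a text-only baseline but a model trained on `ablated` does not, the gain can be attributed to the pairing between each annotation and its annotator's profile rather than to the extra input dimensions alone, which is the logic behind the shuffle ablation described above.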