🤖 AI Summary
This paper addresses the challenge of using large language models (LLMs) to accurately predict response distributions to subjective questions across diverse demographic groups. We propose a lightweight supervised alignment mechanism that replaces complex sociodemographic prompting with simple, universal group labels as supervision signals, guiding LLMs to learn consistent response distributions across groups. Our method is adaptable across multiple LLMs and prompting strategies on multi-topic datasets and supports quantitative evaluation of distributional alignment. Experiments demonstrate significant improvements in cross-group response distribution prediction accuracy across multiple benchmarks, with strong generalizability and model-agnostic performance. We open-source all code, data, and evaluation tools, establishing the first reproducible benchmark for cross-group response distribution alignment. This work introduces a novel paradigm for fairness-aware modeling and socially aware AI.
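The summary mentions quantitative evaluation of distributional alignment. The paper's exact metric is not specified here, but a common way to score how closely a model's predicted answer distribution matches a group's observed responses is one minus the total variation distance; the sketch below (function names `tv_distance` and `alignment_score` are illustrative, not from the paper) shows the idea:

```python
import numpy as np

def tv_distance(p, q):
    """Total variation distance between two discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

def alignment_score(predicted, observed):
    """Alignment = 1 - TV distance; 1.0 means identical distributions."""
    return 1.0 - tv_distance(predicted, observed)

# Hypothetical example: a model's predicted answer distribution for one
# demographic group vs. that group's observed survey responses on a
# four-option subjective question.
pred = [0.40, 0.30, 0.20, 0.10]
obs  = [0.35, 0.35, 0.15, 0.15]
print(round(alignment_score(pred, obs), 2))  # 0.9
```

Reporting this score per group, rather than only on average, is what allows alignment differences across specific groups to surface.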
📝 Abstract
The ability to accurately predict how different population groups answer subjective questions would be of great value. In this work, we show that relatively simple supervision can greatly improve language model alignment with diverse population groups, as measured over three datasets spanning various topics. Beyond evaluating average performance, we also report how alignment varies across specific groups. The simplicity and generality of our approach promote easy adoption, while our broad findings offer practical guidance on when our approach is, and is not, appropriate. By evaluating many LLMs and prompting strategies, and by open-sourcing our work, we provide a useful benchmark to stimulate future research.