🤖 AI Summary
This study identifies three structural biases in how large language models (LLMs) assess press freedom: systematic underestimation (71%–93% of 180 countries rated lower than the expert benchmark), a “democratic paradox” in which the countries with the strongest press freedom are underestimated most severely, and pronounced home-country bias (scores inflated by 7%–260% for developers’ home nations across five models). Using the World Press Freedom Index (WPFI) as ground truth, we conduct cross-model consistency analysis, bias attribution modeling, and geographic bias quantification across six mainstream LLMs. We introduce the first empirical AI value-alignment framework tailored to the evaluation of global institutions, demonstrating that LLMs not only fail to accurately reflect real-world democratic practices but also exhibit endogenous biases rooted in training-data provenance and development context. This work establishes a critical benchmark and methodological foundation for AI governance, media literacy initiatives, and value calibration of foundation models.
📝 Abstract
As Large Language Models (LLMs) increasingly mediate global information access for millions of users, their alignment and biases can shape public understanding of, and trust in, fundamental democratic institutions such as press freedom. In this study, we uncover three systematic distortions in how six popular LLMs evaluate press freedom in 180 countries relative to the expert assessments of the World Press Freedom Index (WPFI). First, all six LLMs exhibit negative misalignment, consistently underestimating press freedom: individual models rate between 71% and 93% of countries as less free than the benchmark. Second, we identify a paradoxical pattern we term differential misalignment: LLMs disproportionately underestimate press freedom in the countries where it is strongest. Third, five of the six LLMs exhibit positive home bias, rating their home countries' press freedom more favorably than their overall negative misalignment with the human benchmark would predict, in some cases by 7% to 260%. If LLMs are to become the next search engines and some of the most important cultural tools of our time, it is critical that they accurately represent the state of human and civic rights worldwide.