A Cautionary Tale About"Neutrally"Informative AI Tools Ahead of the 2025 Federal Elections in Germany

📅 2025-02-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study systematically evaluates the reliability of AI-based voting advice applications (VAAs) and large language models (LLMs) in delivering objective political information ahead of Germany’s 2025 federal election. Using Wahl-O-Mat–annotated party positions as ground truth, we employ (1) expert annotation of 38 policy statements, (2) LLM response consistency analysis, (3) bias auditing of VAA outputs, and (4) adversarial prompt testing. Our findings reveal, for the first time empirically, pronounced leftward political bias across mainstream LLMs—evidenced by >75% alignment with left-wing parties versus ≈30% with right-wing parties—and susceptibility to prompt injection inducing factual hallucinations. Both leading VAAs exhibit substantial output bias (25% and >50%, respectively), with one falsely associating parties with far-right ideologies. The study provides critical empirical evidence and a methodological framework to advance transparency, fairness, and accountability in AI-driven political tools.

Technology Category

Application Category

📝 Abstract

In this study, we examine the reliability of AI-based Voting Advice Applications (VAAs) and large language models (LLMs) in providing objective political information. Our analysis is based upon a comparison with party responses to 38 statements of the Wahl-O-Mat, a well-established German online tool that helps inform voters by comparing their views with political party positions. For the LLMs, we identify significant biases. They exhibit a strong alignment (over 75% on average) with left-wing parties and a substantially lower alignment with center-right (smaller 50%) and right-wing parties (around 30%). Furthermore, for the VAAs, intended to objectively inform voters, we found substantial deviations from the parties' stated positions in Wahl-O-Mat: While one VAA deviated in 25% of cases, another VAA showed deviations in more than 50% of cases. For the latter, we even observed that simple prompt injections led to severe hallucinations, including false claims such as non-existent connections between political parties and right-wing extremist ties.

Problem

Research questions and friction points this paper is trying to address.

Assess reliability of AI voting tools

Identify biases in language models

Detect deviations in voter advice accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-based Voting Advice Applications

Large Language Models analysis

Prompt injections and hallucinations

🔎 Similar Papers

No similar papers found.

Authors to Follow