A Cautionary Tale About"Neutrally"Informative AI Tools Ahead of the 2025 Federal Elections in Germany

📅 2025-02-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study systematically evaluates the reliability of AI-based voting advice applications (VAAs) and large language models (LLMs) in delivering objective political information ahead of Germany’s 2025 federal election. Using Wahl-O-Mat–annotated party positions as ground truth, we employ (1) expert annotation of 38 policy statements, (2) LLM response consistency analysis, (3) bias auditing of VAA outputs, and (4) adversarial prompt testing. Our findings reveal, for the first time empirically, pronounced leftward political bias across mainstream LLMs—evidenced by >75% alignment with left-wing parties versus ≈30% with right-wing parties—and susceptibility to prompt injection inducing factual hallucinations. Both leading VAAs exhibit substantial output bias (25% and >50%, respectively), with one falsely associating parties with far-right ideologies. The study provides critical empirical evidence and a methodological framework to advance transparency, fairness, and accountability in AI-driven political tools.

Technology Category

Application Category

📝 Abstract
In this study, we examine the reliability of AI-based Voting Advice Applications (VAAs) and large language models (LLMs) in providing objective political information. Our analysis is based upon a comparison with party responses to 38 statements of the Wahl-O-Mat, a well-established German online tool that helps inform voters by comparing their views with political party positions. For the LLMs, we identify significant biases. They exhibit a strong alignment (over 75% on average) with left-wing parties and a substantially lower alignment with center-right (smaller 50%) and right-wing parties (around 30%). Furthermore, for the VAAs, intended to objectively inform voters, we found substantial deviations from the parties' stated positions in Wahl-O-Mat: While one VAA deviated in 25% of cases, another VAA showed deviations in more than 50% of cases. For the latter, we even observed that simple prompt injections led to severe hallucinations, including false claims such as non-existent connections between political parties and right-wing extremist ties.
Problem

Research questions and friction points this paper is trying to address.

Assess reliability of AI voting tools
Identify biases in language models
Detect deviations in voter advice accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-based Voting Advice Applications
Large Language Models analysis
Prompt injections and hallucinations
🔎 Similar Papers
No similar papers found.
Ina Dormuth
Ina Dormuth
Statistician, TU Dortmund University
Survival AnalysisMachine LearningBioinformaticsTime Series Analysis
Sven Franke
Sven Franke
Scientific Researcher, TU Dortmund University
Logistics
M
Marlies Hafer
Department of Statistics, TU Dortmund University, Research Center Trustworthy Data Science and Security, University Alliance Ruhr (UA Ruhr)
T
Tim Katzke
Research Center Trustworthy Data Science and Security, University Alliance Ruhr (UA Ruhr), Department of Computer Science, TU Dortmund University
Alexander Marx
Alexander Marx
TU Dortmund
CausalityCausal DiscoveryInformation TheoryRepresentation Learning
Emmanuel Müller
Emmanuel Müller
Professor of Computer Science, Technical University of Dortmund
Data MiningMachine LearningData ExplorationDatabases
Daniel Neider
Daniel Neider
TU Dortmund University and Center for Trustworthy Data Science and Security
Formal MethodsMachine LearningLogicArtificial Intelligence
M
Markus Pauly
Department of Statistics, TU Dortmund University, Research Center Trustworthy Data Science and Security, University Alliance Ruhr (UA Ruhr)
Jérôme Rutinowski
Jérôme Rutinowski
TU Dortmund University
Machine LearningComputer VisionLogistics