From 'May' to 'Is': Certainty Distortion in Language Model Rewriting

📅 2026-06-05

🤖 AI Summary

This study examines certainty distortion: meaningful changes in how confidently a claim is expressed when a language model rewrites scientific or medical text while preserving its semantic content. The work proposes an LM-based evaluation metric that is consistent with population-level human judgments of certainty and uses it to characterize the phenomenon across model sizes and families, as well as under prompt-based interventions. Certainty distortion affects up to 75% of model outputs and is asymmetric in rewriting tasks: most models are 1.5–2 times more likely to increase expressed certainty than to decrease it. The effect can compound over repeated paraphrasing; in the medical domain, claude-haiku-4-5 increases the certainty of 20% of examples after one iteration and 40% after five. Prompt-based interventions reduce overall distortion but do not eliminate it. The findings point to a general bias toward inflating expressed certainty, with direct implications for users who rely on language models in high-stakes domains.

📝 Abstract

Humans increasingly turn to Language Models (LMs) in ways that shape beliefs and drive decisions, including discussing, rewriting, and summarizing information from scientific articles, news, and medical reports. However, in these domains, where how confidently a claim is expressed matters, little is known about whether LMs faithfully preserve it. In this work, we investigate certainty distortion in LMs, defined as meaningful changes in expressed certainty when semantic content is preserved. We propose an LM-based evaluation metric that is consistent with population-level judgments of certainty. Using this metric, we characterize certainty distortion across different sizes and families of models in the context of scientific and medical communication tasks. Our results show that certainty distortion affects up to 75% of LM outputs and is systematically asymmetric in rewriting tasks, with most LMs being 1.5-2× more likely to increase the expressed certainty than to decrease it. These effects can compound over repeated paraphrasing: in the medical domain, claude-haiku-4-5 increases the certainty of 20% of examples after a single iteration, rising to 40% after five iterations. Prompt-based interventions reduce overall certainty distortion but do not eliminate it. Together, these findings reveal a general bias toward inflating expressed certainty, with direct implications for users who rely on LMs in high-stakes domains.
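
The two headline quantities in the abstract, the share of outputs whose certainty changes and the ratio of increases to decreases, can be illustrated with a small sketch. Everything here is an assumption for illustration: the numeric certainty scores, the `threshold` that decides what counts as a meaningful change, and the names `summarize_distortion` and `DistortionSummary` are hypothetical and are not taken from the paper, whose actual metric is LM-based and is not specified on this page.

```python
from dataclasses import dataclass


@dataclass
class DistortionSummary:
    """Counts of rewrites whose certainty rose, fell, or stayed put."""
    increased: int
    decreased: int
    preserved: int

    @property
    def distortion_rate(self) -> float:
        # Fraction of rewrites whose certainty changed in either direction.
        total = self.increased + self.decreased + self.preserved
        return (self.increased + self.decreased) / total if total else 0.0

    @property
    def asymmetry_ratio(self) -> float:
        # How many times more often certainty rose than fell.
        if self.decreased == 0:
            return float("inf") if self.increased else 0.0
        return self.increased / self.decreased


def summarize_distortion(source_scores, rewrite_scores, threshold=0.5):
    """Compare paired certainty scores for source texts and their rewrites.

    `threshold` is a hypothetical cutoff: a change smaller than this
    (in absolute value) is treated as certainty being preserved.
    """
    if len(source_scores) != len(rewrite_scores):
        raise ValueError("score lists must have the same length")
    increased = decreased = preserved = 0
    for src, rew in zip(source_scores, rewrite_scores):
        delta = rew - src
        if delta > threshold:
            increased += 1
        elif delta < -threshold:
            decreased += 1
        else:
            preserved += 1
    return DistortionSummary(increased, decreased, preserved)


if __name__ == "__main__":
    # Made-up scores on an arbitrary scale, for demonstration only.
    source = [2.0, 3.0, 4.0, 1.0, 5.0]
    rewrite = [3.5, 3.2, 3.0, 2.0, 5.0]
    summary = summarize_distortion(source, rewrite)
    print(summary.distortion_rate, summary.asymmetry_ratio)
```

Under this reading, an asymmetry ratio between 1.5 and 2 corresponds to the abstract's statement that most models are 1.5-2 times more likely to increase expressed certainty than to decrease it.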

Problem

Research questions and friction points this paper is trying to address.

certainty distortion

language models

rewriting

scientific communication

medical reports

Innovation

Methods, ideas, or system contributions that make the work stand out.

certainty distortion

language model bias

faithful rewriting