Language Models Change Facts Based on the Way You Talk

📅 2025-07-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) exhibit systematic biases in high-stakes domains—including healthcare, law, politics, welfare, and salary recommendation—when exposed to implicit identity markers (e.g., race, gender, age) embedded in user inputs, leading to disparities in clinical standards, inequitable compensation, and distorted factual judgments. This study presents the first comprehensive empirical investigation into how subtle linguistic identity cues implicitly steer LLM decision-making. Leveraging large-scale models, fine-grained textual analysis, and rigorously controlled cross-domain experiments, we quantify response biases induced by demographic attributes. We introduce a novel identity-encoding evaluation framework that uncovers structural inequities: notably, lower salary recommendations for non-White job applicants, anomalously higher offers for women, and age-correlated ideological stereotyping. Our work establishes a theoretically grounded, reproducible methodology for fairness assessment and bias mitigation in LLMs.

Technology Category

Application Category

📝 Abstract

Large language models (LLMs) are increasingly being used in user-facing applications, from providing medical consultations to job interview advice. Recent research suggests that these models are becoming increasingly proficient at inferring identity information about the author of a piece of text from linguistic patterns as subtle as the choice of a few words. However, little is known about how LLMs use this information in their decision-making in real-world applications. We perform the first comprehensive analysis of how identity markers present in a user's writing bias LLM responses across five different high-stakes LLM applications in the domains of medicine, law, politics, government benefits, and job salaries. We find that LLMs are extremely sensitive to markers of identity in user queries and that race, gender, and age consistently influence LLM responses in these applications. For instance, when providing medical advice, we find that models apply different standards of care to individuals of different ethnicities for the same symptoms; we find that LLMs are more likely to alter answers to align with a conservative (liberal) political worldview when asked factual questions by older (younger) individuals; and that LLMs recommend lower salaries for non-White job applicants and higher salaries for women compared to men. Taken together, these biases mean that the use of off-the-shelf LLMs for these applications may cause harmful differences in medical care, foster wage gaps, and create different political factual realities for people of different identities. Beyond providing an analysis, we also provide new tools for evaluating how subtle encoding of identity in users' language choices impacts model decisions. Given the serious implications of these findings, we recommend that similar thorough assessments of LLM use in user-facing applications are conducted before future deployment.

Problem

Research questions and friction points this paper is trying to address.

Analyzing how identity markers bias LLM responses in high-stakes applications

Investigating racial, gender, and age influences on LLM decision-making

Evaluating harmful biases in LLM outputs for medical, legal, and job scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzing identity markers in user queries

Evaluating bias in five high-stakes applications

Providing tools for assessing identity impact

🔎 Similar Papers

No similar papers found.

Authors to Follow