🤖 AI Summary
Existing uncertainty quantification methods for large language models (LLMs) exhibit poor alignment with human judgments of uncertainty, undermining controllability and user trust.
Method: We conduct the first systematic evaluation of over ten uncertainty metrics—including Bayesian estimation and top-k entropy—against human behavioral data. We identify top-k entropy and related local entropy measures as significantly outperforming global confidence-based metrics, although the human-alignment of individual measures varies with model size. To reduce this size-dependency, we combine multiple uncertainty metrics in a weighted fusion fit by multiple linear regression.
Results: The combined measure improves correlation between model-predicted uncertainty and human judgments across diverse LLMs (average Spearman ρ increase of 23.6%) while reducing dependence on model size. This offers a generalizable, interpretable paradigm for uncertainty modeling in trustworthy AI systems, enhancing both reliability and human-AI alignment.
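As a concrete illustration of the top-k entropy metric named above, here is a minimal sketch of one plausible definition: restrict the next-token distribution to its k largest probabilities, renormalize, and take the Shannon entropy. The function name, the choice of k, and the exact renormalization are assumptions for illustration; the paper's precise formulation may differ.

```python
import numpy as np

def top_k_entropy(probs, k=10):
    """Entropy over the k most probable next tokens, renormalized.

    A hypothetical sketch of "top-k entropy": keep only the k largest
    next-token probabilities, renormalize them to sum to 1, and compute
    Shannon entropy over that local distribution.
    """
    p = np.asarray(probs, dtype=float)
    top = np.sort(p)[-k:]      # the k largest probabilities
    top = top / top.sum()      # renormalize to a proper distribution
    # Small epsilon guards against log(0) for zero-probability entries.
    return float(-(top * np.log(top + 1e-12)).sum())
```

A uniform distribution over the top k tokens yields the maximum value log k, while a sharply peaked distribution yields a value near 0, so higher top-k entropy signals greater local uncertainty about the next token.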
📝 Abstract
Recent work has sought to quantify large language model uncertainty to facilitate model control and modulate user trust. Previous work focuses on measures of uncertainty that are theoretically grounded or that reflect the model's average overt behavior. In this work, we investigate a variety of uncertainty measures in order to identify those that correlate with human group-level uncertainty. We find that Bayesian measures and a variation on entropy measures, top-k entropy, tend to agree with human behavior as a function of model size. Some otherwise strong measures decrease in human-similarity as model size grows, but combining multiple uncertainty measures via multiple linear regression provides comparable human-alignment with reduced size-dependency.
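The combination step described above can be sketched as an ordinary least-squares fit of several per-item uncertainty metrics to a human uncertainty score, evaluated by Spearman correlation. The data below is synthetic and purely illustrative (not the paper's data), and the specific metrics named in the comments are stand-ins.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Synthetic illustration: three per-item uncertainty metrics and a
# human group-level uncertainty score for 200 items (hypothetical data).
n = 200
human = rng.uniform(0, 1, n)
metrics = np.column_stack([
    human + rng.normal(0, 0.3, n),   # e.g. top-k entropy (stand-in)
    human + rng.normal(0, 0.5, n),   # e.g. a Bayesian measure (stand-in)
    rng.normal(0, 1, n),             # an uninformative metric
])

# Multiple linear regression: least-squares weights mapping metrics
# (plus an intercept) onto the human scores.
X = np.column_stack([metrics, np.ones(n)])
w, *_ = np.linalg.lstsq(X, human, rcond=None)
combined = X @ w

# Compare Spearman correlation of each single metric vs. the combination.
for i in range(metrics.shape[1]):
    print(f"metric {i}: rho = {spearmanr(metrics[:, i], human)[0]:.3f}")
print(f"combined : rho = {spearmanr(combined, human)[0]:.3f}")
```

Because the regression downweights uninformative inputs, the combined score tracks the human ranking about as well as the best single metric, which is the mechanism the abstract appeals to for reducing dependence on any one size-sensitive measure.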