What Am I Missing? Question-Answering as Hidden State Probing

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

career value

178K/year

🤖 AI Summary

Large language models often exhibit inconsistent outputs despite identical inputs, and the underlying mechanisms remain poorly understood. This work conceptualizes questioning as a probing signal that interrogates the model’s hidden states. By training probes within a student–teacher framework, the study analyzes shifts in hidden representations before and after questioning to predict answer correctness. Furthermore, it formulates questioning as a sequential decision-making problem and introduces a gating strategy informed by probe feedback to optimize reasoning trajectories. Experimental results demonstrate that the probes effectively predict trajectory correctness; however, a gap persists between diagnosis and correction—while questioning can identify uncertainty, it may equally disrupt correct trajectories or repair erroneous ones.

📝 Abstract

Test-time reasoning has become a significant field of study since the introduction of chain-of-thought reasoning in large language models (LLMs). However, the mechanisms of this reasoning process are still under-explored -- from the same input prompt, and even the same partial solution, LLMs can produce varied answers if sampled multiple times. We propose to leverage question-asking as an inference-time intervention that articulates information about the model's hidden state. To achieve that, we present a student-teacher setting where a student asks questions to a teacher. We train a probe on the student's hidden state before and after asking a question and find it is predictive of the trajectory's final correctness, even before generating the teacher's answer. This suggests there is a meaningful signal from the self-diagnosis that occurs during question generation rather than information transfer from the teacher. We then frame question-asking as a sequential decision problem, using this probe as a quality score, and define a gating policy to ask questions that maximize likelihood of correctness. We find that the success of question-asking as an intervention is largely dependent on the model's self-consistency. Our empirical results show a gap between detection and recovery; while our gating policy captures model correctness and uncertainty, interventions are equally likely to harm correct trajectories as they are to recover incorrect ones. This gap between diagnosis and correction has broader implications on language models' capacity for self-refinement under uncertainty.

Problem

Research questions and friction points this paper is trying to address.

question-asking

hidden state probing

test-time reasoning

self-consistency

self-refinement

Innovation

Methods, ideas, or system contributions that make the work stand out.

hidden state probing

question-asking intervention

self-diagnosis