Do Latent Tokens Think? A Causal and Adversarial Analysis of Chain-of-Continuous-Thought

📅 2025-12-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates whether the latent tokens in COCONUT genuinely support reasoning or merely serve as uninterpretable placeholders that exploit dataset shortcuts. To address this question, the authors adopt a dual analytical framework grounded in causal intervention (specifically token-level steering) and adversarial robustness. The analysis is the first to systematically expose COCONUT's pseudo-reasoning mechanism: latent tokens do not encode transferable logical structure but instead rely heavily on spurious statistical correlations in the training distribution. Empirical evaluation on MMLU and HotpotQA confirms that COCONUT's apparent performance gains stem from overfitting to in-distribution shortcuts rather than genuine reasoning enhancement, as evidenced by substantial degradation under bias-sensitivity tests and out-of-distribution generalization benchmarks. The study falsifies COCONUT's purported reasoning-representation capability and establishes a new methodological framework for rigorously assessing the credibility of continuous-thinking paradigms.

📝 Abstract
Latent tokens are gaining attention for enhancing reasoning in large language models (LLMs), yet their internal mechanisms remain unclear. This paper examines the problem from a reliability perspective, uncovering fundamental weaknesses: latent tokens function as uninterpretable placeholders rather than encoding faithful reasoning. While resistant to perturbation, they promote shortcut usage over genuine reasoning. We focus on Chain-of-Continuous-Thought (COCONUT), which claims better efficiency and stability than explicit Chain-of-Thought (CoT) while maintaining performance. We investigate this through two complementary approaches. First, steering experiments perturb specific token subsets, namely COCONUT and explicit CoT. Unlike CoT tokens, COCONUT tokens show minimal sensitivity to steering and lack reasoning-critical information. Second, shortcut experiments evaluate models under biased and out-of-distribution settings. Results on MMLU and HotpotQA demonstrate that COCONUT consistently exploits dataset artifacts, inflating benchmark performance without true reasoning. These findings reposition COCONUT as a pseudo-reasoning mechanism: it generates plausible traces that conceal shortcut dependence rather than faithfully representing reasoning processes.
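The steering experiments described in the abstract perturb hidden states at selected token positions (the COCONUT latent span or the explicit CoT span) and measure how sensitive the model's output is to that intervention. A minimal sketch of token-level steering, not the paper's implementation; the function name, the uniform additive perturbation, and the toy dimensions are illustrative assumptions:

```python
import numpy as np

def steer_hidden_states(hidden, positions, direction, alpha=1.0):
    """Add a scaled steering vector to hidden states at selected token positions.

    hidden:    (seq_len, d) array of per-token hidden states
    positions: indices of the tokens to perturb (e.g. latent or CoT span)
    direction: (d,) steering vector
    alpha:     perturbation strength
    """
    steered = hidden.copy()
    steered[positions] += alpha * direction
    return steered

# Toy check: only the targeted positions change.
rng = np.random.default_rng(0)
h = rng.normal(size=(6, 4))          # 6 tokens, hidden size 4
v = np.array([1.0, 0.0, 0.0, 0.0])   # steering direction
h2 = steer_hidden_states(h, [2, 3], v, alpha=0.5)
changed = np.where(np.any(h != h2, axis=1))[0]
print(changed.tolist())  # [2, 3]
```

In the paper's setup, the interesting quantity is how much the downstream answer shifts: CoT token positions are reported as highly sensitive to such perturbations, while COCONUT's latent positions are not.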
Problem

Research questions and friction points this paper is trying to address.

Examines the reliability of latent tokens in reasoning, revealing that they function as uninterpretable placeholders.
Investigates COCONUT's shortcut usage over genuine reasoning via steering and adversarial experiments.
Demonstrates COCONUT exploits dataset artifacts, inflating performance without true reasoning processes.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent tokens lack interpretability and reasoning fidelity
COCONUT tokens show minimal sensitivity to steering perturbations
COCONUT exploits dataset artifacts to inflate benchmark performance
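The shortcut experiments above compare accuracy on biased, in-distribution data against bias-flipped or out-of-distribution variants; a large gap signals reliance on dataset artifacts rather than reasoning. A toy sketch of that evaluation logic, with a hypothetical cue-following model standing in for the shortcut behavior attributed to COCONUT:

```python
def accuracy(preds, golds):
    """Fraction of predictions matching gold labels."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def shortcut_gap(model_fn, in_dist, shifted):
    """Accuracy on an in-distribution split vs. a bias-flipped split.

    A large gap indicates the model keys on a spurious cue
    rather than solving the underlying task.
    """
    acc_id = accuracy([model_fn(x) for x, _ in in_dist], [y for _, y in in_dist])
    acc_sh = accuracy([model_fn(x) for x, _ in shifted], [y for _, y in shifted])
    return acc_id, acc_sh, acc_id - acc_sh

# Toy shortcut learner: predicts from a surface cue in the input.
cue_model = lambda x: x["cue"]

# In-distribution: the cue coincides with the label; shifted: it is flipped.
labels = ["A", "B", "A", "B"]
in_dist = [({"cue": y}, y) for y in labels]
shifted = [({"cue": "A" if y == "B" else "B"}, y) for y in labels]

acc_id, acc_sh, gap = shortcut_gap(cue_model, in_dist, shifted)
print(acc_id, acc_sh, gap)  # 1.0 0.0 1.0
```

The paper's bias-sensitivity tests on MMLU and HotpotQA follow this pattern at scale: COCONUT's accuracy collapses once the spurious correlations it exploits are broken.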
Yuyi Zhang
South China University of Technology
Computer Vision, Diffusion, Image Generation, Handwritten Character Recognition, OCR
Boyu Tang
Shanghai Jiao Tong University
Tianjie Ju
Shanghai Jiao Tong University
Natural Language Processing
Sufeng Duan
Shanghai Jiao Tong University
Gongshen Liu
Shanghai Jiao Tong University