SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning

📅 2025-05-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limitation of monotonous latent-space exploration during large language model (LLM) inference—where all decoded reasoning paths originate from the same fixed latent thought, limiting performance gains—this paper proposes SoftCoT++, a method that extends SoftCoT to the Test-Time Scaling paradigm by diversifying reasoning trajectories in the continuous latent space. Specifically, it perturbs latent thoughts via multiple specialized initial tokens and employs contrastive learning to promote diversity among the resulting soft thought representations. Evaluated on five mainstream reasoning benchmarks across two distinct LLM architectures, SoftCoT++ consistently outperforms baselines including standard SoftCoT and SoftCoT with self-consistency scaling, and remains compatible with conventional scaling techniques such as self-consistency. The implementation is publicly available.

📝 Abstract
Test-Time Scaling (TTS) refers to approaches that improve reasoning performance by allocating extra computation during inference, without altering the model's parameters. While existing TTS methods operate in a discrete token space by generating more intermediate steps, recent studies in Coconut and SoftCoT have demonstrated that thinking in the continuous latent space can further enhance the reasoning performance. Such latent thoughts encode informative thinking without the information loss associated with autoregressive token generation, sparking increased interest in continuous-space reasoning. Unlike discrete decoding, where repeated sampling enables exploring diverse reasoning paths, latent representations in continuous space are fixed for a given input, which limits diverse exploration, as all decoded paths originate from the same latent thought. To overcome this limitation, we introduce SoftCoT++ to extend SoftCoT to the Test-Time Scaling paradigm by enabling diverse exploration of thinking paths. Specifically, we perturb latent thoughts via multiple specialized initial tokens and apply contrastive learning to promote diversity among soft thought representations. Experiments across five reasoning benchmarks and two distinct LLM architectures demonstrate that SoftCoT++ significantly boosts SoftCoT and also outperforms SoftCoT with self-consistency scaling. Moreover, it shows strong compatibility with conventional scaling techniques such as self-consistency. Source code is available at https://github.com/xuyige/SoftCoT.
Problem

Research questions and friction points this paper is trying to address.

Enhancing reasoning performance via continuous latent space exploration
Overcoming limited diverse exploration in fixed latent representations
Improving SoftCoT with diverse thinking paths and contrastive learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Perturbs latent thoughts via specialized initial tokens
Applies contrastive learning for diverse soft thoughts
Enables diverse exploration in continuous latent space
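The innovation bullets above can be illustrated with a toy sketch. In SoftCoT++ the perturbation and the contrastive objective are learned end-to-end inside the model; here, as a simplification, additive shifts from hypothetical initial-token embeddings stand in for the perturbation, and mean pairwise cosine similarity stands in for the contrastive loss (minimizing it spreads the soft thoughts apart). All function names and shapes are illustrative, not from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_latent_thoughts(base_thought, init_token_embeds):
    # One soft-thought variant per specialized initial token.
    # Illustrative additive shift; the actual perturbation is learned.
    return np.stack([base_thought + e for e in init_token_embeds])

def diversity_loss(thoughts):
    # Mean off-diagonal cosine similarity among soft thoughts.
    # Toy stand-in for the paper's contrastive objective: lower values
    # mean the perturbed thoughts are more spread out in latent space.
    t = thoughts / np.linalg.norm(thoughts, axis=1, keepdims=True)
    sim = t @ t.T
    n = len(t)
    return (sim.sum() - n) / (n * (n - 1))

base = rng.standard_normal(16)          # shared latent thought for one input
embeds = rng.standard_normal((4, 16))   # 4 hypothetical initial-token embeddings
variants = perturb_latent_thoughts(base, embeds)
print(variants.shape)                                    # (4, 16)
print(diversity_loss(variants) < diversity_loss(np.stack([base] * 4)))
```

The final comparison shows the key point: identical latent thoughts score maximal similarity (loss 1.0), while perturbed variants score lower, so decoding from each variant can explore a distinct reasoning path.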