T-SCEND: Test-time Scalable MCTS-enhanced Diffusion Model

📅 2025-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the energy inconsistency and poor test-time scalability of diffusion models on reasoning tasks, this paper proposes an energy-consistent training and phased scalable denoising framework. Methodologically: (1) it introduces a linear-regression-form negative contrastive learning objective, jointly regularized by a KL term, to enforce smoothness and performance-consistency in the learned energy landscape; (2) it proposes hybrid Monte Carlo Tree Search (hMCTS), which combines best-of-N stochastic sampling with tree search to strengthen long-horizon reasoning while remaining computationally tractable. Experiments show an 88% success rate on 15×15 maze solving, where baseline diffusion models fail entirely, and significant gains over standard diffusion models on complex structured reasoning tasks such as Sudoku. The work is among the first to empirically validate both the feasibility and the scalability of energy-based models for combinatorial reasoning.
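The phased denoising idea in the summary can be sketched in a few lines. The following toy example is an assumption-laden illustration, not the paper's implementation: `denoise_step` and `energy` stand in for the trained diffusion denoiser and its learned energy function, and the "MCTS" phase is reduced to a short rollout-based value estimate to keep the sketch minimal.

```python
import random

# Toy stand-ins (assumptions): a real system would use the trained
# diffusion denoiser and the learned energy model from T-SCEND.
def denoise_step(x, t):
    """One reverse-diffusion step that contracts toward 0.0 (toy)."""
    return x * 0.8 + random.gauss(0, 0.1)

def energy(x):
    """Lower is better; a toy quadratic energy landscape."""
    return x * x

def hmcts_denoise(x0, n_steps=10, n_candidates=8, switch_frac=0.5):
    """Phased search: best-of-N random search for the early (noisy)
    steps, then rollout-scored selection (a minimal stand-in for MCTS
    value estimation) for the later steps."""
    x = x0
    for t in range(n_steps):
        cands = [denoise_step(x, t) for _ in range(n_candidates)]
        if t < switch_frac * n_steps:
            # Early phase: cheap best-of-N on the immediate energy.
            x = min(cands, key=energy)
        else:
            # Late phase: score each candidate by the energy reached
            # after completing the remaining denoising steps.
            def rollout_value(c):
                y = c
                for s in range(t + 1, n_steps):
                    y = denoise_step(y, s)
                return energy(y)
            x = min(cands, key=rollout_value)
    return x

random.seed(0)
print(abs(hmcts_denoise(5.0)) < 2.0)
```

The design point the sketch captures: early in denoising, samples are too noisy for deep search to pay off, so cheap best-of-N suffices; near the end, lookahead over the remaining steps becomes informative, which is when tree search is worth its cost.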

📝 Abstract
We introduce Test-time Scalable MCTS-enhanced Diffusion Model (T-SCEND), a novel framework that significantly improves diffusion model's reasoning capabilities with better energy-based training and scaling up test-time computation. We first show that naïvely scaling up inference budget for diffusion models yields marginal gain. To address this, the training of T-SCEND consists of a novel linear-regression negative contrastive learning objective to improve the performance-energy consistency of the energy landscape, and a KL regularization to reduce adversarial sampling. During inference, T-SCEND integrates the denoising process with a novel hybrid Monte Carlo Tree Search (hMCTS), which sequentially performs best-of-N random search and MCTS as denoising proceeds. On challenging reasoning tasks of Maze and Sudoku, we demonstrate the effectiveness of T-SCEND's training objective and scalable inference method. In particular, trained with Maze sizes of up to $6\times6$, our T-SCEND solves $88\%$ of Maze problems with much larger sizes of $15\times15$, while standard diffusion completely fails. Code to reproduce the experiments can be found at https://github.com/AI4Science-WestlakeU/t_scend.
Problem

Research questions and friction points this paper is trying to address.

Diffusion models' limited reasoning capability on structured tasks
Inconsistency between learned energy landscapes and task performance
Naively scaling up test-time computation yields only marginal gains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear-regression negative contrastive learning
KL regularization reduces adversarial sampling
Hybrid Monte Carlo Tree Search (hMCTS) integration
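The first two innovation points can be illustrated with a generic energy-contrastive loss. This is a hedged sketch, not the paper's exact "linear-regression-form" objective: the regression targets (0 for data samples, 1 for negatives) and the variance-style penalty standing in for the KL regularizer are assumptions chosen for illustration.

```python
def contrastive_loss(pos_energies, neg_energies, smooth_weight=0.1):
    """Generic energy-contrastive objective (illustrative assumption):
    regress ground-truth (positive) energies toward 0 and model-sampled
    (negative) energies toward 1, plus a smoothness penalty."""
    reg = sum(e ** 2 for e in pos_energies) / len(pos_energies) \
        + sum((e - 1.0) ** 2 for e in neg_energies) / len(neg_energies)
    # Stand-in for the KL regularizer: penalize large spread among
    # negative energies to keep the landscape smooth.
    mean_neg = sum(neg_energies) / len(neg_energies)
    smooth = sum((e - mean_neg) ** 2 for e in neg_energies) / len(neg_energies)
    return reg + smooth_weight * smooth

print(contrastive_loss([0.1, 0.2], [0.9, 1.1]))
```

The intent mirrored here: regression-style targets make the energy gap between correct and incorrect samples explicit (performance-energy consistency), while the regularizer discourages sharp, adversarially exploitable dips in the landscape.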