T-SCEND: Test-time Scalable MCTS-enhanced Diffusion Model

📅 2025-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the energy inconsistency and poor test-time scalability of diffusion models on reasoning tasks, this paper proposes an energy-consistent training and phased scalable denoising framework. Methodologically: (1) it introduces a linear-regression-form negative contrastive learning objective, jointly regularized by a KL term, to enforce smoothness and performance-consistency in the learned energy landscape; (2) it proposes hybrid Monte Carlo Tree Search (hMCTS), which combines best-of-N stochastic sampling with tree search to strengthen long-horizon reasoning while remaining computationally tractable. Experiments show an 88% success rate on 15×15 maze solving, where baseline diffusion models fail entirely, and significant gains over standard diffusion models on complex structured reasoning tasks such as Sudoku. The work is among the first to empirically validate both the feasibility and the scalability of energy-based models for combinatorial reasoning.
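The phased denoising idea in the summary can be sketched in a few lines. The following toy example is an assumption-laden illustration, not the paper's implementation: `denoise_step` and `energy` stand in for the trained diffusion denoiser and its learned energy function, and the "MCTS" phase is reduced to a short rollout-based value estimate to keep the sketch minimal.

```python
import random

# Toy stand-ins (assumptions): a real system would use the trained
# diffusion denoiser and the learned energy model from T-SCEND.
def denoise_step(x, t):
    """One reverse-diffusion step that contracts toward 0.0 (toy)."""
    return x * 0.8 + random.gauss(0, 0.1)

def energy(x):
    """Lower is better; a toy quadratic energy landscape."""
    return x * x

def hmcts_denoise(x0, n_steps=10, n_candidates=8, switch_frac=0.5):
    """Phased search: best-of-N random search for the early (noisy)
    steps, then rollout-scored selection (a minimal stand-in for MCTS
    value estimation) for the later steps."""
    x = x0
    for t in range(n_steps):
        cands = [denoise_step(x, t) for _ in range(n_candidates)]
        if t < switch_frac * n_steps:
            # Early phase: cheap best-of-N on the immediate energy.
            x = min(cands, key=energy)
        else:
            # Late phase: score each candidate by the energy reached
            # after completing the remaining denoising steps.
            def rollout_value(c):
                y = c
                for s in range(t + 1, n_steps):
                    y = denoise_step(y, s)
                return energy(y)
            x = min(cands, key=rollout_value)
    return x

random.seed(0)
print(abs(hmcts_denoise(5.0)) < 2.0)
```

The design point the sketch captures: early in denoising, samples are too noisy for deep search to pay off, so cheap best-of-N suffices; near the end, lookahead over the remaining steps becomes informative, which is when tree search is worth its cost.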

📝 Abstract
We introduce Test-time Scalable MCTS-enhanced Diffusion Model (T-SCEND), a novel framework that significantly improves diffusion model's reasoning capabilities with better energy-based training and scaling up test-time computation. We first show that naïvely scaling up inference budget for diffusion models yields marginal gain. To address this, the training of T-SCEND consists of a novel linear-regression negative contrastive learning objective to improve the performance-energy consistency of the energy landscape, and a KL regularization to reduce adversarial sampling. During inference, T-SCEND integrates the denoising process with a novel hybrid Monte Carlo Tree Search (hMCTS), which sequentially performs best-of-N random search and MCTS as denoising proceeds. On challenging reasoning tasks of Maze and Sudoku, we demonstrate the effectiveness of T-SCEND's training objective and scalable inference method. In particular, trained with Maze sizes of up to $6\times6$, our T-SCEND solves $88\%$ of Maze problems with much larger sizes of $15\times15$, while standard diffusion completely fails. Code to reproduce the experiments can be found at https://github.com/AI4Science-WestlakeU/t_scend.
Problem

Research questions and friction points this paper is trying to address.

Diffusion models' limited reasoning capability on structured tasks
Inconsistency between learned energy landscapes and task performance
Naively scaling up test-time computation yields only marginal gains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear-regression negative contrastive learning
KL regularization reduces adversarial sampling
Hybrid Monte Carlo Tree Search (hMCTS) integration
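The first two innovation points can be illustrated with a generic energy-contrastive loss. This is a hedged sketch, not the paper's exact "linear-regression-form" objective: the regression targets (0 for data samples, 1 for negatives) and the variance-style penalty standing in for the KL regularizer are assumptions chosen for illustration.

```python
def contrastive_loss(pos_energies, neg_energies, smooth_weight=0.1):
    """Generic energy-contrastive objective (illustrative assumption):
    regress ground-truth (positive) energies toward 0 and model-sampled
    (negative) energies toward 1, plus a smoothness penalty."""
    reg = sum(e ** 2 for e in pos_energies) / len(pos_energies) \
        + sum((e - 1.0) ** 2 for e in neg_energies) / len(neg_energies)
    # Stand-in for the KL regularizer: penalize large spread among
    # negative energies to keep the landscape smooth.
    mean_neg = sum(neg_energies) / len(neg_energies)
    smooth = sum((e - mean_neg) ** 2 for e in neg_energies) / len(neg_energies)
    return reg + smooth_weight * smooth

print(contrastive_loss([0.1, 0.2], [0.9, 1.1]))
```

The intent mirrored here: regression-style targets make the energy gap between correct and incorrect samples explicit (performance-energy consistency), while the regularizer discourages sharp, adversarially exploitable dips in the landscape.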