Structure-Induced Information for Rerooting Levin Tree Search

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the computational overhead and limited scalability of subgoal-based policy tree search on complex deterministic single-agent problems, both of which stem from explicit subgoal generation. The authors instead use a learned “rerooter” with the √LTS algorithm, which implicitly decomposes a problem into soft subtasks, so the search never has to construct or reason over explicit subgoals and can allocate its effort more efficiently. Whereas earlier work on √LTS concentrated on formal guarantees for given or handcrafted rerooters, this paper proposes three rerooter designs: a clustering-based one that exploits global state-space structure, a heuristic-based one that uses learned cost-to-go estimates, and a hybrid that combines both signals. In experiments, the rerooting-based methods scale to complex environments where subgoal-based policy tree search fails and achieve state-of-the-art online training efficiency on the domains tested.

📝 Abstract

Subgoal-based policy tree search, which uses a policy to guide search, is effective for complex single-agent deterministic problems but often relies on explicit subgoal generation, which can incur substantial overhead and hinder scalability. In this paper, we overcome these limitations by using a learned “rerooter” through the recently introduced $\sqrt{\text{LTS}}$ algorithm. A rerooter implicitly decomposes the problem into soft subtasks. While previous work focused on the formal guarantees for given or handcrafted rerooters, in this work we propose three rerooter designs: (i) a clustering-based rerooter that exploits global state-space structure, (ii) a heuristic-based rerooter that leverages learned cost-to-go estimates, and (iii) a hybrid that combines both signals. Our framework avoids having to explicitly reconstruct and reason over generated subgoals, thereby enabling scalable allocation of search effort with significantly lower computational overhead. Empirically, our rerooting-based methods scale to complex environments where subgoal-based policy tree search fails, and achieve state-of-the-art online training efficiency on the domains tested.
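
For readers unfamiliar with policy-guided tree search, the following is a minimal sketch of plain Levin Tree Search, the base algorithm that $\sqrt{\text{LTS}}$ builds on. It is not the paper's method: it contains no rerooter and no subtask decomposition. It assumes the node cost (depth + 1) / π(path), which is one common convention and may differ from the paper's exact definition. The function name, its parameters, and the toy number-line domain are all hypothetical and chosen only for illustration.

```python
import heapq
from itertools import count


def levin_tree_search(root, is_goal, successors, policy, max_expansions=10_000):
    """Best-first search ordered by (depth + 1) / path_probability.

    Assumed interfaces (hypothetical, not from the paper):
      successors(state) -> list of (action, next_state)
      policy(state)     -> dict mapping action -> probability
    Returns the list of actions reaching a goal, or None if the budget runs out.
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    # Entries: (cost, tie, state, depth, path_probability, actions_so_far)
    frontier = [(1.0, next(tie), root, 0, 1.0, [])]
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, state, depth, prob, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        expansions += 1
        action_probs = policy(state)
        for action, child in successors(state):
            child_prob = prob * action_probs.get(action, 0.0)
            if child_prob <= 0.0:
                continue  # the policy rules this branch out entirely
            child_depth = depth + 1
            cost = (child_depth + 1) / child_prob
            heapq.heappush(
                frontier,
                (cost, next(tie), child, child_depth, child_prob, path + [action]),
            )
    return None


# Toy domain: walk along the integers from 0 to 3.
def toy_successors(state):
    return [("inc", state + 1), ("dec", state - 1)]


def toy_policy(state):
    # A fixed policy that prefers moving toward larger numbers.
    return {"inc": 0.8, "dec": 0.2}


plan = levin_tree_search(0, lambda s: s == 3, toy_successors, toy_policy)
print(plan)
```

Because the cost divides by the path probability, branches the policy considers unlikely are postponed rather than pruned. A rerooter, as described in the abstract, would additionally influence where search effort is concentrated within the tree; that part is deliberately left out of this sketch.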

Problem

Research questions and friction points this paper is trying to address.

subgoal-based policy tree search

scalability

computational overhead

single-agent deterministic problems

Innovation

Methods, ideas, or system contributions that make the work stand out.

rerooting

Levin Tree Search

subgoal-free decomposition

scalable policy search