Go-Browse: Training Web Agents with Structured Exploration

📅 2025-06-04
🤖 AI Summary
Web agents often fail on unfamiliar websites because they do not know which pages they must visit to reach a goal. Go-Browse addresses this with structured exploration: it frames data collection as a graph search over a website, so that information discovered in one exploration episode can be reused in later ones, which makes it practical to collect diverse and realistic trajectories at scale. Instantiated on the WebArena benchmark, the method yields a dataset of 10K successful task-solving trajectories and 40K interaction steps across 100 URLs. A 7B parameter language model fine-tuned on this dataset achieves a 21.7% success rate on WebArena, outperforming GPT-4o mini by 2.4 percentage points and setting a new state of the art for sub-10B parameter models.

📝 Abstract
One of the fundamental problems in digital agents is their lack of understanding of their environment. For instance, a web browsing agent may get lost in unfamiliar websites, uncertain what pages must be visited to achieve its goals. To address this, we propose Go-Browse, a method for automatically collecting diverse and realistic web agent data at scale through structured exploration of web environments. Go-Browse achieves efficient exploration by framing data collection as a graph search, enabling reuse of information across exploration episodes. We instantiate our method on the WebArena benchmark, collecting a dataset of 10K successful task-solving trajectories and 40K interaction steps across 100 URLs. Fine-tuning a 7B parameter language model on this dataset achieves a success rate of 21.7% on the WebArena benchmark, beating GPT-4o mini by 2.4% and exceeding current state-of-the-art results for sub-10B parameter models by 2.9%.
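
To make the graph-search framing concrete, the following is a minimal sketch of breadth-first exploration in which the set of discovered pages persists across episodes. It is an illustration under stated assumptions, not the Go-Browse implementation: the `TOY_SITE` link map, the `ExplorationGraph` class, and the `explore` helper are all hypothetical names, and a real system would drive a browser and propose and solve tasks on each page rather than read links from a dictionary.

```python
from collections import deque

# Hypothetical toy site (an assumption for illustration):
# each URL maps to the URLs reachable from it.
TOY_SITE = {
    "/": ["/products", "/orders"],
    "/products": ["/products/1", "/"],
    "/orders": ["/orders/7"],
    "/products/1": [],
    "/orders/7": ["/orders"],
}


class ExplorationGraph:
    """Shared state that survives across exploration episodes."""

    def __init__(self, start_url):
        self.frontier = deque([start_url])  # pages discovered but not yet explored
        self.discovered = {start_url}       # every page seen so far
        self.edges = {}                     # explored page -> outgoing links

    def run_episode(self, get_links):
        """Explore one frontier page; return its URL, or None if nothing is left."""
        if not self.frontier:
            return None
        url = self.frontier.popleft()
        links = list(get_links(url))
        self.edges[url] = links
        for link in links:
            # Reuse earlier episodes' knowledge: never re-queue a known page.
            if link not in self.discovered:
                self.discovered.add(link)
                self.frontier.append(link)
        return url


def explore(start_url, get_links, max_episodes):
    """Run episodes until the frontier is empty or the budget is spent."""
    graph = ExplorationGraph(start_url)
    order = []
    for _ in range(max_episodes):
        url = graph.run_episode(get_links)
        if url is None:
            break
        order.append(url)
    return graph, order


# max_episodes is an arbitrary budget chosen for this toy example.
graph, order = explore("/", lambda u: TOY_SITE.get(u, []), max_episodes=100)
print(order)
```

Because the frontier and the discovered set live on the shared graph object, each episode starts from what earlier episodes already found instead of rediscovering the site from its root.
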
Problem

Research questions and friction points this paper is trying to address.

Web agents' limited understanding of their environment
Difficulty in navigating unfamiliar websites efficiently
Need for scalable data collection for web agent training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured exploration for web agent training
Graph search for efficient data collection
Fine-tuning language models with collected trajectories