Large Language Models Empowered Personalized Web Agents

📅 2024-10-22

🏛️ arXiv.org

📈 Citations: 8

✨ Influential: 1

🤖 AI Summary

Existing web agents overlook user-specific data, such as user profiles and historical interaction traces, so they interpret instructions and execute actions generically. This work formulates the personalized web agent task and makes two contributions: PersonalWAB, a benchmark that pairs user instructions with personalized user data for evaluation, and PUMA, a framework that combines a personalized memory bank, task-specific retrieval, LLM fine-tuning, and direct preference optimization (DPO). By grounding the agent in retrieved user history and aligning its actions with user intent, PUMA is reported to outperform existing web agents on PersonalWAB, supporting the claim that explicit personalization modeling improves web agent performance.

📝 Abstract

Web agents have emerged as a promising direction to automate Web task completion based on user instructions, significantly enhancing user experience. Recently, Web agents have evolved from traditional agents to Large Language Model (LLM)-based Web agents. Despite their success, existing LLM-based Web agents overlook the importance of personalized data (e.g., user profiles and historical Web behaviors) in assisting the understanding of users' personalized instructions and executing customized actions. To overcome this limitation, we first formulate the task of LLM-empowered personalized Web agents, which integrate personalized data and user instructions to personalize instruction comprehension and action execution. To address the absence of a comprehensive evaluation benchmark, we construct a Personalized Web Agent Benchmark (PersonalWAB), featuring user instructions, personalized user data, Web functions, and two evaluation paradigms across three personalized Web tasks. Moreover, we propose a Personalized User Memory-enhanced Alignment (PUMA) framework to adapt LLMs to the personalized Web agent task. PUMA utilizes a memory bank with a task-specific retrieval strategy to filter relevant historical Web behaviors. Based on these behaviors, PUMA then aligns LLMs for personalized action execution through fine-tuning and direct preference optimization. Extensive experiments validate the superiority of PUMA over existing Web agents on PersonalWAB.
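The memory-bank step described in the abstract can be illustrated with a small sketch. This is not PUMA's actual retrieval strategy, which this page does not specify. It assumes a hypothetical `Behavior` record with a task label and a text description, filters the memory by task type, and ranks the remaining entries by simple token overlap (Jaccard similarity) with the instruction. All names and sample data are invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Behavior:
    """One historical Web behavior (hypothetical schema, not from the paper)."""
    task: str  # e.g. "search", "recommend", "review"
    text: str  # free-text description of what the user did


def tokenize(text: str) -> set[str]:
    """Lowercase whitespace tokenization; a stand-in for a real encoder."""
    return set(text.lower().split())


def retrieve(memory: list[Behavior], instruction: str, task: str, k: int = 2) -> list[Behavior]:
    """Keep behaviors of the requested task type, ranked by Jaccard overlap."""
    query = tokenize(instruction)

    def score(b: Behavior) -> float:
        tokens = tokenize(b.text)
        union = query | tokens
        return len(query & tokens) / len(union) if union else 0.0

    # Task filter first (the "task-specific" part), then similarity ranking.
    candidates = [b for b in memory if b.task == task]
    return sorted(candidates, key=score, reverse=True)[:k]


# Invented sample memory for one user.
memory = [
    Behavior("search", "searched for trail running shoes size 10"),
    Behavior("review", "reviewed wireless headphones and praised battery life"),
    Behavior("search", "searched for waterproof hiking jacket"),
    Behavior("search", "searched for running socks"),
]
top = retrieve(memory, "find running shoes", task="search", k=2)
for b in top:
    print(b.text)
```

The retrieved behaviors would then be placed in the agent's prompt alongside the instruction. A production system would likely replace the token-overlap score with dense embeddings, but the filter-then-rank shape stays the same.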

Problem

Research questions and friction points this paper is trying to address.

Enhancing Web agents with personalized user data for better task completion

Creating a benchmark (PersonalWAB) to evaluate personalized Web agent performance

Proposing the PUMA framework to align LLMs with personalized Web tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

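Integrates personalized data (user profiles and historical Web behaviors) into instruction comprehension and action execution

Constructs the Personalized Web Agent Benchmark (PersonalWAB)

Proposes the PUMA framework, which combines memory-bank retrieval with fine-tuning and direct preference optimization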
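For readers unfamiliar with direct preference optimization, the sketch below computes the standard DPO loss for a single preference pair. It shows the general objective only; how PUMA builds its preference pairs and which hyperparameters it uses are not stated on this page, so the `beta` value and the log-probabilities in the example are assumptions.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    loss = -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))
    where each argument is a sequence log-probability.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical log-probabilities: the policy prefers the chosen action
# more strongly than the reference model does, so the loss drops below log(2).
loss = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(loss < math.log(2))
```

When the policy and the reference model agree exactly, the margin is zero and the loss equals log 2. Training pushes the margin up, which raises the likelihood of the preferred action relative to the rejected one.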
