Manipulating LLM Web Agents with Indirect Prompt Injection Attack via HTML Accessibility Tree

📅 2025-07-19

📈 Citations: 0

✨ Influential: 0

career value

213K/year

🤖 AI Summary

This work identifies a novel indirect prompt injection (IPI) attack surface in large language model (LLM)-based web navigation agents that rely on accessibility tree parsing of HTML. Attackers can stealthily inject triggers via maliciously crafted HTML elements across cross-site contexts, enabling behavioral hijacking—e.g., credential exfiltration or forced clicks. We propose the first general-purpose IPI attack paradigm targeting accessibility trees and introduce an efficient adversarial HTML trigger generation method combining Greedy Coordinate Gradient optimization with the BrowserGym framework, evaluated on Llama-3.1. Experiments demonstrate high success rates for both goal-directed and generalized attacks on real-world websites. Our code and interactive demo system are publicly released.

Technology Category

Application Category

📝 Abstract

This work demonstrates that LLM-based web navigation agents offer powerful automation capabilities but are vulnerable to Indirect Prompt Injection (IPI) attacks. We show that adversaries can embed universal adversarial triggers in webpage HTML to hijack agent behavior that utilizes the accessibility tree to parse HTML, causing unintended or malicious actions. Using the Greedy Coordinate Gradient (GCG) algorithm and a Browser Gym agent powered by Llama-3.1, our system demonstrates high success rates across real websites in both targeted and general attacks, including login credential exfiltration and forced ad clicks. Our empirical results highlight critical security risks and the need for stronger defenses as LLM-driven autonomous web agents become more widely adopted. The system software (https://github.com/sej2020/manipulating-web-agents) is released under the MIT License, with an accompanying publicly available demo website (http://lethaiq.github.io/attack-web-llm-agent).

Problem

Research questions and friction points this paper is trying to address.

LLM web agents vulnerable to HTML-based indirect prompt injection attacks

Adversaries hijack agents via HTML accessibility tree manipulation

Demonstrates security risks in autonomous web agents needing defenses

Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes HTML accessibility tree for indirect prompt injection

Employs Greedy Coordinate Gradient algorithm for attacks

Demonstrates attacks with Browser Gym and Llama-3.1

🔎 Similar Papers

Efficient Universal Goal Hijacking with Semantics-guided Prompt Organization