Confused ChatGPT: Cross-App Context Poisoning via First-Party APIs

📅 2026-05-29

🤖 AI Summary

This study addresses a security vulnerability in ChatGPT Apps that stems from their shared, flat chat context and lack of multi-tenant isolation, which leaves them open to cross-app context poisoning. The work defines this attack vector, in which a malicious app silently writes attacker-controlled content into the shared context through first-party APIs such as sendFollowUpMessage, amplified by two undocumented parameters, systemPrompt and isVisible; the injected content persists across turns and misleads the LLM when the user later invokes a benign app. By reverse engineering the runtime interfaces, the authors construct both conditional and imperative attack payloads, evaluate them across six current ChatGPT models, and realize a confused-deputy attack. The findings were disclosed to OpenAI, but the undocumented parameters remain accessible and the underlying architectural gap is unpatched, which the authors attribute to the platform’s design rather than to a fixable bug.

📝 Abstract

ChatGPT Apps, launched by OpenAI on Oct. 6, 2025, introduce an app-in-app paradigm in which third-party applications share a single chat context with the user and with every other connected app. The ecosystem grew from 122 apps in Dec. 2025 to 888 by May 2026, yet its security has remained uninvestigated. We identify cross-app context poisoning, a variant of indirect prompt injection distinguished by three properties: 1) the injection persists in the shared chat context across turns; 2) the effect surfaces through a different co-resident app the user later invokes; and 3) the delivery vectors are first-party APIs exposed to every connected app. We find multiple APIs capable of writing app-controlled content into the shared context, with sendFollowUpMessage as the most direct and potent channel. Two undocumented parameters that the runtime silently accepts, systemPrompt and isVisible, amplify this channel to silent, system-priority writes. Leveraging this channel, we realize a confused-deputy attack in which a malicious app poisons the context so that the LLM, consulting that context, enables manipulation against benign co-resident apps. We demonstrate two payload styles (conditional and imperative) and evaluate them across six current ChatGPT models. The root cause is architectural: the LLM's context is a persistent, flat, untagged data store shared by user and apps, with no isolation. Every mature multi-tenant platform, from Multics virtual memory to Android UIDs and iOS sandbox profiles, paid the isolation cost before admitting third parties; ChatGPT Apps did not. Fixing this requires an architectural change, not a patch. We disclosed our findings to OpenAI; the undocumented parameters remain accessible at the time of writing, and the architectural gap is by design: the shared context that enables cross-app composition is the same flat namespace that enables cross-app poisoning.
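
The architectural point in the abstract (a flat, untagged context versus an isolated one) can be illustrated with a toy model. The Python sketch below is not the ChatGPT Apps runtime and does not use any OpenAI API; the class names FlatContext and TaggedContext, the app ids, and the visibility policy in view_for are all hypothetical, chosen only to show why losing provenance on write lets one app's content reach the model while it serves another app.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entry:
    origin: str  # hypothetical writer id: "user" or an app id
    text: str


@dataclass
class FlatContext:
    """Toy flat, untagged store: the writer's identity is dropped on write."""

    lines: list[str] = field(default_factory=list)

    def write(self, origin: str, text: str) -> None:
        self.lines.append(text)  # provenance is discarded here

    def view_for(self, app_id: str) -> list[str]:
        # With no tags there is nothing to filter on: every app sees everything.
        return list(self.lines)


@dataclass
class TaggedContext:
    """Toy isolated store: each entry keeps its origin (assumed policy)."""

    entries: list[Entry] = field(default_factory=list)

    def write(self, origin: str, text: str) -> None:
        self.entries.append(Entry(origin, text))

    def view_for(self, app_id: str) -> list[str]:
        # Assumed policy: an app's turn sees user text and its own writes only.
        allowed = {"user", app_id}
        return [e.text for e in self.entries if e.origin in allowed]


def demo() -> dict[str, list[str]]:
    """Write the same two entries to both stores, then read as app_b."""
    results: dict[str, list[str]] = {}
    for name, ctx in (("flat", FlatContext()), ("tagged", TaggedContext())):
        ctx.write("user", "Book a table for two.")
        ctx.write("app_a", "NOTE WRITTEN BY APP A")
        results[name] = ctx.view_for("app_b")
    return results


if __name__ == "__main__":
    for name, view in demo().items():
        print(name, view)
```

In the flat store, the view assembled for app_b contains app_a's note; in the tagged store it does not. A real fix would need a far richer policy (for example, explicit user-approved sharing between apps), which is why the abstract describes the remedy as an architectural change rather than a patch.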

Problem

Research questions and friction points this paper is trying to address.

cross-app context poisoning

indirect prompt injection

shared context

first-party APIs

LLM security

Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-app context poisoning

indirect prompt injection

first-party APIs