The Echo Chamber Multi-Turn LLM Jailbreak

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0 (influential: 0)
🤖 AI Summary
This work proposes Echo Chamber, a novel jailbreaking attack that targets a vulnerability of large language models (LLMs) in multi-turn dialogues. Echo Chamber introduces a progressive escalation mechanism that combines contextual manipulation across multiple turns, adversarial prompt engineering, and systematic interaction strategies to gradually steer the model into bypassing its safety constraints. Experimental results show that Echo Chamber significantly increases jailbreaking success rates across several mainstream LLMs while remaining highly stealthy, exposing critical weaknesses in current safety mechanisms when models are deployed in extended interactive scenarios.
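The summary describes the attack only at a structural level: a sequence of turns whose prompts escalate gradually, with each model reply folded back into the conversation context. A minimal sketch of that loop structure is below; it is not the paper's actual method, and `send_message`, the placeholder prompt sequence, and the `refused` predicate are all hypothetical stand-ins used purely to illustrate the multi-turn evaluation shape.

```python
# Hypothetical sketch of a multi-turn probing loop of the kind the
# summary describes (escalation across turns, context carried forward).
# None of the names below come from the paper; all are illustrative.

def run_multi_turn_probe(send_message, turn_prompts, refused):
    """Feed a fixed sequence of prompts into one chat session and
    record, per turn, whether the model refused to respond."""
    history = []       # accumulated conversation context
    outcomes = []      # True where the model refused
    for prompt in turn_prompts:
        history.append({"role": "user", "content": prompt})
        reply = send_message(history)  # model call (stubbed here)
        history.append({"role": "assistant", "content": reply})
        outcomes.append(refused(reply))
    return outcomes

# Toy stand-in so the sketch runs without any model: "refuses" only
# when a turn is overtly flagged, mimicking a surface-level filter.
def fake_model(history):
    return "REFUSE" if "[flagged]" in history[-1]["content"] else "ok"

outcomes = run_multi_turn_probe(
    fake_model,
    ["benign setup", "context building", "[flagged] direct request"],
    refused=lambda reply: reply == "REFUSE",
)
# outcomes → [False, False, True]
```

The toy filter only inspects the latest turn, which illustrates the weakness the paper exploits: safety checks that judge each message in isolation can miss intent that is distributed across an escalating conversation.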

📝 Abstract
The availability of Large Language Models (LLMs) has led to a new generation of powerful chatbots that can be developed at relatively low cost. As companies deploy these tools, security challenges need to be addressed to prevent financial loss and reputational damage. A key security challenge is jailbreaking, the malicious manipulation of prompts and inputs to bypass a chatbot's safety guardrails. Multi-turn attacks are a relatively new form of jailbreaking involving a carefully crafted chain of interactions with a chatbot. We introduce Echo Chamber, a new multi-turn attack using a gradual escalation method. We describe this attack in detail, compare it to other multi-turn attacks, and demonstrate its performance against multiple state-of-the-art models through extensive evaluation.
Problem

Research questions and friction points this paper is trying to address.

jailbreaking · multi-turn attacks · Large Language Models · security · chatbots
Innovation

Methods, ideas, or system contributions that make the work stand out.

jailbreaking · multi-turn attack · large language models · prompt manipulation · safety guardrails
Ahmad Alobaid
NeuralTrust
Martí Jorda Roca
NeuralTrust
Carlos Castillo
ICREA Research Professor at Universitat Pompeu Fabra
Responsible Computing · Algorithmic Fairness · Crisis Informatics
Joan Vendrell
NeuralTrust