EmoRAG: Evaluating RAG Robustness to Symbolic Perturbations

📅 2025-12-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies a vulnerability in retrieval-augmented generation (RAG) systems, termed EmoRAG: extreme sensitivity to minimal symbolic perturbations. Injecting a single emoticon such as “(@_@)” into a query makes the retriever nearly 100% likely to return semantically unrelated texts that contain a matching emoticon, and those texts then dominate the RAG output. Experiments across general question answering and code domains, using a range of state-of-the-art retrievers and generators, show that the effect is position-sensitive (an emoticon placed at the beginning of a query yields F1 scores above 0.92 on all datasets) and that models with more parameters are more vulnerable. Standard defenses prove insufficient against EmoRAG, so the work proposes targeted defenses and analyzes their strengths and limitations in mitigating emoticon-based perturbations.

📝 Abstract
Retrieval-Augmented Generation (RAG) systems are increasingly central to robust AI, enhancing large language model (LLM) faithfulness by incorporating external knowledge. However, our study unveils a critical, overlooked vulnerability: their profound susceptibility to subtle symbolic perturbations, particularly through near-imperceptible emoticon tokens such as "(@_@)" that can catastrophically mislead retrieval, termed EmoRAG. We demonstrate that injecting a single emoticon into a query makes it nearly 100% likely to retrieve semantically unrelated texts that contain a matching emoticon. Our extensive experiments across general question-answering and code domains, using a range of state-of-the-art retrievers and generators, reveal three key findings: (I) Single-Emoticon Disaster: Minimal emoticon injections cause maximal disruptions, with a single emoticon almost 100% dominating RAG output. (II) Positional Sensitivity: Placing an emoticon at the beginning of a query can cause severe perturbation, with F1 scores exceeding 0.92 across all datasets. (III) Parameter-Scale Vulnerability: Counterintuitively, models with more parameters exhibit greater vulnerability to the interference. We provide an in-depth analysis to uncover the underlying mechanisms of these phenomena. Furthermore, we raise a critical concern regarding the robustness assumption of current RAG systems, envisioning a threat scenario where an adversary exploits this vulnerability to manipulate the RAG system. We evaluate standard defenses and find them insufficient against EmoRAG. To address this, we propose targeted defenses, analyzing their strengths and limitations in mitigating emoticon-based perturbations. Finally, we outline future directions for building robust RAG systems.
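
The perturbation described in the abstract can be sketched as a small robustness probe. This is a minimal illustration, not the authors' evaluation code: the function names `inject_emoticon` and `emoticon_hit_rate`, the three position labels, and the "fraction of top-k results containing the emoticon" metric are assumptions introduced here, and `retrieve` stands for whatever retriever is under test.

```python
from typing import Callable, List

EMOTICON = "(@_@)"  # example emoticon quoted in the abstract


def inject_emoticon(query: str, emoticon: str = EMOTICON,
                    position: str = "start") -> str:
    """Return a copy of `query` with one emoticon inserted at a word boundary.

    The position labels are hypothetical; the abstract only singles out
    the beginning of the query.
    """
    words = query.split()
    if position == "start":
        index = 0
    elif position == "middle":
        index = len(words) // 2
    elif position == "end":
        index = len(words)
    else:
        raise ValueError(f"unknown position: {position!r}")
    return " ".join(words[:index] + [emoticon] + words[index:])


def emoticon_hit_rate(retrieve: Callable[[str, int], List[str]],
                      queries: List[str],
                      emoticon: str = EMOTICON,
                      position: str = "start",
                      k: int = 5) -> float:
    """Fraction of retrieved texts that contain the emoticon.

    `retrieve(query, k)` is an assumed interface returning up to k texts.
    This metric is an illustrative proxy, not the paper's F1 score.
    """
    hits = 0
    total = 0
    for query in queries:
        docs = retrieve(inject_emoticon(query, emoticon, position), k)
        hits += sum(emoticon in doc for doc in docs)
        total += len(docs)
    return hits / total if total else 0.0
```

Running `emoticon_hit_rate` once per position against the same retriever and query set gives a rough view of positional sensitivity; comparing it with the hit rate for unperturbed queries separates the effect of the injection from emoticon-bearing texts that would have been retrieved anyway.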
Problem

Research questions and friction points this paper is trying to address.

Evaluates RAG vulnerability to emoticon perturbations
Analyzes how single emoticons disrupt retrieval and generation
Proposes defenses against symbolic attacks on RAG systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Injecting emoticons to expose RAG vulnerability
Analyzing positional sensitivity and parameter-scale effects
Proposing targeted defenses against emoticon-based perturbations
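
One simple way to picture a defense against emoticon-based perturbations is a query sanitizer that drops symbol-only tokens before retrieval. The sketch below is an assumption made for illustration and is not claimed to be one of the targeted defenses the paper proposes; `is_symbol_only`, `sanitize_query`, and the three-character threshold are hypothetical.

```python
from typing import List, Tuple


def is_symbol_only(token: str, min_len: int = 3) -> bool:
    """True for tokens of at least `min_len` characters with no letters or digits.

    This matches "(@_@)" but, by design, not emoticons that contain letters
    such as "(o_o)"; it is a deliberately crude heuristic.
    """
    return len(token) >= min_len and not any(ch.isalnum() for ch in token)


def sanitize_query(query: str) -> Tuple[str, List[str]]:
    """Drop symbol-only tokens; return the cleaned query and the removed tokens."""
    kept: List[str] = []
    removed: List[str] = []
    for token in query.split():
        if is_symbol_only(token):
            removed.append(token)
        else:
            kept.append(token)
    return " ".join(kept), removed


cleaned, dropped = sanitize_query("(@_@) who wrote Hamlet?")
# cleaned == "who wrote Hamlet?", dropped == ["(@_@)"]
```

A filter like this only covers the query side and also removes legitimate symbol runs such as "...", so it should be read as a starting point for experimentation rather than a complete mitigation.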
Authors

Xinyun Zhou, ZJU, Hangzhou, China
Xinfeng Li, NTU, Singapore
Yinan Peng, Hengxin Tech., Singapore
Ming Xu, NUS, Singapore
Xuanwang Zhang, NJU, Nanjing, China
Miao Yu, NTU, Singapore
Yidong Wang, PKU, Beijing, China
Xiaojun Jia, Nanyang Technological University (Explainable AI, Robust AI, Efficient AI)
Kun Wang, NTU, Singapore
Qingsong Wen, Squirrel Ai Learning, Seattle, WA, USA
XiaoFeng Wang, Chair, ACM SIGSAC (AI-Centered Security, Systems Security and Privacy, Healthcare Privacy, Incentive Engineering)
Wei Dong, NTU, Singapore