SynGhost: Invisible and Universal Task-agnostic Backdoor Attack via Syntactic Transfer

📅 2024-02-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Pre-trained language models (PLMs) are vulnerable to task-agnostic backdoor attacks because of weaknesses in their data and training mechanisms, and such backdoors carry over to downstream tasks. The paper introduces maxEntropy, an entropy-based poisoning filter that mitigates existing attacks of this kind, and then proposes SynGhost, an invisible and universal task-agnostic backdoor attack based on syntactic transfer. Its core innovation is an implicit trigger mechanism at the syntactic level that requires neither explicit triggers nor manually set targets. SynGhost injects multiple syntactic backdoors into the pre-training space via corpus poisoning while preserving the PLM's pre-training capabilities, adaptively selects targets with contrastive learning so that they are uniformly distributed in the pre-training space, and uses a syntax-aware module to minimize interference among concurrent backdoors. Experiments show that SynGhost achieves high attack success rates across diverse PLMs and downstream tasks and resists defenses based on perplexity, fine-pruning, and maxEntropy. The code is publicly available.

📝 Abstract

Although pre-training achieves remarkable performance, it suffers from task-agnostic backdoor attacks due to vulnerabilities in data and training mechanisms. These attacks can transfer backdoors to various downstream tasks. In this paper, we introduce $\mathtt{maxEntropy}$, an entropy-based poisoning filter that mitigates such risks. To overcome the limitations of manual target setting and explicit triggers, we propose $\mathtt{SynGhost}$, an invisible and universal task-agnostic backdoor attack via syntactic transfer, further exposing vulnerabilities in pre-trained language models (PLMs). Specifically, $\mathtt{SynGhost}$ injects multiple syntactic backdoors into the pre-training space through corpus poisoning, while preserving the PLM's pre-training capabilities. Second, $\mathtt{SynGhost}$ adaptively selects optimal targets based on contrastive learning, creating a uniform distribution in the pre-training space. To identify syntactic differences, we also introduce an awareness module to minimize interference between backdoors. Experiments show that $\mathtt{SynGhost}$ poses significant threats and can transfer to various downstream tasks. Furthermore, $\mathtt{SynGhost}$ resists defenses based on perplexity, fine-pruning, and $\mathtt{maxEntropy}$. The code is available at https://github.com/Zhou-CyberSecurity-AI/SynGhost.
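
For context on the perplexity-based defenses mentioned in the abstract: perplexity is the exponential of the average negative log-likelihood that a language model assigns to a token sequence. The sketch below computes it from per-token log-probabilities. The function name and the example values are hypothetical, and this is not the paper's evaluation code.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity of a sequence, given the natural-log probability of each token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical values: a fluent sentence versus the same sentence
# with one very unlikely (rare) token appended.
fluent = [math.log(0.25)] * 4
with_rare_token = fluent + [math.log(0.001)]

print(perplexity(fluent))
print(perplexity(with_rare_token))
```

Explicit triggers such as inserted rare tokens tend to raise perplexity, which is what this family of defenses looks for. The abstract reports that SynGhost's syntactic triggers resist this kind of detection.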

Problem

Research questions and friction points this paper is trying to address.

Mitigates existing task-agnostic backdoor attacks on pre-training with an entropy-based poisoning filter (maxEntropy).

Proposes invisible syntactic transfer for backdoor injection.

Exposes vulnerabilities in pre-trained language models.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Syntactic transfer for backdoor attacks

Entropy-based poisoning filter

Contrastive learning for target selection
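
The entropy-based filter listed above rests on Shannon entropy, which the sketch below computes for a predicted class distribution. It illustrates the underlying quantity only: this summary does not specify how maxEntropy scores or thresholds samples, so the function names and example distributions here are assumptions, not the paper's implementation.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete probability distribution."""
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0, abs_tol=1e-6):
        raise ValueError("probs must be non-negative and sum to 1")
    # Terms with p == 0 contribute nothing (limit of p * log p as p -> 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_upper_bound(num_classes: int) -> float:
    """Largest possible entropy, reached by the uniform distribution."""
    return math.log(num_classes)


# Hypothetical predictions over four classes.
confident = [0.97, 0.01, 0.01, 0.01]
uniform = [0.25, 0.25, 0.25, 0.25]

print(shannon_entropy(confident))
print(shannon_entropy(uniform))
print(entropy_upper_bound(4))
```

A filter built on this quantity would compare per-sample entropy against some reference value; which direction counts as suspicious and where the cutoff sits are details this page does not give.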
