Which Factors Make Code LLMs More Vulnerable to Backdoor Attacks? A Systematic Study

📅 2025-06-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
The vulnerability origins of backdoor attacks against code large language models (Code LLMs) remain poorly understood, particularly regarding underexplored factors in data (e.g., poisoning rate, trigger length and frequency), model training (e.g., batch size, epochs), and inference. Method: Using code summarization as a benchmark task, we conduct multi-factor controlled experiments and empirical evaluations to systematically analyze these factors. Contribution/Results: We demonstrate that extremely low poisoning rates (0.004%, i.e., only 20 out of 454K samples) suffice to implant highly effective backdoors—challenging the prevailing assumption that such rates are ineffective. Small batch sizes significantly exacerbate vulnerability, and mainstream defenses fail entirely under these conditions. Crucially, we identify trigger rarity, trigger length, and batch size as key vulnerability factors—the first such empirical characterization for Code LLMs. Our findings provide a reproducible, evidence-based foundation for designing robust defenses tailored to Code LLMs.

📝 Abstract
Code LLMs are increasingly employed in software development. However, studies have shown that they are vulnerable to backdoor attacks: when a trigger (a specific input pattern) appears in the input, the backdoor is activated and causes the model to generate malicious outputs. Researchers have designed various triggers and demonstrated the feasibility of implanting backdoors by poisoning a fraction of the training data. Some basic conclusions have been drawn, such as backdoors becoming easier to implant when more training data are modified. However, existing research has not explored other factors influencing backdoor attacks on Code LLMs, such as training batch size, epoch number, and the broader design space for triggers, e.g., trigger length. To bridge this gap, we use code summarization as an example and perform an empirical study that systematically investigates the factors affecting backdoor effectiveness and assesses the extent of the threat posed. Three categories of factors are considered -- data, model, and inference -- revealing previously overlooked findings. We find that the prevailing consensus -- that attacks are ineffective at extremely low poisoning rates -- is incorrect; the absolute number of poisoned samples matters as well. Specifically, poisoning just 20 out of 454K samples (a 0.004% poisoning rate, far below the minimum setting of 0.1% in prior studies) successfully implants backdoors. Moreover, the common defense is incapable of removing even a single poisoned sample from the training data. Additionally, small batch sizes increase the risk of backdoor attacks. We also uncover other critical factors, such as trigger type, trigger length, and the rarity of tokens in the trigger, leading to valuable insights for assessing Code LLMs' vulnerability to backdoor attacks. Our study highlights the urgent need for defense mechanisms that remain effective at extremely low poisoning rates.
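The headline poisoning-rate figure can be reproduced with simple arithmetic. A minimal sketch, assuming "454K" means roughly 454,000 training samples (the exact dataset size may differ slightly):

```python
# Poisoning rate for the paper's extreme low-rate setting:
# 20 poisoned samples out of ~454,000 total training samples.
poisoned = 20
total = 454_000  # assumed interpretation of "454K"
rate_percent = poisoned / total * 100
print(f"{rate_percent:.3f}%")  # about 0.004%, vs. the 0.1% minimum used in prior studies
```

Note that 0.004% is the rate rounded to one significant figure; the point of the paper is that even at this scale, far below prior experimental settings, backdoors are implanted successfully.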
Problem

Research questions and friction points this paper is trying to address.

Investigates factors affecting Code LLMs' vulnerability to backdoor attacks.
Examines impact of data, model, and inference parameters on backdoor success.
Challenges consensus on low poisoning rates and evaluates defense failures.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematically studies the factors behind backdoor attacks on Code LLMs.
Reveals attack effectiveness at extremely low poisoning rates.
Identifies critical factors such as trigger type and trigger length.
Chenyu Wang
School of Computing and Information Systems, Singapore Management University, Singapore
Zhou Yang
School of Computing and Information Systems, Singapore Management University, Singapore
Yaniv Harel
CSO Cyber Research Center, Tel Aviv University
David Lo
School of Computing and Information Systems, Singapore Management University, Singapore