Capturing the Effects of Quantization on Trojans in Code LLMs

📅 2025-05-20

📈 Citations: 0

✨ Influential: 0

career value

204K/year

🤖 AI Summary

This study investigates the impact of model quantization on the robustness of large language models (LLMs) for code generation against data poisoning—specifically, backdoor attacks. We systematically evaluate 2-, 4-, and 8-bit quantization on Llama-2-7b and CodeLlama-7b in SQL generation tasks, measuring both backdoor efficacy (trigger success rate) and functional performance (task accuracy). We propose a novel metric to quantify backdoor signal strength. Our key finding is that quantization significantly suppresses backdoor behavior in CodeLlama-7b: 4-bit quantization drastically reduces attack success while simultaneously improving task accuracy; in contrast, Llama-2-7b exhibits negligible change in backdoor susceptibility across all bit-widths. These results demonstrate that quantization—not only enables model compression but also confers an intrinsic security benefit, particularly for specialized architectures like CodeLlama, supporting a “compression-as-defense” paradigm. This work provides empirical evidence and new insights for secure, efficient deployment of code-generation LLMs.

Technology Category

Application Category

📝 Abstract

Large language models of code exhibit high capability in performing diverse software engineering tasks, such as code translation, defect detection, text-to-code generation, and code summarization. While their ability to enhance developer productivity has spurred widespread use, these models have also seen substantial growth in size, often reaching billions of parameters. This scale demands efficient memory resource usage, prompting practitioners to use optimization techniques such as model quantization. Quantization uses smaller bit representations for the model parameters, reducing the precision of the weights. In this work, we investigate the impact of quantization on the risk of data poisoning attacks on these models, specifically examining whether it mitigates or exacerbates such vulnerabilities. We focus on two large language models, Meta's Llama-2-7b and CodeLlama-7b, applied to an SQL code generation task. Additionally, we introduce a new metric for measuring trojan signals in compromised models. We find that quantization has differing effects on code-generating LLMs: while reducing precision does not significantly alter Llama-2's behavior, it boosts performance and reduces attack success rates in CodeLlama, particularly at 4-bit precision.

Problem

Research questions and friction points this paper is trying to address.

Investigates impact of quantization on data poisoning risks in code LLMs

Examines if quantization mitigates or exacerbates trojan vulnerabilities

Introduces new metric for measuring trojan signals in compromised models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Investigates quantization impact on trojan vulnerabilities

Introduces new metric for measuring trojan signals

Tests 4-bit precision effects on CodeLlama performance

🔎 Similar Papers

No similar papers found.