🤖 AI Summary
This study investigates the impact of model quantization on the robustness of large language models (LLMs) for code generation against data poisoning—specifically, backdoor attacks. We systematically evaluate 2-, 4-, and 8-bit quantization on Llama-2-7b and CodeLlama-7b in SQL generation tasks, measuring both backdoor efficacy (trigger success rate) and functional performance (task accuracy). We propose a novel metric to quantify backdoor signal strength. Our key finding is that quantization significantly suppresses backdoor behavior in CodeLlama-7b: 4-bit quantization drastically reduces attack success while simultaneously improving task accuracy; in contrast, Llama-2-7b exhibits negligible change in backdoor susceptibility across all bit-widths. These results demonstrate that quantization—not only enables model compression but also confers an intrinsic security benefit, particularly for specialized architectures like CodeLlama, supporting a “compression-as-defense” paradigm. This work provides empirical evidence and new insights for secure, efficient deployment of code-generation LLMs.
📝 Abstract
Large language models of code exhibit high capability in performing diverse software engineering tasks, such as code translation, defect detection, text-to-code generation, and code summarization. While their ability to enhance developer productivity has spurred widespread use, these models have also seen substantial growth in size, often reaching billions of parameters. This scale demands efficient memory resource usage, prompting practitioners to use optimization techniques such as model quantization. Quantization uses smaller bit representations for the model parameters, reducing the precision of the weights. In this work, we investigate the impact of quantization on the risk of data poisoning attacks on these models, specifically examining whether it mitigates or exacerbates such vulnerabilities. We focus on two large language models, Meta's Llama-2-7b and CodeLlama-7b, applied to an SQL code generation task. Additionally, we introduce a new metric for measuring trojan signals in compromised models. We find that quantization has differing effects on code-generating LLMs: while reducing precision does not significantly alter Llama-2's behavior, it boosts performance and reduces attack success rates in CodeLlama, particularly at 4-bit precision.