When LLMs Invent Rust Crates: An Empirical Study of Hallucination Patterns and Mitigation

📅 2026-06-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses crate hallucination, where large language models (LLMs) generate fictitious dependencies in Rust code, a failure mode that poses significant software supply chain security risks. It presents the first large-scale empirical investigation of the problem, constructing a multi-source dataset of coding tasks drawn from Stack Overflow, GitHub, and LLM generation to systematically evaluate hallucination behavior across prominent open-source and commercial models under various decoding strategies. The findings reveal that hallucination rates in Rust are remarkably consistent across models and largely insensitive to parameter scale, markedly differing from patterns observed in Python and JavaScript. The work also investigates prompt engineering strategies for mitigating hallucinations without compromising code quality, offering empirical evidence and practical guidance for secure and reliable LLM-assisted Rust development.

📝 Abstract

Large Language Models (LLMs) have become powerful tools for code generation, yet they remain prone to hallucinations, producing plausible but incorrect or fabricated outputs. Among these, package hallucination, where an LLM suggests non-existent dependencies, poses an emerging security risk to the software supply chain. While previous studies focus on popular languages like Python or JavaScript, in this work we present the first large-scale empirical study on crate hallucination in LLM-generated Rust code. We construct a multi-source dataset combining coding tasks from Stack Overflow, GitHub, and LLM-generated tasks, and evaluate both commercial and open-source models under various decoding settings. Our analysis reveals that, unlike prior findings in Python and JavaScript, hallucination behavior in Rust follows a distinct pattern: different models exhibit surprisingly consistent hallucination rates, and these rates show minimal sensitivity to model parameters. Furthermore, we investigate prompt engineering strategies to mitigate hallucinations without sacrificing code quality. This study provides new insights into the reliability and security implications of LLM-assisted Rust development, offering guidance for future research and safer model deployment in software engineering workflows.
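To make the notion of a crate hallucination rate concrete, the following is a minimal sketch of how such a check might look. It is not the authors' pipeline: the function names (`declared_crates`, `hallucinated`, `hallucination_rate`), the line-based manifest parsing, and the idea of comparing against a local snapshot of known crate names are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Collect crate names declared under `[dependencies]` in a Cargo.toml string.
/// Assumption: simplified line-based parsing for illustration only; a real
/// tool would use a proper TOML parser and also inspect `use` statements.
fn declared_crates(manifest: &str) -> Vec<String> {
    let mut in_deps = false;
    let mut names = Vec::new();
    for line in manifest.lines() {
        let line = line.trim();
        if line.starts_with('[') {
            // Entering a new table; only `[dependencies]` is considered here.
            in_deps = line == "[dependencies]";
            continue;
        }
        if in_deps && !line.is_empty() && !line.starts_with('#') {
            if let Some((name, _)) = line.split_once('=') {
                names.push(name.trim().to_string());
            }
        }
    }
    names
}

/// Return the declared crates that are absent from the snapshot of known names.
/// Assumption: `known` stands in for a registry index; it is hypothetical here.
fn hallucinated(declared: &[String], known: &HashSet<&str>) -> Vec<String> {
    declared
        .iter()
        .filter(|name| !known.contains(name.as_str()))
        .cloned()
        .collect()
}

/// Fraction of declared crates missing from the snapshot (0.0 if none declared).
fn hallucination_rate(declared: &[String], known: &HashSet<&str>) -> f64 {
    if declared.is_empty() {
        return 0.0;
    }
    hallucinated(declared, known).len() as f64 / declared.len() as f64
}
```

Under these assumptions, a generated manifest that declares `serde`, `tokio`, and one made-up name, checked against a snapshot containing only the first two, would be flagged for the made-up crate and scored at a rate of one third. A snapshot check of this kind only tells whether a name exists; it says nothing about whether an existing crate is the right or a safe one.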

Problem

Research questions and friction points this paper is trying to address.

hallucination

Rust

software supply chain

code generation

dependency

Innovation

Methods, ideas, or system contributions that make the work stand out.

crate hallucination

Rust code generation

prompt engineering