HiLDe: Intentional Code Generation via Human-in-the-Loop Decoding

📅 2025-05-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
AI programming tools risk fostering overreliance on model outputs, undermining developers’ judgment—particularly in security-critical tasks—and increasing vulnerability exposure. To address this, we propose “human-in-the-loop decoding”: a mechanism enabling real-time, token-level decision highlighting, generation of editable local candidate sets, and a collaborative decoding interface. This allows developers to dynamically observe, comprehend, and intervene in the model’s critical generation decisions. Our approach innovatively integrates interactive decision visualization with intent-driven local substitution, deeply embedding human intent into the code generation process. Empirical evaluation on security-sensitive programming tasks demonstrates that our method significantly reduces vulnerability incidence (−42.3%) compared to state-of-the-art code completion tools, while improving task success rate (+38.7%) and perceived code controllability (+51.2%).

Technology Category

Application Category

📝 Abstract
While AI programming tools hold the promise of increasing programmers' capabilities and productivity to a remarkable degree, they often exclude users from essential decision-making processes, causing many to effectively"turn off their brains"and over-rely on solutions provided by these systems. These behaviors can have severe consequences in critical domains, like software security. We propose Human-in-the-loop Decoding, a novel interaction technique that allows users to observe and directly influence LLM decisions during code generation, in order to align the model's output with their personal requirements. We implement this technique in HiLDe, a code completion assistant that highlights critical decisions made by the LLM and provides local alternatives for the user to explore. In a within-subjects study (N=18) on security-related tasks, we found that HiLDe led participants to generate significantly fewer vulnerabilities and better align code generation with their goals compared to a traditional code completion assistant.
Problem

Research questions and friction points this paper is trying to address.

AI tools exclude users from decision-making in coding
Over-reliance on AI causes security vulnerabilities in software
Users need control to align AI-generated code with intentions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-in-the-loop Decoding technique
LLM decision observation and influence
Local alternatives for code generation
🔎 Similar Papers
No similar papers found.
E
Emmanuel Anaya González
UC San Diego
R
Raven Rothkopf
UC San Diego
Sorin Lerner
Sorin Lerner
UC San Diego
Nadia Polikarpova
Nadia Polikarpova
UCSD
Programming LanguagesFormal Methods