Instruction Following by Boosting Attention of Large Language Models

📅 2025-06-16
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) follow instructions unreliably, and existing latent steering techniques offer limited efficacy, often underperforming simple instruction prompting. Method: We propose InstABoost (Instruction Attention Boosting), a lightweight latent steering method that amplifies the attention weights assigned to instruction tokens during generation, strengthening instruction prompting without fine-tuning or additional parameters. The approach is theoretically grounded in prior work suggesting that in-context rule following in Transformer-based models can be controlled by manipulating attention on instructions, and it comprises instruction-token localization, gain injection, and weight recalibration. Contribution/Results: We construct a unified benchmark across diverse behaviors for standardized evaluation of steering techniques. Experiments demonstrate that InstABoost improves instruction-following success rates by 23.6% on average across diverse tasks, significantly outperforming both prompt engineering and state-of-the-art latent steering methods.

📝 Abstract
Controlling the generation of large language models (LLMs) remains a central challenge to ensure their safe and reliable deployment. While prompt engineering and fine-tuning are common approaches, recent work has explored latent steering, a lightweight technique that alters LLM internal activations to guide generation. However, subsequent studies revealed latent steering's effectiveness to be limited, often underperforming simple instruction prompting. To address this limitation, we first establish a benchmark across diverse behaviors for standardized evaluation of steering techniques. Building on insights from this benchmark, we introduce Instruction Attention Boosting (InstABoost), a latent steering method that boosts the strength of instruction prompting by altering the model's attention during generation. InstABoost combines the strengths of existing approaches and is theoretically supported by prior work suggesting that in-context rule following in transformer-based models can be controlled by manipulating attention on instructions. Empirically, InstABoost demonstrates superior control success compared to both traditional prompting and latent steering.
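The core mechanism the abstract describes, re-weighting attention toward instruction tokens during generation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the toy uniform-attention input, and the gain value of 4.0 are all hypothetical choices for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def boost_instruction_attention(attn_logits, instruction_mask, gain=4.0):
    """Add log(gain) to the pre-softmax attention logits at instruction-token
    positions, then renormalize. This is equivalent to multiplying those
    tokens' post-softmax attention weights by `gain` and re-normalizing,
    so the output rows still sum to 1 (the "weight recalibration" step)."""
    boosted = attn_logits + np.where(instruction_mask, np.log(gain), 0.0)
    return softmax(boosted, axis=-1)

# Toy example: one query attending over 5 keys; the first two keys are
# instruction tokens. Uniform logits mean uniform attention before boosting.
logits = np.zeros((1, 5))
mask = np.array([True, True, False, False, False])
weights = boost_instruction_attention(logits, mask, gain=4.0)
# Instruction tokens' share of attention rises from 2/5 to 8/11 (~0.727),
# while the distribution remains a valid probability over all keys.
```

In this sketch the boost is applied uniformly wherever the instruction mask is set; in practice such a modification would be injected inside each attention layer at decoding time, which is what makes the approach fine-tuning-free.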
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM control via attention manipulation
Improving instruction following in language models
Benchmarking latent steering techniques for reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Boosts instruction attention in LLMs
Alters model attention during generation
Combines prompting and latent steering