MidSteer: Optimal Affine Framework for Steering Generative Models

📅 2026-04-17
📈 Citations: 0
Influential: 0

🤖 AI Summary
Steering the intermediate representations of generative models works well empirically but lacks a unified theoretical framework for concept manipulation and behavioral alignment. This work formalizes concept steering with affine transformations. It first shows that the standard approach to removing unwanted behaviors is a special case of LEACE, a closed-form method for affine concept erasure. It then formulates LEACE-Switch for concept switching and characterizes the assumptions under which it is the optimal affine solution. Finally, it introduces MidSteer (Minimal Disturbance concept Steering), a more general affine framework that relaxes those assumptions and enables directed, minimal-disturbance modifications of intermediate-layer activations. The authors report that MidSteer performs favorably across tasks, modalities, and architectures, including vision diffusion models and large language models.
📝 Abstract
Steering intermediate representations has emerged as a powerful strategy for controlling generative models, particularly in post-deployment alignment and safety settings. However, despite its empirical success, it currently lacks a comprehensive theoretical framework. In this paper, we bridge this gap by formalizing the theory of concept steering. First, we establish a link between steering and affine concept erasure, proving that the standard approach for removing unwanted behaviors is a special case of LEACE (a closed-form method for affine erasure). Next, we formulate a principled theoretical framework for concept switching, LEACE-Switch, and characterize the assumptions under which it provides an optimal affine solution. Building on this analysis, we then introduce MidSteer (Minimal Disturbance concept Steering), a more general affine framework for concept manipulation that relaxes these assumptions and enables directed, minimal-disturbance transformations. We demonstrate that MidSteer performs favorably across a range of tasks, modalities, and architectures, including vision diffusion models and large language models.
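To make the notion of affine erasure concrete, the sketch below fits a LEACE-style eraser on toy data with NumPy. It follows the closed form published for LEACE (whiten the activations, project out the directions correlated with the concept labels, then unwhiten); it is not the MidSteer or LEACE-Switch construction, whose formulas are not given on this page. The function names `fit_leace_eraser` and `apply_affine`, the tolerance `eps`, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def fit_leace_eraser(X, Z, eps=1e-10):
    """Fit an affine map x -> A (x - mu) + mu that removes linear
    information about Z from X (LEACE-style closed form, illustrative).

    X: (n, d) activations, Z: (n, k) concept labels.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    Zc = Z - Z.mean(axis=0)
    n, d = X.shape

    sigma_xx = Xc.T @ Xc / n  # (d, d) covariance of activations
    sigma_xz = Xc.T @ Zc / n  # (d, k) cross-covariance with the concept

    # Whitening matrix W = Sigma_xx^{-1/2} and its pseudo-inverse,
    # restricted to directions with non-negligible variance.
    evals, evecs = np.linalg.eigh(sigma_xx)
    keep = evals > eps
    evals, evecs = evals[keep], evecs[:, keep]
    W = (evecs * evals ** -0.5) @ evecs.T
    W_pinv = (evecs * evals ** 0.5) @ evecs.T

    # Orthogonal projector onto the column space of W @ Sigma_xz.
    U, s, _ = np.linalg.svd(W @ sigma_xz, full_matrices=False)
    U = U[:, s > eps]
    proj = U @ U.T

    # Erasure matrix: identity minus the concept subspace, expressed
    # in the original (unwhitened) coordinates.
    A = np.eye(d) - W_pinv @ proj @ W
    return A, mu


def apply_affine(X, A, mu):
    """Apply the affine intervention row-wise to a batch of activations."""
    return (X - mu) @ A.T + mu


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 8))  # toy "activations"
    Z = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)[:, None]  # toy concept

    A, mu = fit_leace_eraser(X, Z)
    X_erased = apply_affine(X, A, mu)

    # After erasure the empirical cross-covariance with Z should vanish.
    cross = (X_erased - X_erased.mean(axis=0)).T @ (Z - Z.mean(axis=0)) / len(X)
    print("max |cross-covariance| after erasure:", np.abs(cross).max())
```

In practice `X` would be activations collected at a chosen intermediate layer and the fitted map would be applied to that layer at inference time; how MidSteer selects or generalizes the map is described in the paper itself, not here.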
Problem

Research questions and friction points this paper is trying to address.

concept steering
affine erasure
generative models
theoretical framework
intermediate representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

concept steering
affine erasure
LEACE
MidSteer
generative models