🤖 AI Summary
Existing diffusion language models (DLMs) struggle to perform in-place semantic modifications of existing text without full sequence regeneration. This work proposes TimpaTeks, the first approach to integrate activation steering into DLMs, enabling direct control over the denoising process by manipulating internal activations. This technique facilitates structure-preserving, low-perplexity, and computationally efficient in-place editing without requiring instruction fine-tuning or complete re-generation. Experimental results demonstrate that TimpaTeks effectively reduces perplexity and preserves original syntactic structure on both the IMDB movie review dataset (for sentiment editing) and a synthetic cat-dog dataset (for unconventional concept injection), while incurring significantly lower computational overhead compared to prompt-based guidance methods.
📝 Abstract
We extend activation steering to diffusion language models (DLMs) and study a novel problem that arises from the inference mechanism of DLMs: modifying a text in-place to manifest a different concept. We propose TimpaTeks, an automatic in-place text modification mechanism using DLMs. Experiments on IMDB movie reviews (sentiment) and a synthetic Cats and Dogs Dataset (arbitrary, more unconventional concept steering) show that TimpaTeks provides a feasible novel mechanism to steer diffusion language model outputs in-place. TimpaTeks enables in-place modification while simultaneously lowering sentence perplexity and retaining the original sentence structure, without the need for instruction-tuned models. TimpaTeks is also computationally cheaper than prompt-based DLM steering, as it performs denoising in-place rather than constructing an additional prompt-conditioned output sequence.
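
The abstract does not say how the steering direction is obtained or where it is applied. As a rough illustration of activation steering in general, the sketch below uses the common difference-of-means recipe on toy vectors. It is an assumption made for illustration, not the TimpaTeks procedure. The names `steering_vector`, `steer`, and `alpha` are hypothetical, and the "activations" are hand-written lists rather than outputs of a real DLM.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean of equally sized activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(target_acts: List[Vector], source_acts: List[Vector]) -> Vector:
    """Difference of means: points from the source concept toward the target concept."""
    mu_target = mean_vector(target_acts)
    mu_source = mean_vector(source_acts)
    return [t - s for t, s in zip(mu_target, mu_source)]


def steer(hidden: List[Vector], direction: Vector, alpha: float) -> List[Vector]:
    """Add alpha * direction to the hidden state at every token position.

    The number of positions is unchanged, so the sequence is nudged in place
    rather than extended or regenerated.
    """
    return [[h + alpha * d for h, d in zip(pos, direction)] for pos in hidden]


# Toy stand-ins for activations collected on two concepts (hypothetical values).
positive_acts = [[1.0, 0.0], [3.0, 2.0]]
negative_acts = [[0.0, 1.0], [2.0, 3.0]]
direction = steering_vector(positive_acts, negative_acts)

# Toy hidden states for a two-token sequence at one layer.
hidden = [[0.0, 0.0], [1.0, 1.0]]
steered = steer(hidden, direction, alpha=0.5)
```

In an actual DLM, an update of this kind would presumably be applied inside the network at one or more denoising steps, so the existing token positions are edited rather than a new prompt-conditioned sequence being generated. Which layers, which steps, and what scale to use are not specified in the abstract.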