🤖 AI Summary
This work addresses the challenge of flexibly adapting pretrained multi-robot policies to new tasks without fine-tuning or retraining, while avoiding catastrophic forgetting. The authors propose CLAE (Closed-Loop Affine Activation Editing), a framework that formulates behavior steering as a closed-loop activation editing problem during inference. CLAE employs a sparse autoencoder to identify behavior-relevant latent variables and uses a lightweight reinforcement learning policy to generate state-dependent affine transformations that dynamically modulate intermediate activations of the frozen policy. Experiments demonstrate that CLAE enables online adaptation on both simulated and physical quadrotor platforms. It achieves new capabilities such as individual speed adjustment, multi-agent formation maintenance, and surveillance avoidance without altering the original policy weights, while preserving baseline navigation performance.
📝 Abstract
Real-world robots need to adapt their behavior beyond the envelope of their pretrained policy. Fine-tuning or retraining the policy is an option, but both risk catastrophic forgetting, which degrades the pretrained policy's base performance. To combat this, we introduce CLAE (Closed-Loop Affine Activation Editing), an inference-time framework for steering the behavior of a frozen policy by editing intermediate activations while keeping the base policy weights and downstream action head untouched. CLAE treats behavior steering as a closed-loop problem in which the activation edits adapt online to the robot state, environment, target behavior, and multi-robot context. It trains a sparse autoencoder over frozen-policy activations, selects behavior-relevant latent features via post-hoc probing, and learns a lightweight RL-based steering policy that applies state-dependent affine edits to the selected latents during inference. We validate CLAE on a frozen multi-quadrotor navigation policy trained to perform a single task: navigating robots to a set of goal locations while avoiding obstacles. Through extensive simulations and physical tests, we show that, while the robots navigate to their goal positions, CLAE can (1) steer individual robot behavior by controlling each robot's velocity profile; (2) coordinate multi-robot behavior by preserving a desired formation; and (3) produce an entirely new behavior in which robots reduce their exposure to surveillance cameras in the environment.
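The affine editing step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The layer sizes, the random stand-in SAE weights, the choice to add only the decoded latent difference back onto the activation, the selected latent indices, and the `steering_policy` placeholder are all assumptions made for illustration. In the paper's setting, the SAE would be trained on frozen-policy activations and the scale and shift would come from the learned RL steering policy.

```python
import numpy as np

rng = np.random.default_rng(0)
D_HIDDEN, D_LATENT = 8, 32  # hypothetical sizes, not from the paper

# Stand-in SAE parameters. A real SAE would be trained on frozen-policy
# activations; random weights are used here only so the sketch runs.
W_enc = 0.1 * rng.normal(size=(D_LATENT, D_HIDDEN))
b_enc = np.zeros(D_LATENT)
W_dec = 0.1 * rng.normal(size=(D_HIDDEN, D_LATENT))


def sae_encode(h):
    """ReLU encoder mapping a policy activation to sparse latents."""
    return np.maximum(W_enc @ h + b_enc, 0.0)


def affine_edit(h, idx, scale, shift):
    """Apply z[idx] <- scale * z[idx] + shift and decode only the change."""
    z = sae_encode(h)
    z_edit = z.copy()
    z_edit[idx] = scale * z[idx] + shift
    # Assumption: add the decoded latent difference back onto h, so the
    # identity edit (scale=1, shift=0) leaves the activation untouched.
    return h + W_dec @ (z_edit - z)


def steering_policy(state, n_selected):
    """Hypothetical placeholder for the learned RL steering policy."""
    gain = np.tanh(state.mean())
    scale = np.full(n_selected, 1.0 + gain)
    shift = np.full(n_selected, 0.5 * gain)
    return scale, shift


h = rng.normal(size=D_HIDDEN)      # activation from the frozen policy
selected = np.array([3, 7, 11])    # hypothetical probe-selected latents
state = rng.normal(size=6)         # hypothetical robot/environment state
scale, shift = steering_policy(state, len(selected))
h_steered = affine_edit(h, selected, scale, shift)  # passed on to the frozen action head
```

Because the edit is recomputed from the current state at every step, the loop is closed around the frozen policy. Under the residual-style decoding assumed here, an identity edit reproduces the original activation exactly, which is one simple way to leave the base behavior intact whenever no steering is requested.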