🤖 AI Summary
This work addresses the challenge of efficiently injecting instruction knowledge into Transformer models without gradient backpropagation, while preserving pre-trained capabilities and minimizing parameter updates. To this end, we propose Quick Feedforward (QF) Learning, a rapid feedforward adaptation framework. QF computes weight updates via closed-form analytical solutions, enabling single-step, backpropagation-free knowledge consolidation and supporting unified training-inference deployment. Its core contribution is the first demonstration of fully backpropagation-free parameter adaptation in Transformers, achieving performance parity with conventional fine-tuning while reducing GPU memory consumption and training time by over 50%. The approach is both biologically inspired and highly resource-efficient. All code and models are publicly released.
📝 Abstract
We propose Quick Feedforward (QF) Learning, a novel knowledge consolidation framework for transformer-based models that enables efficient transfer of instruction-derived knowledge into model weights through feedforward activations, without any gradient backpropagation. Unlike traditional fine-tuning, QF updates are computed in closed form, require minimal parameter modification, and preserve prior knowledge. Importantly, QF allows models to train and infer within the same runtime environment, making the process more resource-efficient and more closely aligned with how the human brain operates. Code and models are open-sourced on GitHub. I hope QF Learning inspires a more efficient and brain-like paradigm for AI systems.
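To make the idea of a closed-form, backpropagation-free weight update concrete, here is a minimal numpy sketch. The exact QF update rule is defined in the paper; this example only illustrates the general pattern with a hypothetical ridge-regression solve: given feedforward activations `H` and target outputs `Y`, an additive update `dW` is computed analytically in a single step, with no gradients or optimizer loop. All variable names and the ridge formulation are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

# Hypothetical closed-form weight update (NOT the paper's exact rule).
# Idea: choose dW so that (W + dW) @ H approximates Y, solved analytically.
rng = np.random.default_rng(0)
d_in, d_out, n = 8, 4, 32

W = rng.normal(size=(d_out, d_in))   # pre-trained weight matrix
H = rng.normal(size=(d_in, n))       # feedforward activations for n examples
Y = rng.normal(size=(d_out, n))      # desired outputs for those examples

lam = 1e-2                           # ridge term keeps the update small,
                                     # helping preserve prior knowledge
residual = Y - W @ H                 # what the current weights get wrong
# Solve dW (H H^T + lam I) = residual H^T in one analytical step:
dW = np.linalg.solve(H @ H.T + lam * np.eye(d_in), H @ residual.T).T

W_new = W + dW                       # single-step consolidation, no backprop
# The fit error can only decrease (dW = 0 is always a feasible solution):
print(np.linalg.norm(Y - W_new @ H) < np.linalg.norm(Y - W @ H))
```

Because the solve runs in the same forward-pass runtime as inference, no separate training graph or gradient buffers are needed, which is the resource saving the summary refers to.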