🤖 AI Summary
KFAC (Kronecker-Factored Approximate Curvature) is hard to implement reliably: the math is intricate, the method comes in many variants, translating it to code is error-prone, and implementations are difficult to test systematically. This tutorial gives a ground-up introduction to KFAC for applications such as deep learning optimization, Bayesian inference, and model compression. It presents the mathematical derivations side-by-side with PyTorch code, drawing on automatic differentiation, matrix decompositions, and Kronecker-product approximations. It also provides test cases, based on recent insights into KFAC and covering diverse neural architectures, that serve as a recipe for verifying an implementation. The aim is to lower the barrier to implementing, extending, and using KFAC while making implementations more reliable and reproducible.
📝 Abstract
Kronecker-factored approximate curvature (KFAC) is arguably one of the most prominent curvature approximations in deep learning. Its applications range from optimization to Bayesian deep learning, training data attribution with influence functions, and model compression or merging. While the intuition behind KFAC is easy to understand, its implementation is tedious: it comes in many flavours, has common pitfalls when translating the math to code, and is challenging to test, which makes it hard to ensure that an implementation works properly. Some of the authors themselves have dealt with these challenges and experienced the discomfort of not being able to fully test their code. Thanks to recent advances in understanding KFAC, we are now able to provide test cases and a recipe for a reliable KFAC implementation. This tutorial is meant as a ground-up introduction to KFAC. In contrast to existing work, our focus lies on presenting math and code side-by-side and on providing test cases based on the latest insights into KFAC, which are scattered throughout the literature. We hope this tutorial provides a contemporary view of KFAC that allows beginners to gain a deeper understanding of this curvature approximation while lowering the barrier to its implementation, extension, and usage in practice.
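To make the idea of a testable Kronecker factorization concrete, the following is a minimal sketch and is not taken from the tutorial. It assumes a single bias-free linear layer with a squared-error loss, uses the averaged outer product of per-example gradients as the curvature target (the tutorial may target a different matrix, such as the GGN or Fisher), and uses NumPy rather than PyTorch to stay dependency-light. All function names (`kfac_factors`, `exact_curvature`, `layer_quantities`) are hypothetical. The check it illustrates is that the Kronecker product of the two factors matches the target exactly for a single datum, while for larger batches it is only an approximation.

```python
import numpy as np


def kfac_factors(X, G):
    """Kronecker factors from layer inputs X (N, d_in) and output gradients G (N, d_out)."""
    N = X.shape[0]
    A = X.T @ X / N  # input-based factor, shape (d_in, d_in)
    B = G.T @ G / N  # gradient-based factor, shape (d_out, d_out)
    return A, B


def exact_curvature(X, G):
    """Averaged outer product of the flattened per-example weight gradients."""
    N = X.shape[0]
    # For W of shape (d_out, d_in), the per-example gradient is g x^T;
    # its row-major flattening equals kron(g, x).
    grads = np.stack([np.kron(g, x) for x, g in zip(X, G)])
    return grads.T @ grads / N


rng = np.random.default_rng(0)
d_in, d_out = 3, 2
W = rng.standard_normal((d_out, d_in))


def layer_quantities(N):
    """Random inputs and the gradient of 0.5 * ||W x - y||^2 w.r.t. the layer output."""
    X = rng.standard_normal((N, d_in))
    Y = rng.standard_normal((N, d_out))
    G = X @ W.T - Y
    return X, G


# Single datum: kron(B, A) reproduces the target up to floating-point error.
X1, G1 = layer_quantities(1)
A1, B1 = kfac_factors(X1, G1)
err_single = np.abs(np.kron(B1, A1) - exact_curvature(X1, G1)).max()

# Batch of 8: the factorization is only approximate.
X8, G8 = layer_quantities(8)
A8, B8 = kfac_factors(X8, G8)
err_batch = np.abs(np.kron(B8, A8) - exact_curvature(X8, G8)).max()

print(f"single-datum error: {err_single:.2e}, batch error: {err_batch:.2e}")
```

The order `np.kron(B, A)` follows from the row-major flattening of the weight matrix; with a column-major convention the factors swap, which is one of the easy places to make a mistake when translating the math to code.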