🤖 AI Summary
Existing automatic differentiation (AD) frameworks suffer from limited language support, intrusive code modifications, poor scientific computing performance, and high memory overhead—leading domain scientists to rely on manual gradient derivation. Method: We propose the first general-purpose, source-to-source AD engine tailored for high-performance computing (HPC) scientific applications. It integrates dataflow analysis with reverse-mode AD and introduces a novel integer linear programming (ILP)-based memory–recomputation co-optimization algorithm that maximizes computational efficiency under memory constraints. Contribution/Results: The engine unifies support for both machine learning and HPC scientific workloads, enabling—for the first time—efficient AD on HPC benchmarks such as NPBench. Experiments show it achieves an average speedup of more than 92× over JAX on NPBench, significantly easing the applicability bottleneck of AD in large-scale scientific computing.
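To make the store-vs-recompute trade-off concrete, here is a toy planner that mirrors the 0/1 decision variables of an ILP formulation: choosing, for each forward-pass intermediate, whether to store it (consuming memory) or recompute it during the backward pass (costing time), subject to a memory budget. This is an illustrative sketch only, not DaCe AD's actual formulation; the sizes and costs are invented, and we brute-force all assignments instead of calling an ILP solver.

```python
from itertools import product

# s[i] = 1 means intermediate i is stored in the forward pass;
# s[i] = 0 means it is recomputed during the backward pass.
# All numbers below are made up for illustration.
mem = [4, 1, 3, 2]      # memory needed to store each intermediate
recomp = [9, 2, 7, 5]   # cost to recompute it if not stored
budget = 5              # total memory constraint

best_plan, best_cost = None, float("inf")
for s in product([0, 1], repeat=len(mem)):
    used = sum(m for m, keep in zip(mem, s) if keep)
    if used > budget:
        continue  # this assignment violates the memory constraint
    # Objective: minimize total recomputation cost of unstored values.
    cost = sum(r for r, keep in zip(recomp, s) if not keep)
    if cost < best_cost:
        best_plan, best_cost = s, cost

print(best_plan, best_cost)  # (0, 0, 1, 1) 11: store the last two, recompute the rest
```

An ILP solver replaces the exhaustive loop with a scalable search over the same binary variables, linear constraint, and linear objective, which is what makes the approach viable for real programs with many intermediates.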
📝 Abstract
Automatic differentiation (AD) is a set of techniques that systematically applies the chain rule to compute the gradients of functions without requiring human intervention. Although the fundamentals of this technology were established decades ago, it is experiencing a renaissance as it plays a key role in efficiently computing gradients for backpropagation in machine learning algorithms. AD is also crucial for many applications in scientific computing domains, particularly emerging techniques that integrate machine learning models within scientific simulations and schemes. Existing AD frameworks have four main limitations: limited support for programming languages, the need for code modifications to ensure AD compatibility, limited performance on scientific computing codes, and a naive store-all approach to the forward-pass data required for gradient calculations. These limitations force domain scientists to manually compute the gradients for large problems. This work presents DaCe AD, a general, efficient automatic differentiation engine that requires no code modifications. DaCe AD uses a novel ILP-based algorithm to optimize the trade-off between storing and recomputing to achieve maximum performance within a given memory constraint. We showcase the generality of our method by applying it to NPBench, a suite of HPC benchmarks with diverse scientific computing patterns, where we outperform JAX, a Python framework with state-of-the-art general AD capabilities, by more than 92 times on average without requiring any code changes.
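The "systematic chain rule" at the core of reverse-mode AD can be sketched in a few lines: each operation records its inputs and local partial derivatives, and a backward sweep over the recorded graph accumulates gradients. This is a minimal toy illustration of the general technique (in the style of scalar tape-based AD), not DaCe AD's source-to-source engine, which operates on dataflow graphs instead of a runtime tape.

```python
class Var:
    """A scalar that records the operations producing it."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0
        self._parents = []  # list of (parent Var, local partial derivative)

    def __add__(self, other):
        out = Var(self.value + other.value)
        out._parents = [(self, 1.0), (other, 1.0)]
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value)
        out._parents = [(self, other.value), (other, self.value)]
        return out

    def backward(self):
        # Topologically order the graph so each node's gradient is fully
        # accumulated before it propagates to its parents (the chain rule).
        topo, visited = [], set()
        def build(v):
            if id(v) not in visited:
                visited.add(id(v))
                for parent, _ in v._parents:
                    build(parent)
                topo.append(v)
        build(self)
        self.grad = 1.0  # seed: d(output)/d(output) = 1
        for node in reversed(topo):
            for parent, local in node._parents:
                parent.grad += local * node.grad

x, y = Var(3.0), Var(2.0)
z = x * y + x          # z = x*y + x, so dz/dx = y + 1, dz/dy = x
z.backward()
print(x.grad, y.grad)  # 3.0 3.0
```

Note how the multiply stores its operands' values for the backward sweep: in large scientific codes these stored intermediates are exactly the memory cost that motivates the store-versus-recompute optimization described above.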