🤖 AI Summary
This work addresses the lack of formal semantics in deep learning architecture diagrams: the tensor-program identities they suggest still have to be proved in prose and by manual tensor-axis manipulation. The authors propose a graphical calculus for the structural fragment of tensor programming underlying einops, representing tensor axes as nested graded tubes around a base type. The tube boundary recovers the undirected tensor-network view of axes, while the directed interior keeps the operational reading of computation graphs. The central rewrite rule is grade-naturality, described as "sliding spectacles over tubes", which turns standard algebraic equivariance proofs into short diagrammatic derivations. The same rewrite system is also used to convert attention masks into pre-processing operations, recovering efficient implementations of sparse attention blocks.
📝 Abstract
Architecture diagrams are ubiquitous in deep learning, but they are usually only representational: the tensor-program identities they suggest are still proved by prose and tensor-axis manipulation. We introduce a formal graphical calculus for the structural fragment of tensor programming underlying einops, making such diagrams proof-enabling. Our calculus represents tensor axes as nested graded tubes around a base type. The tube boundary recovers the undirected tensor-network view of axes, while the directed interior retains the operational reading of computation graphs. The key rewrite is grade-naturality: sliding spectacles over tubes. Standard equivariance proofs become short diagrammatic derivations. We additionally demonstrate how our rewrite system may be applied to convert attention masks into pre-processing operations, recovering efficient implementations of sparse attention blocks.
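As a rough numerical illustration of the kind of identity the last sentence refers to, the sketch below checks that attention under a block-diagonal mask matches attention computed independently inside each block after a reshape. It is not the paper's calculus or its derivation: the block-diagonal mask, the NumPy formulation, and the helper names `masked_attention` and `blocked_attention` are assumptions chosen for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; entries equal to -inf map to exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def masked_attention(q, k, v, mask):
    # Dense formulation: compute all n x n scores, then mask before the softmax.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    return softmax(scores) @ v


def blocked_attention(q, k, v, block):
    # Mask as pre-processing: split the sequence axis into (blocks, block)
    # and attend within each block only, so masked scores are never computed.
    n, d = q.shape
    qb = q.reshape(n // block, block, d)
    kb = k.reshape(n // block, block, d)
    vb = v.reshape(n // block, block, v.shape[-1])
    scores = np.einsum("bid,bjd->bij", qb, kb) / np.sqrt(d)
    out = np.einsum("bij,bje->bie", softmax(scores), vb)
    return out.reshape(n, v.shape[-1])


rng = np.random.default_rng(0)
n, d, block = 8, 4, 2  # hypothetical sizes; block must divide n
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))

# Block-diagonal mask: position i may attend to j only within the same block.
block_id = np.arange(n) // block
mask = block_id[:, None] == block_id[None, :]

dense = masked_attention(q, k, v, mask)
sparse = blocked_attention(q, k, v, block)
print(np.allclose(dense, sparse))
```

The dense version builds the full score matrix and discards most of it, while the blocked version only ever forms `n // block` small score matrices. The check covers this one mask pattern only; other sparse patterns would need their own pre-processing step.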