🤖 AI Summary
Neural network parameter spaces exhibit permutation and scaling symmetries, which produce many functionally equivalent minima in the loss landscape and cause linear interpolation between independently trained models to cross high-loss regions. To address this, we integrate ScaleGMN, a symmetry-aware graph metanetwork, as the invariant encoder of an autoencoder framework that models both symmetries jointly without solving a combinatorial assignment problem. The framework standardizes parameters by mapping distinct networks into the same loss basin, enabling symmetry-invariant parameter alignment. The method supports both implicit neural representations (INRs) and convolutional neural networks (CNNs), preserves expressive capacity, and substantially improves interpolation smoothness and fusion stability. Experiments demonstrate lower loss along interpolation paths and superior ensemble performance compared to baselines. This establishes a scalable, hyperparameter-free paradigm for symmetry normalization, advancing model merging and interpolation in deep learning.
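To make these symmetries concrete, the short NumPy sketch below (illustrative only, not code from our repository) checks that permuting the hidden units of a ReLU layer, combined with a positive per-unit rescaling of the incoming weights and the inverse rescaling of the outgoing weights, leaves the network's output unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer ReLU network: y = W2 @ relu(W1 @ x + b1) + b2
d_in, d_hidden, d_out = 4, 8, 3
W1, b1 = rng.standard_normal((d_hidden, d_in)), rng.standard_normal(d_hidden)
W2, b2 = rng.standard_normal((d_out, d_hidden)), rng.standard_normal(d_out)

def forward(W1, b1, W2, b2, x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x = rng.standard_normal(d_in)

# Permutation symmetry: reorder the hidden units and reorder the next layer's columns to match.
perm = rng.permutation(d_hidden)
# Scaling symmetry (ReLU is positively homogeneous): scale each hidden unit's incoming
# weights and bias by s > 0 and divide its outgoing weights by the same factor.
s = rng.uniform(0.5, 2.0, size=d_hidden)

W1_t = (s[:, None] * W1)[perm]
b1_t = (s * b1)[perm]
W2_t = (W2 / s[None, :])[:, perm]

print(np.allclose(forward(W1, b1, W2, b2, x),
                  forward(W1_t, b1_t, W2_t, b2, x)))  # True: same function, different parameters
```

The transformed network computes exactly the same function as the original, which is why parameter-space operations such as averaging or interpolation can fail unless these symmetries are accounted for.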
📝 Abstract
Neural network parameterizations exhibit inherent symmetries that yield multiple equivalent minima within the loss landscape. Scale Graph Metanetworks (ScaleGMNs) explicitly leverage these symmetries through an architecture that is equivariant to both permutations and parameter scaling transformations. Previous work by Ainsworth et al. (2023) addressed permutation symmetries by solving a computationally intensive combinatorial assignment problem, demonstrating that leveraging permutation symmetries alone can map networks into a shared loss basin. In this work, we extend their approach by also incorporating scaling symmetries, presenting an autoencoder framework that uses ScaleGMNs as invariant encoders. Experimental results demonstrate that our method aligns Implicit Neural Representations (INRs) and Convolutional Neural Networks (CNNs) under both permutation and scaling symmetries without explicitly solving the assignment problem. This ensures that similar networks naturally converge to the same basin, facilitating model merging, i.e., smooth linear interpolation that avoids regions of high loss. The code is publicly available on our GitHub repository.
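As a rough illustration of the interpolation test behind these claims, the PyTorch sketch below (a hypothetical helper; function and variable names are ours, not the repository's) evaluates the loss along the straight line between two parameter sets. A pronounced spike at intermediate mixing coefficients indicates a barrier between basins, whereas aligned networks should yield a nearly flat curve:

```python
import copy
import torch

def interpolation_losses(model_a, model_b, loss_fn, data_loader, num_points=11):
    """Evaluate the loss along theta(t) = (1 - t) * theta_A + t * theta_B for t in [0, 1].

    Assumes model_a and model_b share the same architecture and that loss_fn
    returns a mean-reduced loss. Batch-norm buffers are blended naively here
    for simplicity.
    """
    state_a, state_b = model_a.state_dict(), model_b.state_dict()
    probe = copy.deepcopy(model_a)  # scratch model used to evaluate each interpolated point
    losses = []
    for t in torch.linspace(0.0, 1.0, num_points):
        blended = {k: (1 - t) * state_a[k] + t * state_b[k] for k in state_a}
        probe.load_state_dict(blended)
        probe.eval()
        total, count = 0.0, 0
        with torch.no_grad():
            for x, y in data_loader:
                total += loss_fn(probe(x), y).item() * len(x)
                count += len(x)
        losses.append(total / count)
    return losses
```

In this setup, one endpoint would be a network re-parameterized by the alignment procedure; a low, flat loss curve along the path is the signature of the two models sharing a basin.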