Constructing VAE Latent Spaces with Prescribed Topology

📅 2026-06-05

🤖 AI Summary

Standard variational autoencoders use Gaussian priors, which cannot match data manifolds with non-Euclidean topology (for example, periodic or bounded factors) and therefore yield distorted representations. This work proposes a topology-aware latent space framework that constructs factorized priors for manifolds that decompose into products of circles, intervals, and lines, together with quotients of such products by finite groups. The factorized design gives analytically tractable, decoupled KL divergences, so each latent factor can be shaped independently. Differentiable coordinate transformations let standard networks output non-Euclidean parameters with smooth gradients, group-invariant decoding makes identified points on quotient manifolds produce identical outputs, and anchor constraints fix the coordinate system relative to the data. The framework catalogues reparameterizable encoder-prior pairs and, on synthetic manifolds and on rotated and cyclically shifted MNIST, outperforms the Gaussian-prior baseline at all practically relevant regularization strengths.
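
As a rough illustration of what a differentiable coordinate transformation can look like, the sketch below maps unconstrained network outputs to a circular coordinate and to a bounded coordinate. The function names `to_angle` and `to_interval`, and the choice of `atan2` and a logistic sigmoid, are assumptions made for this example and are not taken from the paper, whose transformations may differ.

```python
import math


def to_angle(a, b):
    """Map an unconstrained pair (a, b) to an angle in [-pi, pi].

    Assumption for illustration: the network emits two reals and only
    their direction matters, so any positive rescaling of (a, b) gives
    the same angle. The map is smooth everywhere except at the origin.
    """
    return math.atan2(b, a)


def to_interval(u, low=0.0, high=1.0):
    """Map an unconstrained real u to a value between low and high.

    Uses the logistic sigmoid written via tanh, which avoids overflow
    for large |u|: sigmoid(u) = 0.5 * (1 + tanh(u / 2)).
    """
    return low + (high - low) * 0.5 * (1.0 + math.tanh(u / 2.0))
```

In this sketch a circular latent factor would be fed by two raw outputs and a bounded factor by one, so the network itself never has to emit values that respect the periodic or bounded support directly.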

📝 Abstract

Variational autoencoders (VAEs) learn low-dimensional latent representations of high-dimensional data. When the data lies on a manifold with non-Euclidean topology, the standard Gaussian prior introduces a topological mismatch that degrades reconstruction quality and prevents faithful representation. We present a constructive mathematical framework that resolves this mismatch for all manifolds that admit a product covering space. These are manifolds expressible as products of elementary factors (circles, intervals, or lines) or as quotients of such products by a finite symmetry group. The class includes cylinders, tori, Möbius strips, Klein bottles, and real projective spaces. Factorized distributions over the elementary factors yield product topologies with closed-form, decoupled KL divergences, so that each latent factor can be shaped independently while keeping training tractable. We catalogue reparametrizable encoder-prior pairs for periodic, bounded, and unbounded supports, and provide coordinate transformations that allow standard neural networks to output non-Euclidean parameters with smooth gradients. For quotient manifolds, the decoder receives group-invariant features of the covering-space coordinates, so that identified points produce identical outputs. Anchor constraints fix the coordinate system relative to the data or create soft topological holes. Experiments on synthetic manifolds and real-image datasets (rotated and cyclically shifted MNIST) confirm that a topology-matched prior aligns KL regularization with the data manifold. The resulting topology-aware models outperform the Gaussian baseline at all practically relevant regularization strengths. The code is available at https://github.com/JvHulst/VAE-Topology.
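
To make the idea of group-invariant decoder features concrete, here is a minimal sketch for one of the listed examples, the Möbius strip. It assumes the strip is modelled as the cylinder with coordinates (theta, t), theta on the circle and t in [-1, 1], with the points (theta, t) and (theta + pi, -t) identified. The function name `mobius_invariant_features` and this particular feature set are hypothetical choices for illustration; the paper's construction may use different features.

```python
import math


def mobius_invariant_features(theta, t):
    """Features of (theta, t) that are unchanged under the identification
    (theta, t) ~ (theta + pi, -t).

    cos(2*theta) and sin(2*theta) have period pi in theta, and the
    products t*cos(theta), t*sin(theta) are invariant because both
    factors flip sign together under the group action.
    """
    return (
        math.cos(2.0 * theta),
        math.sin(2.0 * theta),
        t * math.cos(theta),
        t * math.sin(theta),
    )
```

A decoder that only sees these four numbers cannot distinguish (theta, t) from (theta + pi, -t), which is the property the abstract describes: identified points produce identical outputs. Points that are not identified, such as (theta, t) and (theta, -t) with t nonzero, still map to different features.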

Problem

Research questions and friction points this paper is trying to address.

topological mismatch

non-Euclidean topology

variational autoencoders

latent space

manifold representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

topology-aware VAE

non-Euclidean latent space

product covering space