🤖 AI Summary
Transformer embedding spaces exhibit severe anisotropy: embeddings concentrate in a narrow cone, which degrades representations and limits downstream performance. To address this, we propose a geometric regularization method grounded in persistent homology: we build a Vietoris–Rips filtration over the contextual embeddings and improve isotropy by maximizing the persistent entropy of the resulting barcode. The approach requires no architectural modification, adds no inference overhead, and avoids reparameterization. Experiments show that the regularizer significantly lowers anisotropy during fine-tuning, improves performance on downstream tasks, accelerates fine-tuning convergence, and enhances generalization. The core contribution is the use of persistent-entropy-driven topological regularization for embedding space optimization, offering an interpretable, non-intrusive way to control the geometry of learned representations.
📝 Abstract
Although transformer-based models dominate the field of deep learning, various studies of their embedding space have shown that they suffer from the "representation degeneration problem": embeddings tend to be distributed in a narrow cone, making the latent space highly anisotropic. Increasing isotropy has been shown to improve performance on downstream tasks in both static and contextual language models. However, most approaches either add inference overhead or require a substantial amount of data for model reparametrization. We propose a novel regularization technique based on simplicial geometry to improve the isotropy of latent representations. The core idea of our method is to maximize the persistent entropy of barcodes obtained by applying the Vietoris-Rips filtration to contextual embeddings in the underlying latent space. We demonstrate that the method increases downstream performance while significantly lowering anisotropy during fine-tuning, by exploiting existing geometric structures instead of reparametrization.
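To make the quantity being maximized concrete, the following is a minimal sketch of persistent entropy, not the authors' implementation. It assumes only 0-dimensional homology is used (the abstract does not say which homology dimensions enter the loss), in which case the finite bars of a Vietoris-Rips barcode are born at 0 and die at the edge lengths of a Euclidean minimum spanning tree. The function names `h0_bar_lengths` and `persistent_entropy` are hypothetical. For bar lengths l_i with sum L, the entropy is E = -sum_i (l_i / L) log(l_i / L).

```python
import numpy as np


def h0_bar_lengths(points):
    """Finite H0 bar lengths of a Vietoris-Rips filtration (assumption: H0 only).

    Every H0 bar is born at 0, and the finite ones die at the edge lengths
    of a Euclidean minimum spanning tree, built here with Prim's algorithm.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known connection of each point to the tree
    lengths = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        lengths.append(candidates[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.array(lengths)


def persistent_entropy(lengths):
    """Shannon entropy of the normalized bar lengths of a barcode."""
    lengths = np.asarray(lengths, dtype=float)
    lengths = lengths[lengths > 0]  # zero-length bars carry no mass
    if lengths.size == 0:
        return 0.0
    p = lengths / lengths.sum()
    return float(-(p * np.log(p)).sum())


# Evenly spread points give equal bars and maximal entropy; a tight cluster
# with one distant point gives a barcode dominated by a single long bar.
even = [[0.0], [1.0], [2.0], [3.0]]
clustered = [[0.0], [0.1], [0.2], [10.0]]
print(persistent_entropy(h0_bar_lengths(even)))
print(persistent_entropy(h0_bar_lengths(clustered)))
```

The entropy is largest when all bars have equal length, which is why maximizing it pushes embeddings toward a more uniform spread. Using it as a training regularizer would additionally require a differentiable implementation over batches of contextual embeddings, which this NumPy sketch does not attempt.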