🤖 AI Summary
Few-shot font generation (FFG) is critical for constructing fonts in low-resource languages, yet existing methods generalize poorly to unseen characters—especially under large inter-glyph style variation—because content and style remain severely entangled. Building on MX-Font's Mixture-of-Experts (MoE) design for adaptive component extraction, the proposed MX-Font++ introduces (1) a Heterogeneous Aggregation Experts (HAE) module that aggregates features across both the channel and spatial dimensions, strengthening the feature extractor, and (2) a content-style homogeneity loss that explicitly enforces disentanglement during training. Evaluated on multiple benchmarks, MX-Font++ improves fidelity on unseen characters and cross-font style consistency, achieving state-of-the-art performance both qualitatively—with superior visual quality—and quantitatively—outperforming prior methods on standard metrics including SSIM, LPIPS, and character-level accuracy.
📝 Abstract
Few-shot Font Generation (FFG) aims to create new font libraries from a limited number of reference glyphs, with crucial applications in digital accessibility and equity for low-resource languages, especially in multilingual artificial intelligence systems. Although existing methods show promising performance, generalizing to unseen characters in low-resource languages remains a significant challenge, especially when font glyphs vary considerably across the training set. MX-Font treats the content of a character from the perspective of local components, employing a Mixture of Experts (MoE) approach to adaptively extract components for better transfer. However, the lack of a robust feature extractor prevents it from adequately decoupling content and style, leading to sub-optimal generation results. To alleviate these problems, we propose Heterogeneous Aggregation Experts (HAE), a powerful feature extraction expert that supports downstream decoupling of content and style by aggregating information across the channel and spatial dimensions. Additionally, we propose a novel content-style homogeneity loss to strengthen this disentanglement. Extensive experiments on several datasets demonstrate that our MX-Font++ yields superior visual results in FFG and effectively outperforms state-of-the-art methods. Code and data are available at https://github.com/stephensun11/MXFontpp.
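The two ingredients named in the abstract — channel-plus-spatial feature aggregation and a content-style homogeneity loss — can be sketched as below. This is a hedged illustration only, not the authors' implementation: the squeeze-and-excitation-style channel gate, the single-map spatial gate, the cosine-similarity form of the loss, and all weight shapes are assumptions made for the sake of a runnable example.

```python
import numpy as np

def channel_spatial_aggregate(x, w1, w2, w_sp):
    """Hypothetical HAE-style aggregation over a feature map x of
    shape (C, H, W): first gate channels via global average pooling
    and a tiny two-layer MLP, then gate spatial locations via a
    learned projection over channels. w1, w2, w_sp are illustrative
    parameters, not taken from the paper."""
    # Channel gate: squeeze (global average pool) -> MLP -> sigmoid
    squeezed = x.mean(axis=(1, 2))                   # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)          # ReLU bottleneck
    gate_c = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # (C,) in (0, 1)
    x = x * gate_c[:, None, None]                    # reweight channels
    # Spatial gate: project channels down to one attention map (H, W)
    att = 1.0 / (1.0 + np.exp(-np.tensordot(w_sp, x, axes=([0], [0]))))
    return x * att[None, :, :]                       # reweight locations

def homogeneity_loss(content, style):
    """Hypothetical content-style homogeneity loss: penalize cosine
    similarity between the content and style embeddings of the same
    glyph, pushing the two representations apart (disentanglement).
    The exact loss form in the paper may differ."""
    c = content / np.linalg.norm(content)
    s = style / np.linalg.norm(style)
    return abs(float(c @ s))
```

A usage note: in a full model such a block would sit inside each expert of the MoE extractor, and the loss would be summed over the glyphs in a training batch alongside the generation objectives.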