🤖 AI Summary
How experts in sparse Mixture-of-Experts (MoE) models specialize without supervision remains a key challenge in deep learning interpretability.
Method: We propose the Sparse Mixture-of-Experts Variational Autoencoder (SMoE-VAE), which replaces conventional supervised routing with an unsupervised, data-driven routing mechanism within a VAE framework. Evaluated on the QuickDraw dataset, SMoE-VAE achieves superior reconstruction performance compared to a supervised routing baseline guided by ground-truth labels.
Contribution/Results: t-SNE visualizations and reconstruction analysis show that experts spontaneously identify semantic substructures that cross category boundaries, revealing an intrinsic, task-driven organization of the data that transcends human-defined class labels. Furthermore, ablation studies reveal a trade-off between dataset scale and the degree of expert specialization. Our work provides both theoretical insight and empirical evidence for designing efficient, interpretable MoE architectures grounded in unsupervised specialization.
📝 Abstract
Understanding the internal organization of neural networks remains a fundamental challenge in deep learning interpretability. We address this challenge by exploring a novel Sparse Mixture of Experts Variational Autoencoder (SMoE-VAE) architecture. We test our model on the QuickDraw dataset, comparing unsupervised expert routing against a supervised baseline guided by ground-truth labels. Surprisingly, we find that unsupervised routing consistently achieves superior reconstruction performance. The experts learn to identify meaningful sub-categorical structures that often transcend human-defined class boundaries. Through t-SNE visualizations and reconstruction analysis, we investigate how MoE models uncover fundamental data structures that are more aligned with the model's objective than predefined labels. Furthermore, our study on the impact of dataset size provides insights into the trade-offs between data quantity and expert specialization, offering guidance for designing efficient MoE architectures.
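The core routing idea can be sketched as follows. This is a minimal, hypothetical illustration of hard top-1 unsupervised routing in a sparse MoE decoder, with made-up dimensions and plain linear experts; it is not the authors' implementation, and omits the encoder, KL term, and training loop of the full VAE:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 8-d latent, 4 experts, 28x28 output.
LATENT, N_EXPERTS, OUT = 8, 4, 784

# Gating network: a single linear layer over the latent code z.
W_gate = rng.normal(size=(LATENT, N_EXPERTS))
# One small linear decoder per expert.
W_experts = rng.normal(size=(N_EXPERTS, LATENT, OUT))

def route_and_decode(z):
    """Unsupervised top-1 routing: pick the expert with the highest gate score
    for each sample and decode z with that expert only (sparse activation)."""
    logits = z @ W_gate                          # (batch, n_experts) gate scores
    chosen = logits.argmax(axis=1)               # hard top-1 expert per sample
    # Apply each sample's chosen expert decoder to its latent code.
    recon = np.einsum('bl,blo->bo', z, W_experts[chosen])
    return recon, chosen

z = rng.normal(size=(5, LATENT))                 # a batch of 5 latent codes
recon, chosen = route_and_decode(z)
```

Because routing depends only on the latent code and the reconstruction objective, and never on class labels, any per-expert specialization that emerges is data-driven, which is what the t-SNE analysis in the paper probes.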