AI Summary
Existing VAE-based sequential recommendation models employ unimodal Gaussian priors, limiting their capacity to capture users' diverse, multimodal interests and thereby constraining recommendation performance. To address this, we propose SIGMA, the first variational autoencoder framework incorporating a semantically aligned Gaussian mixture prior that explicitly models multiple semantically coherent user interest components. Methodologically, SIGMA (1) introduces a probabilistic multi-interest extraction module; (2) formulates a multi-interest-aware evidence lower bound (ELBO) objective to enable interest disentanglement and interpretable representation learning; and (3) leverages implicit hyper-category guidance to enforce semantic alignment across interest components. Extensive experiments on multiple public benchmarks demonstrate that SIGMA consistently outperforms state-of-the-art VAE-based methods, achieving average improvements of 3.2–5.8% in AUC and Recall. The code is publicly available.
Abstract
Variational AutoEncoder (VAE) based Sequential Recommendation (SR), which learns a continuous distribution for each user-item interaction sequence rather than a deterministic embedding, is robust against data deficiency and achieves strong performance. However, existing VAE-based SR models assume a unimodal Gaussian distribution as the prior over sequence representations, which restricts their ability to capture complex user interests and limits recommendation performance when users have more than one interest. Since it is common for users to hold multiple disparate interests, we argue that a multimodal prior distribution is more appropriate for SR scenarios than a unimodal one. Therefore, in this paper, we propose a novel VAE-based SR model named SIGMA. SIGMA assumes that the prior of the sequence representation conforms to a Gaussian mixture distribution, where each component semantically corresponds to one of the user's multiple interests. For multi-interest elicitation, SIGMA includes a probabilistic multi-interest extraction module that learns a unimodal Gaussian distribution for each interest according to implicit item hyper-categories. Additionally, to incorporate the multimodal interests into sequence representation learning, SIGMA constructs a multi-interest-aware ELBO that is compatible with the Gaussian mixture prior. Extensive experiments on public datasets demonstrate the effectiveness of SIGMA. The code is available at https://github.com/libeibei95/SIGMA.
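With a Gaussian mixture prior, the KL term of the ELBO no longer has a closed form (unlike the standard-Gaussian case) and is typically estimated by sampling. The following is a minimal NumPy sketch of a Monte Carlo estimate of KL(q‖p) between a diagonal-Gaussian posterior and a mixture prior; it is illustrative only, not the paper's implementation, and all function names and parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_gaussian(x, mu, sigma):
    # Log density of a diagonal Gaussian, summed over the last dimension.
    return np.sum(
        -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2),
        axis=-1,
    )

def mc_kl_to_mixture(mu_q, sigma_q, weights, mus, sigmas, n_samples=2000):
    """Monte Carlo estimate of KL(q || p), where q is a diagonal Gaussian
    posterior and p is a Gaussian mixture prior whose components would
    correspond to the user's interests. There is no closed form for the
    KL to a mixture, hence the sampling."""
    d = mu_q.shape[-1]
    # Reparameterized samples z ~ q.
    z = mu_q + sigma_q * rng.standard_normal((n_samples, d))
    log_q = log_gaussian(z, mu_q, sigma_q)
    # log p(z) = logsumexp_k [ log w_k + log N(z; mu_k, sigma_k) ].
    comp = np.stack(
        [np.log(w) + log_gaussian(z, m, s)
         for w, m, s in zip(weights, mus, sigmas)],
        axis=0,
    )
    log_p = np.logaddexp.reduce(comp, axis=0)
    return np.mean(log_q - log_p)

# A posterior matching one of two equally weighted, well-separated
# components should give a KL near ln 2, since that component carries
# only half of the prior mass; a posterior far from both components
# gives a much larger KL.
mu0, mu1, ones = np.zeros(2), np.full(2, 6.0), np.ones(2)
kl_near = mc_kl_to_mixture(mu0, ones, [0.5, 0.5], [mu0, mu1], [ones, ones])
kl_far = mc_kl_to_mixture(np.array([3.0, 0.0]), ones,
                          [0.5, 0.5], [mu0, mu1], [ones, ones])
```

In a trained model, minimizing this term pulls each sequence representation toward one (or a few) semantically coherent mixture components rather than toward a single global mean.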