🤖 AI Summary
Vector representations struggle to model set-theoretic relationships (e.g., "comedy AND action BUT NOT romance") in sparse-data personalized recommendation. To address this, the paper proposes BoxRec, a recommendation framework built on learnable hyper-rectangular (box) embeddings. BoxRec represents each user and attribute as a geometric box and formulates recommendation as matrix completion in which rows are set-theoretically dependent, so that intersection, negation, and other set operations are supported directly. Similarity is expressed through the Jaccard index of the boxes, and queries with set-theoretic constraints are answered by geometric operations on the embeddings themselves, which keeps the representation interpretable as a trainable Venn diagram. Experiments show that BoxRec outperforms vector-based neural baselines on both simple and complex recommendation queries, by up to 30% overall.
📝 Abstract
Personalized item recommendation typically suffers from data sparsity, which is most often addressed by learning vector representations of users and items via low-rank matrix factorization. While this effectively densifies the matrix by assuming users and items can be represented by linearly dependent latent features, it does not capture more complicated interactions. For example, vector representations struggle with set-theoretic relationships, such as negation and intersection, e.g. recommending a movie that is "comedy and action, but not romance". In this work, we formulate the problem of personalized item recommendation as matrix completion where rows are set-theoretically dependent. To capture this set-theoretic dependence, we represent each user and attribute by a hyper-rectangle or box (i.e. a Cartesian product of intervals). Box embeddings can intuitively be understood as trainable Venn diagrams, and thus not only inherently represent similarity (via the Jaccard index), but also naturally and faithfully support arbitrary set-theoretic relationships. Queries involving set-theoretic constraints can be efficiently computed directly on the embedding space by performing geometric operations on the representations. We empirically demonstrate that box embeddings outperform vector-based neural methods on both simple and complex item recommendation queries, by up to 30% overall.
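The geometric operations described in the abstract can be illustrated with a minimal sketch. It assumes hard, fixed (untrained) boxes in two dimensions; the names `Box`, `intersect`, `volume`, `jaccard`, and `volume_and_not` are hypothetical and chosen for illustration only, and the abstract does not specify how the paper parameterizes or trains its boxes. The sketch shows how intersection, the Jaccard index, and a query with negation reduce to arithmetic on interval endpoints.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Box:
    """Axis-aligned box: a Cartesian product of intervals [lo[i], hi[i]]."""
    lo: tuple
    hi: tuple


def intersect(a: Box, b: Box) -> Box:
    # Per-dimension interval intersection: larger lower bound, smaller upper bound.
    return Box(
        tuple(max(x, y) for x, y in zip(a.lo, b.lo)),
        tuple(min(x, y) for x, y in zip(a.hi, b.hi)),
    )


def volume(box: Box) -> float:
    # Side lengths are clamped at zero so an empty intersection has volume 0.
    return prod(max(h - l, 0.0) for l, h in zip(box.lo, box.hi))


def jaccard(a: Box, b: Box) -> float:
    # |A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|.
    inter = volume(intersect(a, b))
    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0


def volume_and_not(include: list, exclude: Box) -> float:
    # Volume of (I_1 ∩ ... ∩ I_k) \ E, using vol(X \ E) = vol(X) - vol(X ∩ E).
    x = include[0]
    for b in include[1:]:
        x = intersect(x, b)
    return volume(x) - volume(intersect(x, exclude))


# Illustrative (made-up) attribute boxes, not learned from data.
comedy = Box((0.0, 0.0), (2.0, 2.0))
action = Box((1.0, 0.0), (3.0, 2.0))
romance = Box((1.0, 1.0), (2.0, 3.0))

print(jaccard(comedy, action))                     # 2 / 6, i.e. one third
print(volume_and_not([comedy, action], romance))   # 2 - 1 = 1.0
```

In this toy setting, the region for "comedy and action, but not romance" is measured exactly by subtracting the part of the comedy-action intersection that also lies inside the romance box. A trainable version would need a differentiable treatment of the clamped side lengths, which this sketch does not attempt.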