Bayesian nonparametric Mallows model for clustering preference data

📅 2026-06-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the joint inference of a consensus ranking, individual preferences, and clustering structure from preference data without requiring a pre-specified number of clusters. To this end, it introduces Bayesian nonparametrics into the Mallows model for the first time, constructing an extended framework based on Dirichlet process mixtures that accommodates both incomplete rankings and pairwise comparison data. The proposed approach enables simultaneous posterior inference of the number of clusters and cluster assignments via Markov chain Monte Carlo (MCMC) sampling. Experimental results demonstrate that the method accurately recovers the true number of clusters in simulated data, outperforming finite mixture models, and significantly enhances personalized recommendation performance on real-world movie rating data, particularly excelling in predicting missing ratings.
📝 Abstract
Preference learning refers to the learning of latent patterns from ranking and preference data of different kinds. Typical aims of preference learning are to infer a shared consensus ranking, to learn individual-level preferences, and to perform unsupervised clustering. The Mallows model is among the few approaches that can achieve all these objectives jointly. Previous work has developed computationally tractable methods for Bayesian inference based on a MCMC Metropolis-Hastings scheme, where clustering is performed via a finite mixture of Mallows models. Inference on the number of clusters is then conducted a posteriori. Here we propose a Bayesian nonparametric Mallows model, based on a Dirichlet process mixture model. This allows joint inference on the number of non-empty clusters and on the clustering allocation, as well as posterior inference on cluster-specific parameters. The implementation of the proposed sampling algorithm is integrated into the existing R package BayesMallows, which also supports data in the form of incomplete rankings and pairwise comparisons. Simulated data show good performance of the nonparametric model compared to a finite mixture model in terms of recovery of the correct number of clusters, while empirical data on movie ratings show the model's effectiveness in providing personalized movie recommendations on discarded ratings.
Problem

Research questions and friction points this paper is trying to address.

preference learning
clustering
Mallows model
Bayesian nonparametrics
Dirichlet process
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian nonparametrics
Mallows model
Dirichlet process mixture
preference learning
clustering
🔎 Similar Papers
No similar papers found.