Scalable Variational Inference for Multinomial Probit Models under Large Choice Sets and Sample Sizes

📅 2025-07-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional MCMC and maximum-likelihood estimation for high-dimensional multinomial probit models suffer from prohibitive computational cost and poor scalability. Method: We propose a scalable conditional variational inference framework that jointly models latent utility correlations via neural embeddings, enforces positive definiteness of the covariance matrix through reparameterization, and replaces high-dimensional truncated-Gaussian sampling with a Gumbel-Softmax approximation for end-to-end differentiability. The framework adopts a variational-autoencoder architecture and employs straight-through estimators for efficient gradient propagation. Contribution/Results: On a benchmark with 20 alternatives and one million observations, our method calibrates parameters in just 28 minutes, 36× faster than state-of-the-art baselines, while significantly improving parameter recovery accuracy. This establishes a new, efficient, and accurate Bayesian inference paradigm for large-scale discrete choice modeling.

📝 Abstract
The multinomial probit (MNP) model is widely used to analyze categorical outcomes due to its ability to capture flexible substitution patterns among alternatives. Conventional likelihood-based and Markov chain Monte Carlo (MCMC) estimators become computationally prohibitive in high-dimensional choice settings. This study introduces a fast and accurate conditional variational inference (CVI) approach to calibrate MNP model parameters that is scalable to large samples and large choice sets. A flexible variational distribution over correlated latent utilities is defined using neural embeddings, and a reparameterization trick ensures the positive definiteness of the resulting covariance matrix. The resulting CVI estimator resembles a variational autoencoder, with the variational model acting as the encoder and the MNP's data-generating process as the decoder. Straight-through estimation and a Gumbel-Softmax approximation are adopted for the argmax operation that selects the alternative with the highest latent utility. This eliminates the need to sample from high-dimensional truncated Gaussian distributions, significantly reducing computational costs as the number of alternatives grows. The proposed method achieves parameter recovery comparable to MCMC, and calibrates MNP parameters with 20 alternatives and one million observations in approximately 28 minutes, roughly 36 times faster than existing benchmarks, while recovering model parameters more accurately.
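The covariance reparameterization mentioned in the abstract can be illustrated with a standard Cholesky-based construction: unconstrained parameters fill a lower-triangular factor whose diagonal is exponentiated, so the product L·Lᵀ is guaranteed positive definite. This is a minimal numpy sketch of that generic technique, not the paper's exact parameterization; the function name and shapes are illustrative.

```python
import numpy as np

def build_covariance(theta, J):
    """Map an unconstrained vector theta (length J*(J+1)/2) to a
    positive-definite J x J covariance matrix via a Cholesky factor.
    Illustrative only; the paper's construction may differ in detail."""
    L = np.zeros((J, J))
    L[np.tril_indices(J)] = theta            # fill the lower triangle
    d = np.diag_indices(J)
    L[d] = np.exp(L[d])                      # strictly positive diagonal
    return L @ L.T                           # symmetric positive definite

rng = np.random.default_rng(0)
J = 4
theta = rng.normal(size=J * (J + 1) // 2)    # any real vector is valid
Sigma = build_covariance(theta, J)
```

Because the map is smooth in `theta`, gradients can flow through `Sigma` during variational optimization without any positive-definiteness constraint handling.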
Problem

Research questions and friction points this paper is trying to address.

Scalable inference for multinomial probit models with large datasets
Efficient parameter calibration without high-dimensional sampling
Accurate variational inference for large choice sets and samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conditional variational inference for MNP models
Neural embeddings for variational distribution
Gumbel-Softmax relaxation of the argmax operation
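The Gumbel-Softmax and straight-through ideas listed above can be sketched as follows: Gumbel noise is added to the latent utilities, a temperature-scaled softmax gives a differentiable relaxation, and the forward pass snaps to the hard one-hot choice while an autograd framework would route gradients through the soft sample. A minimal numpy sketch of the standard technique; the temperature and mixing scheme are common defaults, not necessarily the paper's settings.

```python
import numpy as np

def gumbel_softmax_choice(utilities, tau, rng):
    """Relaxed argmax over latent utilities (sketch, not the authors' code).
    Returns the hard one-hot choice and its soft relaxation."""
    # Gumbel(0, 1) noise via inverse transform sampling
    g = -np.log(-np.log(rng.uniform(size=utilities.shape)))
    z = (utilities + g) / tau
    z -= z.max()                              # numerical stability
    soft = np.exp(z) / np.exp(z).sum()        # differentiable relaxation
    # Straight-through: forward pass uses the hard one-hot vector; in an
    # autograd framework the backward pass would use `soft` instead,
    # e.g. hard + (soft - soft.detach()) in PyTorch.
    hard = np.eye(len(utilities))[soft.argmax()]
    return hard, soft

rng = np.random.default_rng(1)
hard, soft = gumbel_softmax_choice(np.array([0.5, 2.0, -1.0]), tau=0.5, rng=rng)
```

As `tau` shrinks, `soft` concentrates on the maximal entry, trading gradient smoothness for fidelity to the discrete choice.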
Gyeongjun Kim
Department of Urban Engineering, Chung Ang University, Korea.
Yeseul Kang
Department of Urban Engineering, Chung Ang University, Korea.
Lucas Kock
National University of Singapore
Prateek Bansal
National University of Singapore
Cognitive Psychology · Econometrics · Bayesian Machine Learning · Travel Behaviour · Transport Planning
Keemin Sohn
Department of Urban Engineering, Chung Ang University, Korea.; Department of Smart City, Chung Ang University, Korea.