Scalable Variational Inference for Multinomial Probit Models under Large Choice Sets and Sample Sizes

📅 2025-07-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional MCMC and maximum-likelihood estimation for high-dimensional multinomial probit models suffer from prohibitive computational cost and poor scalability. Method: We propose a scalable conditional variational inference framework that jointly models latent utility correlations via neural embeddings, enforces positive definiteness of the covariance matrix through reparameterization, and replaces high-dimensional truncated-Gaussian sampling with a Gumbel-Softmax approximation for end-to-end differentiability. The framework adopts a variational-autoencoder architecture and employs straight-through estimators for efficient gradient propagation. Contribution/Results: On a benchmark with 20 alternatives and one million observations, our method calibrates parameters in just 28 minutes, 36× faster than state-of-the-art baselines, while significantly improving parameter recovery accuracy. This establishes a new, efficient, and accurate Bayesian inference paradigm for large-scale discrete choice modeling.

📝 Abstract
The multinomial probit (MNP) model is widely used to analyze categorical outcomes due to its ability to capture flexible substitution patterns among alternatives. Conventional likelihood-based and Markov chain Monte Carlo (MCMC) estimators become computationally prohibitive in high-dimensional choice settings. This study introduces a fast and accurate conditional variational inference (CVI) approach to calibrate MNP model parameters that is scalable to large samples and large choice sets. A flexible variational distribution over correlated latent utilities is defined using neural embeddings, and a reparameterization trick ensures the positive definiteness of the resulting covariance matrix. The resulting CVI estimator resembles a variational autoencoder, with the variational model acting as the encoder and the MNP's data-generating process as the decoder. Straight-through estimation and a Gumbel-Softmax approximation are adopted for the argmax operation that selects the alternative with the highest latent utility. This eliminates the need to sample from high-dimensional truncated Gaussian distributions, significantly reducing computational costs as the number of alternatives grows. The proposed method achieves parameter recovery comparable to MCMC, and calibrates MNP parameters with 20 alternatives and one million observations in approximately 28 minutes, roughly 36 times faster than existing benchmarks, while recovering model parameters more accurately.
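The covariance reparameterization mentioned in the abstract can be illustrated with a standard Cholesky-based construction: unconstrained parameters fill a lower-triangular factor whose diagonal is exponentiated, so the product L·Lᵀ is guaranteed positive definite. This is a minimal numpy sketch of that generic technique, not the paper's exact parameterization; the function name and shapes are illustrative.

```python
import numpy as np

def build_covariance(theta, J):
    """Map an unconstrained vector theta (length J*(J+1)/2) to a
    positive-definite J x J covariance matrix via a Cholesky factor.
    Illustrative only; the paper's construction may differ in detail."""
    L = np.zeros((J, J))
    L[np.tril_indices(J)] = theta            # fill the lower triangle
    d = np.diag_indices(J)
    L[d] = np.exp(L[d])                      # strictly positive diagonal
    return L @ L.T                           # symmetric positive definite

rng = np.random.default_rng(0)
J = 4
theta = rng.normal(size=J * (J + 1) // 2)    # any real vector is valid
Sigma = build_covariance(theta, J)
```

Because the map is smooth in `theta`, gradients can flow through `Sigma` during variational optimization without any positive-definiteness constraint handling.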
Problem

Research questions and friction points this paper is trying to address.

Scalable inference for multinomial probit models with large datasets
Efficient parameter calibration without high-dimensional sampling
Accurate variational inference for large choice sets and samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conditional variational inference for MNP models
Neural embeddings for variational distribution
Gumbel-Softmax relaxation of the argmax operation
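The Gumbel-Softmax and straight-through ideas listed above can be sketched as follows: Gumbel noise is added to the latent utilities, a temperature-scaled softmax gives a differentiable relaxation, and the forward pass snaps to the hard one-hot choice while an autograd framework would route gradients through the soft sample. A minimal numpy sketch of the standard technique; the temperature and mixing scheme are common defaults, not necessarily the paper's settings.

```python
import numpy as np

def gumbel_softmax_choice(utilities, tau, rng):
    """Relaxed argmax over latent utilities (sketch, not the authors' code).
    Returns the hard one-hot choice and its soft relaxation."""
    # Gumbel(0, 1) noise via inverse transform sampling
    g = -np.log(-np.log(rng.uniform(size=utilities.shape)))
    z = (utilities + g) / tau
    z -= z.max()                              # numerical stability
    soft = np.exp(z) / np.exp(z).sum()        # differentiable relaxation
    # Straight-through: forward pass uses the hard one-hot vector; in an
    # autograd framework the backward pass would use `soft` instead,
    # e.g. hard + (soft - soft.detach()) in PyTorch.
    hard = np.eye(len(utilities))[soft.argmax()]
    return hard, soft

rng = np.random.default_rng(1)
hard, soft = gumbel_softmax_choice(np.array([0.5, 2.0, -1.0]), tau=0.5, rng=rng)
```

As `tau` shrinks, `soft` concentrates on the maximal entry, trading gradient smoothness for fidelity to the discrete choice.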
Gyeongjun Kim
Department of Urban Engineering, Chung Ang University, Korea.
Yeseul Kang
Department of Urban Engineering, Chung Ang University, Korea.
Lucas Kock
National University of Singapore
Prateek Bansal
National University of Singapore
Cognitive Psychology · Econometrics · Bayesian Machine Learning · Travel Behaviour · Transport Planning
Keemin Sohn
Department of Urban Engineering, Chung Ang University, Korea.; Department of Smart City, Chung Ang University, Korea.