Kantorovich Regression Analysis of Random Distributions with Mixed Predictors

📅 2026-03-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work proposes an interpretable regression framework based on optimal transport for settings where the response is a probability distribution and the predictors mix distributional and Euclidean variables. Kantorovich potentials are constructed from the Wasserstein barycenter to each individual distribution, and the displacement field of the response is modeled as a sum of the displacement fields induced by the individual predictors, each adjusted by a functional parameter. Because this structure is linear, Euclidean predictors can enter as scaling coefficients of c-concave parameter potentials. The approach is described as the first to combine Kantorovich potentials with mixed-type predictors, yielding an intrinsic regression model for distributional responses. The authors characterize the admissible classes of functional parameters, establish asymptotic theory through uniform convergence of the empirical Wasserstein loss, and derive a first-order optimization algorithm. Applications to housing price distributions and two-dimensional temperature distributions illustrate the model's flexibility and interpretability.
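The idea of writing the response displacement as a weighted sum of predictor displacements can be illustrated in one dimension, where the Wasserstein barycenter has a closed form: its quantile function is the average of the individual quantile functions. The sketch below is only a toy illustration under strong assumptions, not the authors' method: it uses a single distributional predictor, replaces the functional parameter with one scalar weight fitted by least squares, and runs on synthetic location-shifted data. The helper names (`displacement_fields`, `fit_scalar_weight`) are hypothetical.

```python
import numpy as np

def displacement_fields(samples, grid):
    """Return the barycenter quantile function and, for each sample,
    the displacement from the barycenter on the quantile grid (1-D only)."""
    Q = np.stack([np.quantile(s, grid) for s in samples])  # shape (n, m)
    Q_bar = Q.mean(axis=0)  # 1-D Wasserstein-2 barycenter in quantile form
    return Q_bar, Q - Q_bar

def fit_scalar_weight(D_x, D_y):
    """Least-squares scalar beta minimizing sum ||D_y - beta * D_x||^2.
    Assumption: a scalar stands in for the paper's functional parameter."""
    return float(np.sum(D_x * D_y) / np.sum(D_x * D_x))

# Synthetic data (assumption): the response shift is twice the predictor shift.
rng = np.random.default_rng(0)
grid = np.linspace(0.01, 0.99, 99)
n = 50
base = rng.normal(size=500)
shifts = rng.normal(size=n)
X = [base + s for s in shifts]        # predictor distributions
Y = [base + 2.0 * s for s in shifts]  # response distributions

Q_bar_x, D_x = displacement_fields(X, grid)
Q_bar_y, D_y = displacement_fields(Y, grid)
beta = fit_scalar_weight(D_x, D_y)

# Predicted response quantile function for the first observation.
pred = Q_bar_y + beta * D_x[0]
print(f"fitted weight: {beta:.3f}")
print("prediction is a valid quantile function:", bool(np.all(np.diff(pred) >= 0)))
```

In one dimension a prediction is a valid distribution only if its quantile function is nondecreasing, which is what the final check tests. In higher dimensions no such closed form is available and the barycenter and transport maps would have to be computed numerically.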

📝 Abstract
We study regression problems with distribution-valued responses and mixed distributional and Euclidean predictors. Under quadratic cost, the negative gradient of the Kantorovich potential represents, at each source location, the displacement to its matched location under the optimal transport map. By constructing potentials from the Wasserstein barycenter to individual distributions, the proposed Kantorovich regression model approximates the response displacement field as a sum of predictor displacement fields, each adjusted by a functional parameter. Owing to the linear structure, Euclidean predictors can enter as scaling coefficients of $c$-concave parameter potentials. We characterize functional parameter classes ensuring intrinsicness of the model, establish asymptotic theory through uniform convergence of the empirical Wasserstein loss, and derive Gâteaux derivatives leading to first-order optimization algorithms. Real data applications include a mixed-predictor analysis of housing price distributions and an analysis of two-dimensional temperature distributions, demonstrating the flexibility and interpretability of the proposed framework.
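For reference, the relation between potential and displacement mentioned in the abstract can be written out as follows. This is the standard Brenier-type statement, given here under assumptions the abstract does not spell out: the source measure $\mu$ is absolutely continuous, the cost is normalized as $c(x,y) = \tfrac{1}{2}\|x-y\|^2$, and $\varphi$ denotes a $c$-concave Kantorovich potential from $\mu$ to $\nu$. The symbols $\mu$, $\nu$, $\varphi$, and $T$ are notation chosen for this note and may differ from the paper's.

```latex
\begin{align*}
  c(x, y) &= \tfrac{1}{2}\,\|x - y\|^2, \\
  T(x) &= x - \nabla\varphi(x), \qquad T_{\#}\mu = \nu, \\
  W_2^2(\mu, \nu) &= \int \|x - T(x)\|^2 \, d\mu(x)
                   = \int \|\nabla\varphi(x)\|^2 \, d\mu(x).
\end{align*}
```

Here $-\nabla\varphi(x) = T(x) - x$ is the displacement of the point $x$ under the optimal map. Taking $\mu$ to be the Wasserstein barycenter gives one such displacement field per distribution, all defined on a common source, which is what allows the fields to be added together.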
Problem

Research questions and friction points this paper is trying to address.

distribution-valued response
mixed predictors
regression analysis
optimal transport
Wasserstein barycenter
Innovation

Methods, ideas, or system contributions that make the work stand out.

Kantorovich regression
Wasserstein barycenter
distribution-valued response
optimal transport
functional parameter