Scalable Approximate Algorithms for Optimal Transport Linear Models

📅 2025-04-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses generalized nonnegative linear regression with an entropically regularized optimal transport (OT) data-fitting term, incorporating arbitrary convex regularization on weights and composite convex losses between marginal distributions—applicable to spectral unmixing, music transcription, and related tasks. Methodologically, we propose the first unified, scalable approximate solver: a Sinkhorn-type iterative framework specifically designed for general convex structures, yielding concise, theoretically rigorous multiplicative update rules. The algorithm supports efficient parallel implementation, ensures convergence, and scales to large problems. Experimentally, our method significantly outperforms existing OT-based regression approaches in both accuracy and runtime, while markedly improving model generalization and practical applicability.

Technology Category

Application Category

📝 Abstract
Recently, linear regression models incorporating an optimal transport (OT) loss have been explored for applications such as supervised unmixing of spectra, music transcription, and mass spectrometry. However, these task-specific approaches often do not generalize readily to a broader class of linear models. In this work, we propose a novel algorithmic framework for solving a general class of non-negative linear regression models with an entropy-regularized OT datafit term, based on Sinkhorn-like scaling iterations. Our framework accommodates convex penalty functions on the weights (e.g. squared-$ell_2$ and $ell_1$ norms), and admits additional convex loss terms between the transported marginal and target distribution (e.g. squared error or total variation). We derive simple multiplicative updates for common penalty and datafit terms. This method is suitable for large-scale problems due to its simplicity of implementation and straightforward parallelization.
Problem

Research questions and friction points this paper is trying to address.

Generalizing linear regression with optimal transport loss
Solving non-negative linear regression with OT datafit
Enabling large-scale problems via scalable algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sinkhorn-like scaling for entropy-regularized OT
Convex penalty functions on model weights
Multiplicative updates for large-scale efficiency
🔎 Similar Papers
No similar papers found.
Tomasz Kacprzak
Tomasz Kacprzak
Swiss Data Science Center, Paul Scherrer Institute, ETH Zurich
Data ScienceCosmology
Francois Kamper
Francois Kamper
Senior Data Scientist Swiss Data Science Center EPFL
Graphical modelscausalityconvex optimisationtime seriesextreme value theory
M
Michael W. Heiss
PSI Center for Neutron and Muon Sciences CNM, 5232 Villigen PSI, Switzerland
G
Gianluca Janka
PSI Center for Neutron and Muon Sciences CNM, 5232 Villigen PSI, Switzerland
A
Ann M. Dillner
Air Quality Research Center, University of California, Davis, CA 95618, USA
Satoshi Takahama
Satoshi Takahama
EPFL
aerosol scienceaerosol technologyatmospheric chemistrymachine learningspectroscopy