Covariance-Aware Transformers for Quadratic Programming and Decision Making

๐Ÿ“… 2026-02-16
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This work proposes Time2Decide, a framework that explicitly embeds covariance matrices into temporal foundation models to efficiently solve covariance-aware quadratic programming (QP) problems in an end-to-end manner. By leveraging a linear attention mechanism to emulate gradient descent, the framework solves unconstrained QP problems; it further extends to โ„“โ‚-regularized and constrained variants through a combination of multilayer perceptrons and feedback loops. Notably, this study provides the first theoretical proof that Transformers can solve quadratic programs via their attention mechanisms. Evaluated on temporal decision-making tasks such as portfolio optimization, Time2Decide significantly outperforms baseline temporal models and, in suitable settings, surpasses the conventional โ€œpredict-then-optimizeโ€ paradigm.

๐Ÿ“ Abstract
We explore the use of transformers for solving quadratic programs and how this capability benefits decision-making problems that involve covariance matrices. We first show that the linear attention mechanism can provably solve unconstrained QPs by tokenizing the matrix variables (e.g. $A$ of the objective $\frac{1}{2}x^\top Ax+b^\top x$) row-by-row and emulating gradient descent iterations. Furthermore, by incorporating MLPs, a transformer block can solve (i) $\ell_1$-penalized QPs by emulating iterative soft-thresholding and (ii) $\ell_1$-constrained QPs when equipped with an additional feedback loop. Our theory motivates us to introduce Time2Decide: a generic method that enhances a time series foundation model (TSFM) by explicitly feeding the covariance matrix between the variates. We empirically find that Time2Decide uniformly outperforms the base TSFM model for the classical portfolio optimization problem that admits an $\ell_1$-constrained QP formulation. Remarkably, Time2Decide also outperforms the classical "Predict-then-Optimize (PtO)" procedure, where we first forecast the returns and then explicitly solve a constrained QP, in suitable settings. Our results demonstrate that transformers benefit from explicit use of second-order statistics, and this can enable them to effectively solve complex decision-making problems, like portfolio construction, in one forward pass.
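To make the abstract's claim concrete, here is a minimal sketch (my own illustration, not the paper's code) of the two iteration schemes the transformer is proven to emulate: plain gradient descent for the unconstrained QP $\min_x \frac{1}{2}x^\top Ax + b^\top x$, and iterative soft-thresholding (ISTA) for its $\ell_1$-penalized variant. The matrix $A$, step size, and penalty weight below are arbitrary illustrative choices.

```python
import numpy as np

# Hypothetical sketch: the iterations that, per the paper, linear attention
# (plus an MLP for the soft-thresholding step) can express layer by layer.

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)        # symmetric positive definite objective matrix
b = rng.standard_normal(n)
eta = 1.0 / np.linalg.norm(A, 2)   # step size below 1/L, L = largest eigenvalue

# (i) Unconstrained QP: each gradient step x <- x - eta*(Ax + b) is the
# update the paper shows one linear-attention layer can emulate on row
# tokens of A.
x_gd = np.zeros(n)
for _ in range(500):
    x_gd = x_gd - eta * (A @ x_gd + b)   # gradient of 1/2 x^T A x + b^T x
x_star = -np.linalg.solve(A, b)          # closed-form optimum for comparison

# (ii) l1-penalized QP via iterative soft-thresholding (ISTA); the
# soft-threshold is the proximal operator an added MLP can implement.
lam = 0.1
def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

x_ista = np.zeros(n)
for _ in range(2000):
    x_ista = soft(x_ista - eta * (A @ x_ista + b), eta * lam)

print(np.allclose(x_gd, x_star, atol=1e-6))   # GD reaches the QP optimum
# x_ista is (numerically) a fixed point of the ISTA map, i.e. a lasso solution
print(np.allclose(x_ista, soft(x_ista - eta * (A @ x_ista + b), eta * lam)))
```

Stacking a fixed number of such steps as layers is what lets the model return an approximate QP solution in a single forward pass; the ℓ₁-constrained case additionally needs the feedback loop described in the abstract.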
Problem

Research questions and friction points this paper is trying to address.

quadratic programming
covariance matrix
decision making
transformers
portfolio optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Covariance-Aware Transformers
Quadratic Programming
Linear Attention
Time2Decide
Predict-then-Optimize
๐Ÿ”Ž Similar Papers
No similar papers found.