🤖 AI Summary
This paper addresses the cumulative regret incurred by a market maker in the ergodic Avellaneda–Stoikov model due to learning the price sensitivity parameter κ of liquidity takers. While conventional approaches achieve only an O(√T) regret bound, we establish, for the first time, a tight upper bound on the derivative of the ergodic constant—arising in the ergodic Hamilton–Jacobi–Bellman (HJB) equation—with respect to κ. Leveraging this bound, together with regularized maximum likelihood estimation and a concentration inequality for Bernoulli signals, we rigorously prove an O(ln² T) upper bound on expected regret, breaking the √T barrier. Our theoretical analysis integrates ergodic control theory and stochastic approximation. Numerical experiments confirm rapid convergence and robustness to parameter misspecification. This work provides the first logarithmic-regret guarantee for online learning in high-frequency market making.
📝 Abstract
We analyse the regret arising from learning the price sensitivity parameter $\kappa$ of liquidity takers in the ergodic version of the Avellaneda–Stoikov market making model. We show that a learning algorithm based on a regularised maximum-likelihood estimator for the parameter achieves a regret upper bound of order $\ln^2 T$ in expectation. To obtain this result we need two key ingredients. The first is a tight upper bound on the derivative of the ergodic constant in the Hamilton–Jacobi–Bellman (HJB) equation with respect to $\kappa$. The second is the learning rate of the maximum-likelihood estimator, which is obtained from concentration inequalities for Bernoulli signals. Numerical experiments confirm the convergence and the robustness of the proposed algorithm.
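To make the estimation step concrete, the sketch below simulates Bernoulli fill signals under the standard Avellaneda–Stoikov exponential fill intensity $\lambda(\delta) = A e^{-\kappa \delta}$ and recovers $\kappa$ by maximising a ridge-regularised Bernoulli log-likelihood. This is an illustration only: the parameter values, the ridge penalty, and the grid-search maximiser are our own assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical market parameters (illustrative, not taken from the paper).
A, kappa_true, dt = 140.0, 1.5, 0.005   # fill intensity scale, true kappa, time step
n = 20_000                              # number of quoting periods observed
deltas = rng.uniform(0.01, 0.05, size=n)            # posted quote depths
p_fill = np.clip(A * np.exp(-kappa_true * deltas) * dt, 0.0, 1.0)
fills = rng.binomial(1, p_fill)                      # Bernoulli fill/no-fill signals

def reg_log_lik(kappa, lam=1e-3):
    """Bernoulli log-likelihood of the fills under intensity A*exp(-kappa*delta),
    with a small ridge penalty on kappa (the regularisation form is an assumption)."""
    p = np.clip(A * np.exp(-kappa * deltas) * dt, 1e-12, 1 - 1e-12)
    ll = fills * np.log(p) + (1 - fills) * np.log(1 - p)
    return ll.sum() - lam * kappa**2

# Grid search is a crude stand-in for a proper optimiser; the likelihood is
# smooth in kappa, so with many observations the maximiser lands near kappa_true.
grid = np.linspace(0.1, 5.0, 491)
kappa_hat = grid[np.argmax([reg_log_lik(k) for k in grid])]
print(f"estimated kappa: {kappa_hat:.2f}")
```

In an online version of this procedure, the estimate would be refreshed as new fill observations arrive and the quotes $\delta$ would be set from the current estimate, which is the feedback loop whose regret the paper bounds.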