Entropy-regularized penalization schemes for American options and reflected BSDEs with singular generators

📅 2026-02-20
📈 Citations: 1
Influential: 1
🤖 AI Summary
This work addresses the continuous-time optimal stopping problem behind American option pricing and its associated reflected backward stochastic differential equation (RBSDE), where the optimal stopping rule is degenerate. The authors propose an entropy-regularized penalization scheme that yields a smooth approximation of this stopping rule, enabling gradient-based optimization and promoting policy exploration. Analyzing the limit as the penalization parameter tends to infinity, they show that the limiting value process solves an RBSDE whose driver has a logarithmic singularity, a class of equations that, to the authors' knowledge, has not been studied before. Theoretically, they prove well-posedness and convergence of the regularized scheme and establish existence and uniqueness of solutions for this class of singular RBSDEs via a monotone limit argument. Numerically, feasibility is illustrated through low-dimensional experiments based on policy iteration and least-squares Monte Carlo.
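For orientation, the two ingredients named in the summary can be written down in their textbook form. This is only a sketch: the symbols $\xi$, $f$, $L$, $n$, $\lambda$ and $H$ are assumptions chosen for illustration, and the paper's own scheme and notation may differ. The first display is the classical penalization of an RBSDE with lower obstacle $L$. The second is the standard identity showing how a binary-entropy bonus turns the non-smooth positive part into a smooth log-sum-exp function.

```latex
% Classical penalization (standard form, notation assumed for illustration):
\[
Y^n_t = \xi + \int_t^T f(s, Y^n_s, Z^n_s)\,ds
      + n \int_t^T (L_s - Y^n_s)^+\,ds
      - \int_t^T Z^n_s\,dW_s .
\]
% Entropy smoothing of x^+ = \sup_{p \in [0,1]} p\,x (a generic identity,
% not necessarily the regularizer used in the paper):
\[
\sup_{p \in [0,1]} \bigl\{ p\,x + \lambda H(p) \bigr\}
  = \lambda \log\bigl(1 + e^{x/\lambda}\bigr)
  \xrightarrow[\lambda \downarrow 0]{} x^+ ,
\qquad
H(p) = -p \log p - (1-p) \log(1-p).
\]
```

The supremum is attained at $p^* = 1/(1 + e^{-x/\lambda})$, a randomized rule strictly between 0 and 1, which is one way to see why entropy regularization promotes exploration and admits gradients.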

📝 Abstract
This paper extends our previous work in Chee et al. [9] to continuous-time optimal stopping problems, with a particular focus on American options within an exploratory framework. We pursue two main objectives. First, motivated by reinforcement learning applications, we introduce an entropy-regularized penalization scheme for continuous-time optimal stopping problems. The scheme is inspired by classical penalization techniques for reflected backward stochastic differential equations (RBSDEs) and provides a smooth approximation of the degenerate stopping rule inherent to the American option problem. This regularization promotes exploration, enables the use of gradient-based optimization methods, and leads naturally to policy improvement algorithms. We establish well-posedness and convergence properties of the scheme, and illustrate its numerical feasibility through low-dimensional experiments based on policy iteration and least-squares Monte Carlo methods. Second, from a theoretical perspective, we study the asymptotic limit of the entropy-regularized penalization as the penalization parameter tends to infinity. We show that the limiting value process solves a reflected BSDE with a logarithmically singular driver, and we prove existence and uniqueness of solutions to this new class of RBSDEs via a monotone limit argument. To the best of our knowledge, such equations have not previously been investigated in the literature.
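As a point of reference for the least-squares Monte Carlo ingredient mentioned in the abstract, the following is a minimal sketch of the standard (unregularized) least-squares Monte Carlo method for an American put. It is not the authors' entropy-regularized algorithm. The function name `lsm_american_put`, the geometric Brownian motion model, the quadratic regression basis, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def lsm_american_put(s0, strike, r, sigma, maturity,
                     n_steps=50, n_paths=20000, seed=0):
    """Standard least-squares Monte Carlo price of an American put
    (hypothetical helper; exercise is allowed on the time grid only)."""
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    disc = np.exp(-r * dt)

    # Simulate geometric Brownian motion; column t holds S at time (t + 1) * dt.
    z = rng.standard_normal((n_paths, n_steps))
    log_inc = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    paths = s0 * np.exp(np.cumsum(log_inc, axis=1))

    # Cash flow along each path, initialised with the payoff at maturity.
    cashflow = np.maximum(strike - paths[:, -1], 0.0)

    # Backward induction over the earlier exercise dates.
    for t in range(n_steps - 2, -1, -1):
        cashflow *= disc  # discount future cash flows back to the current date
        s = paths[:, t]
        exercise = np.maximum(strike - s, 0.0)
        itm = exercise > 0  # regress on in-the-money paths only
        if itm.sum() > 3:
            # Quadratic regression of discounted cash flows on the asset price
            # approximates the continuation value.
            coeffs = np.polyfit(s[itm], cashflow[itm], deg=2)
            continuation = np.polyval(coeffs, s[itm])
            stop = exercise[itm] > continuation
            idx = np.flatnonzero(itm)[stop]
            cashflow[idx] = exercise[idx]  # hard (0/1) stopping decision

    # Discount from the first grid date back to time 0 and average.
    return disc * cashflow.mean()


if __name__ == "__main__":
    print(lsm_american_put(s0=36.0, strike=40.0, r=0.06, sigma=0.2, maturity=1.0))
```

The line `stop = exercise[itm] > continuation` is the hard 0/1 stopping decision. The entropy-regularized scheme described in the abstract replaces this kind of degenerate rule with a smooth one, which is what makes gradient-based policy improvement possible.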
Problem

Research questions and friction points this paper is trying to address.

American options
optimal stopping
reflected BSDEs
singular generators
entropy regularization
Innovation

Methods, ideas, or system contributions that make the work stand out.

entropy regularization
American options
reflected BSDEs
singular generators
optimal stopping