🤖 AI Summary
This paper addresses reinforcement learning for controlled diffusion processes with unbounded continuous state spaces, bounded continuous action spaces, and polynomially growing rewards, a setting prevalent in finance and operations research. We propose the first model-based algorithm with adaptive joint state-action discretization designed for unbounded domains. The method introduces a novel notion of “zooming dimension” tailored to unbounded diffusions, together with a dynamic discretization mechanism driven by statistical confidence intervals, enabling asymptotic refinement under an optimal bias-variance trade-off. We establish the first provably tight regret bound for unbounded diffusion control, depending explicitly on the time horizon, state dimension, reward growth order, and zooming dimension, and naturally subsuming the bounded-domain case. Empirical evaluation on high-dimensional financial tasks, including multi-asset mean-variance portfolio optimization, demonstrates the efficiency and robustness of the proposed approach.
📝 Abstract
We study reinforcement learning for controlled diffusion processes with unbounded continuous state spaces, bounded continuous actions, and polynomially growing rewards: settings that arise naturally in finance, economics, and operations research. To overcome the challenges of continuous and high-dimensional domains, we introduce a model-based algorithm that adaptively partitions the joint state-action space. The algorithm maintains estimators of drift, volatility, and rewards within each partition, refining the discretization whenever estimation bias exceeds statistical confidence. This adaptive scheme balances exploration and approximation, enabling efficient learning in unbounded domains. Our analysis establishes regret bounds that depend on the problem horizon, state dimension, reward growth order, and a newly defined notion of zooming dimension tailored to unbounded diffusion processes. The bounds recover existing results for bounded settings as a special case, while extending theoretical guarantees to a broader class of diffusion-type problems. Finally, we validate the effectiveness of our approach through numerical experiments, including applications to high-dimensional problems such as multi-asset mean-variance portfolio selection.
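The refinement rule described above, splitting a cell of the partition once statistical confidence within it outweighs discretization bias, can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Cell` class, the Hoeffding-style confidence radius, and the 1-D halving split are all hypothetical simplifications chosen for illustration (the actual method partitions the joint state-action space and estimates drift and volatility as well as rewards).

```python
import math

class Cell:
    """Illustrative cell of an adaptive partition (hypothetical simplification)."""

    def __init__(self, center, radius):
        self.center = center      # cell midpoint
        self.radius = radius      # half-width; proxy for discretization bias
        self.count = 0            # samples observed in this cell
        self.reward_sum = 0.0     # running sum for the reward estimator

    def update(self, reward):
        """Record one observed reward for a state-action pair in this cell."""
        self.count += 1
        self.reward_sum += reward

    def reward_estimate(self):
        return self.reward_sum / self.count if self.count else 0.0

    def confidence_width(self, t):
        """Hoeffding-style confidence radius; shrinks as the cell gathers data."""
        if self.count == 0:
            return float("inf")
        return math.sqrt(2.0 * math.log(max(t, 2)) / self.count)

    def should_refine(self, t):
        """Refine once statistical error no longer dominates discretization bias,
        i.e. the cell's own diameter has become the bottleneck."""
        return self.confidence_width(t) <= self.radius

    def split(self):
        """Halve the cell (1-D here; 2^d children in d dimensions)."""
        r = self.radius / 2.0
        return [Cell(self.center - r, r), Cell(self.center + r, r)]
```

Under this rule, cells that are visited often (hence estimated accurately) are refined first, so the partition zooms in on frequently visited regions while sparsely visited parts of the unbounded domain stay coarse.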