Research Paper | Researchia:202603.13028 [Mathematics > Mathematics]

Operator Splitting, Policy Iteration, and Machine Learning for Stochastic Optimal Control

Alain Bensoussan

Abstract

We propose a splitting approach to solve the second-order Hamilton--Jacobi equation, reducing it to a heat step and a purely first-order step. The latter is implemented using a gradient value policy iteration algorithm, enabling efficient characteristic-based machine learning methods. We establish convergence rates for the splitting method. In particular, the $L^\infty$ error is bounded below by $\mathcal{O}(h)$ and above by $\mathcal{O}(h^{1/7})$ for Lipschitz initial data; this improves to $\mathcal{O}(h^{1/5})$ for semiconcave data and to $\mathcal{O}(h^{1/3})$ for $C^2$ data. We also prove an upper $L^1$ error estimate of order $\mathcal{O}(h^{1/2})$ in the periodic setting, where $h$ is the splitting step. For the first-order step, we provide a weighted $L^2$ error analysis that shows exponential convergence. Each iteration solves linear characteristic equations and learns the value function by minimizing a weighted value gradient loss. The approach yields stable and accurate numerical results.
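The splitting idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's setup: the model equation $u_t + \tfrac{1}{2} u_x^2 = \nu u_{xx}$ on a periodic 1D domain, the order of the two sub-steps, and the function names `heat_step`, `first_order_step`, and `splitting_solve`, which are hypothetical. The first-order step below uses a standard Lax-Friedrichs finite-difference scheme as a stand-in; it is not the gradient value policy iteration or the machine learning method the abstract describes.

```python
import numpy as np

def heat_step(u, h, nu, length):
    """Solve u_t = nu * u_xx exactly over time h on a periodic grid via FFT."""
    n = u.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)  # angular wavenumbers
    return np.real(np.fft.ifft(np.exp(-nu * k**2 * h) * np.fft.fft(u)))

def first_order_step(u, h, dx, cfl=0.5):
    """Advance u_t + 0.5 * u_x**2 = 0 over time h with Lax-Friedrichs.

    Stand-in for the paper's policy-iteration step (assumption).
    Sub-steps are chosen adaptively to respect the CFL condition.
    """
    t = 0.0
    while t < h - 1e-14:
        up = (np.roll(u, -1) - u) / dx  # forward difference
        um = (u - np.roll(u, 1)) / dx   # backward difference
        # Bound on |H'(p)| = |p| for H(p) = 0.5 * p**2.
        alpha = max(np.abs(up).max(), np.abs(um).max(), 1e-12)
        dt = min(cfl * dx / alpha, h - t)
        p = 0.5 * (up + um)
        u = u - dt * (0.5 * p**2 - 0.5 * alpha * (up - um))
        t += dt
    return u

def splitting_solve(u0, T, h, nu, length):
    """Alternate a heat step and a first-order step, each of size h."""
    dx = length / u0.size
    u = u0.copy()
    for _ in range(int(round(T / h))):
        u = heat_step(u, h, nu, length)
        u = first_order_step(u, h, dx)
    return u

# Illustrative run with smooth periodic initial data (all values are assumptions).
length = 2.0 * np.pi
x = np.linspace(0.0, length, 256, endpoint=False)
u0 = np.sin(x)
u = splitting_solve(u0, T=0.5, h=0.05, nu=0.5, length=length)
```

For this assumed quadratic Hamiltonian, the Cole-Hopf substitution $w = e^{-u/(2\nu)}$ turns the full equation into the heat equation $w_t = \nu w_{xx}$, so `heat_step` applied to $w$ gives a reference solution against which the splitting error can be measured as `h` varies.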


Source: arXiv:2603.12167v1 (http://arxiv.org/abs/2603.12167v1); PDF: https://arxiv.org/pdf/2603.12167v1

Submission: 3/13/2026
Subjects: Mathematics