Reinforcement Learning for Micro-Level Claims Reserving

📅 2026-01-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses a key limitation in traditional reserve models, which typically rely on settled claims for one-time predictions and neglect the dynamic evolution of open claims, often resulting in sample loss and selection bias. To overcome this, the paper proposes a novel approach that formulates individual claim reserving as a Markov decision process and employs a reinforcement learning agent to continuously optimize reserve estimates throughout the entire claim lifecycle. Built upon the Soft Actor-Critic algorithm, the method leverages a continuous action space, rolling settlement-based hyperparameter tuning, and importance weighting to balance predictive accuracy with revision stability. Experiments on the CAS and SPLICE synthetic datasets demonstrate that the proposed framework achieves competitive performance in both individual claim prediction and aggregate reserve estimation, with particularly strong results in the early, immature stages of claims.
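To make the MDP framing concrete, the sketch below shows one plausible reward design of the kind the summary describes: a squared-error accuracy term plus a penalty on period-to-period reserve revisions. The function name, the quadratic form, and the weight `lam` are illustrative assumptions, not the paper's actual specification.

```python
def reward(ocl_estimate, ocl_target, prev_estimate, lam=0.1):
    """Hypothetical per-step reward for a reserving agent (illustrative only).

    ocl_estimate : the agent's current OCL estimate (the continuous action)
    ocl_target   : the observed outstanding liability at this development step
    prev_estimate: the estimate issued in the previous development period
    lam          : assumed weight trading accuracy against revision stability
    """
    accuracy_penalty = (ocl_estimate - ocl_target) ** 2
    revision_penalty = lam * (ocl_estimate - prev_estimate) ** 2
    return -(accuracy_penalty + revision_penalty)

# One claim trajectory: the agent revises its OCL estimate each period,
# and large swings between successive estimates are penalised.
estimates = [120.0, 100.0, 95.0]
targets = [110.0, 98.0, 95.0]
prevs = [estimates[0]] + estimates[:-1]
rewards = [reward(e, t, p) for e, t, p in zip(estimates, targets, prevs)]
```

Under this shape, an agent that nails the target and stops revising earns zero penalty, which matches the summary's stated goal of balancing predictive accuracy with revision stability.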

📝 Abstract
Outstanding claim liabilities are revised repeatedly as claims develop, yet most modern reserving models are trained as one-shot predictors and typically learn only from settled claims. We formulate individual claims reserving as a claim-level Markov decision process in which an agent sequentially updates outstanding claim liability (OCL) estimates over development, using continuous actions and a reward design that balances accuracy with stable reserve revisions. A key advantage of this reinforcement learning (RL) approach is that it can learn from all observed claim trajectories, including claims that remain open at valuation, thereby avoiding the reduced sample size and selection effects inherent in supervised methods trained on ultimate outcomes only. We also introduce practical components needed for actuarial use -- initialisation of new claims, temporally consistent tuning via a rolling-settlement scheme, and an importance-weighting mechanism to mitigate portfolio-level underestimation driven by the rarity of large claims. On CAS and SPLICE synthetic general insurance datasets, the proposed Soft Actor-Critic implementation delivers competitive claim-level accuracy and strong aggregate OCL performance, particularly for the immature claim segments that drive most of the liability.
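The abstract mentions an importance-weighting mechanism to counter portfolio-level underestimation caused by rare large claims. A minimal sketch of one such scheme, assuming a simple quantile threshold and a fixed boost factor (both hypothetical; the paper's actual weighting may differ):

```python
import numpy as np

def importance_weights(claim_sizes, large_quantile=0.9, boost=5.0):
    """Upweight claims above an assumed size threshold (illustrative sketch).

    Claims at or above the `large_quantile` empirical quantile receive
    `boost` times the base weight; weights are normalised to mean 1 so the
    overall loss scale is unchanged.
    """
    sizes = np.asarray(claim_sizes, dtype=float)
    threshold = np.quantile(sizes, large_quantile)
    w = np.where(sizes >= threshold, boost, 1.0)
    return w / w.mean()

# Example: the largest claim in a small portfolio gets a higher weight.
weights = importance_weights(range(1, 11))
```

The design choice here is the usual one for rare-event reweighting: amplify the gradient contribution of large claims during training without changing the average loss magnitude.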
Problem

Research questions and friction points this paper is trying to address.

claims reserving
outstanding claim liabilities
reinforcement learning
Markov decision process
actuarial prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning
Claims Reserving
Markov Decision Process
Soft Actor-Critic
Importance Weighting
Benjamin Avanzi
Professor of Actuarial Studies, University of Melbourne
Actuarial Science, Risk Theory, Dependence modelling, Pensions, Risk Modelling in Operations Management
Ronald Richman
insureAI and University of the Witwatersrand
Bernard Wong
University of New South Wales
Actuarial Science
Mario V. Wüthrich
Department of Mathematics, ETH Zurich
Yagebu Xie
Centre for Actuarial Studies, Department of Economics, The University of Melbourne