Incorporating data drift to perform survival analysis on credit risk

📅 2026-01-28
🤖 AI Summary
This study addresses a limitation of traditional credit risk survival models: they assume a stationary data-generating process and therefore adapt poorly to distributional shifts caused by changes in borrower behavior, macroeconomic conditions, or policy regimes. The authors propose a dynamic joint modelling framework that integrates a longitudinal behavioral marker derived from balance dynamics with a discrete-time hazard model, combined with landmark one-hot encoding and isotonic calibration. Three types of drift (sudden, incremental, and recurring) are simulated on Freddie Mac mortgage loan data. The authors report that the proposed model outperforms classical survival models, tree-based drift-adaptive learners, and gradient boosting methods in both discrimination and calibration across all drift scenarios.
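
To make two of the ingredients named above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It shows how discrete-time hazards turn into a survival curve, and how isotonic calibration can be fitted with the pool-adjacent-violators algorithm. The function names (`survival_from_hazards`, `fit_isotonic`, `calibrate`) and the toy scores are hypothetical, and tied scores are not given any special treatment.

```python
def survival_from_hazards(hazards):
    """Discrete-time survival: S(t) is the product of (1 - h_k) for k <= t."""
    surv, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        surv.append(s)
    return surv


def fit_isotonic(scores, outcomes):
    """Pool-adjacent-violators fit of a non-decreasing step function.

    Returns a list of (upper_score, calibrated_value) pairs, one per block.
    """
    blocks = []  # each block holds [sum of outcomes, count, largest score]
    for x, y in sorted(zip(scores, outcomes)):
        blocks.append([float(y), 1, x])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def calibrate(score, fitted):
    """Return the value of the first block whose upper score covers `score`."""
    for upper, value in fitted:
        if score <= upper:
            return value
    return fitted[-1][1]  # scores above the training range get the top value


# Toy data (made up for illustration): raw model scores and default flags.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
defaults = [0, 1, 0, 0, 1, 1]
fitted = fit_isotonic(raw_scores, defaults)

# Calibrate per-period hazards, then convert them to a survival curve.
calibrated_hazards = [calibrate(s, fitted) for s in (0.05, 0.25, 0.35)]
survival = survival_from_hazards(calibrated_hazards)
```

The step function returned by `fit_isotonic` only reorders nothing and pools neighbours, so the ranking of loans is preserved up to ties while the predicted probabilities are pulled toward observed default rates.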

๐Ÿ“ Abstract
Survival analysis has become a standard approach for modelling time to default with time-varying covariates in credit risk. Most existing methods implicitly assume a stationary data-generating process. In practice, however, mortgage portfolios are exposed to various forms of data drift caused by changing borrower behaviour, macroeconomic conditions and policy regimes. This study investigates the impact of data drift on survival-based credit risk models and proposes a dynamic joint modelling framework to improve robustness in non-stationary environments. The proposed model integrates a longitudinal behavioural marker derived from balance dynamics with a discrete-time hazard formulation, combined with landmark one-hot encoding and isotonic calibration. Three types of data drift (sudden, incremental and recurring) are simulated and analysed on mortgage loan datasets from Freddie Mac. Experiments show that the proposed landmark-based joint model consistently outperforms classical survival models, tree-based drift-adaptive learners and gradient boosting methods in terms of discrimination and calibration across all drift scenarios, which supports the proposed model design.
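
The abstract does not specify how the three drift types are simulated. The following plain-Python sketch shows one common way to express them as a time-dependent shift added to a covariate. The helper `drift_shift`, its parameters (`magnitude`, `change_point`, `period`) and the synthetic covariate are all assumptions made for illustration, not the paper's procedure.

```python
import random


def drift_shift(n_periods, kind, magnitude=1.0, change_point=None, period=12):
    """Per-period shift for one drift type (hypothetical helper)."""
    if kind not in ("sudden", "incremental", "recurring"):
        raise ValueError(f"unknown drift kind: {kind}")
    cp = n_periods // 2 if change_point is None else change_point
    shifts = []
    for t in range(n_periods):
        if kind == "sudden":
            # Step change: the full shift applies from the change point onward.
            shifts.append(magnitude if t >= cp else 0.0)
        elif kind == "incremental":
            # Linear ramp from 0 at the change point to `magnitude` at the end.
            ramp = (t - cp) / max(n_periods - 1 - cp, 1)
            shifts.append(magnitude * min(max(ramp, 0.0), 1.0))
        else:
            # Recurring: the shifted regime switches on and off every `period`.
            shifts.append(magnitude if (t // period) % 2 == 1 else 0.0)
    return shifts


# Synthetic monthly covariate (made up), drifted in three different ways.
rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(48)]
drifted = {
    kind: [x + s for x, s in zip(base, drift_shift(48, kind))]
    for kind in ("sudden", "incremental", "recurring")
}
```

Keeping the drift as an additive shift separate from the base series makes it easy to compare a model's discrimination and calibration on the same loans with and without each drift type.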
Problem

Research questions and friction points this paper is trying to address.

data drift
survival analysis
credit risk
non-stationary environment
model robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

data drift
survival analysis
dynamic joint modelling
landmark encoding
credit risk
Jianwei Peng
Humboldt-Universität zu Berlin, School of Business and Economics, Spandauer Str. 1, 10178 Berlin
Stefan Lessmann
Professor of Information Systems, Humboldt-University of Berlin
Machine Learning & AI, Credit Scoring, Marketing Analytics, NLP, xAI