Evolving Features vs Evolving Entire Trees with GP for Interpretable Survival Analysis

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses two limitations of conventional survival trees. They often need to grow large to capture complex relationships, which harms interpretability, and they are typically built with greedy splitting strategies that can miss globally better split combinations, which limits predictive performance. The work uses genetic programming to multi-objectively evolve inspectable, higher-order non-linear features and studies how these features interact with different tree induction strategies. It also introduces an evolutionary approach that jointly optimises the survival tree structure and the non-linear split logic. On two real-world datasets and two tree depths, the evolved features improve predictive performance across the tree induction strategies studied, and full joint evolution shows the highest overall potential to yield multiple shallow, inherently inspectable survival trees with good performance.

📝 Abstract

Survival analysis concerns the task of predicting the time until an event occurs. Often used in the medical field, survival analysis deals with incomplete (i.e., censored) data, for instance, from patients who did not experience the event during the duration of the study. For practical use, both accuracy and interpretability are important. Survival trees are easy-to-follow survival models that split the patient cohort recursively into discrete patient groups. Whilst survival trees can capture complex relationships, they typically need to grow large, threatening interpretability. Moreover, survival trees are often built using greedy approaches that may overlook globally optimal split combinations, limiting predictive performance. Shallow survival trees require expressive, higher-order feature combinations to achieve competitive accuracy. We therefore use genetic programming to multi-objectively evolve inherently inspectable feature sets and study how they interact with different tree induction strategies. We further introduce an evolutionary approach that jointly optimises the survival tree structure and the non-linear split logic. Our findings demonstrate that evolutionary feature construction improves predictive performance across different tree induction strategies on two real-world datasets and two different survival tree depths. Full joint evolution has the overall highest potential to propose multiple inherently inspectable shallow survival trees of good performance.
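
As a rough illustration of why censoring and constructed features matter, the sketch below scores candidate features with Harrell's concordance index on a tiny made-up cohort. The abstract does not name an evaluation metric, so the use of the C-index is an assumption, as are the function name `concordance_index`, the toy data, and the hand-written product feature `x1 * x2`, which stands in for an expression that genetic programming might evolve. Tied event times are skipped for simplicity.

```python
from itertools import combinations


def concordance_index(times, events, risk):
    """Harrell's C-index (simplified): share of comparable pairs ranked correctly.

    A pair is comparable only if the subject with the shorter observed time
    actually experienced the event; if that subject was censored, we cannot
    know which of the two had the event first.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore tied observed times
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject was censored: pair is not comparable
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0  # higher risk, earlier event: correct ordering
        elif risk[a] == risk[b]:
            concordant += 0.5  # tied risk scores count as half
    return concordant / comparable if comparable else float("nan")


# Hypothetical toy cohort: observed time, event flag (1 = event, 0 = censored).
times = [2, 4, 5, 7, 9, 12]
events = [1, 1, 0, 1, 0, 1]

# Two hypothetical raw features and one constructed higher-order feature.
x1 = [3, 1, 4, 1, 2, 1]
x2 = [2, 5, 1, 3, 1, 1]
product = [a * b for a, b in zip(x1, x2)]  # stand-in for a GP-evolved expression

for name, feature in [("x1", x1), ("x2", x2), ("x1*x2", product)]:
    print(name, round(concordance_index(times, events, feature), 3))
```

On this toy data the product feature orders the comparable pairs better than either raw feature alone, which is the kind of gain a shallow tree could exploit with a single split. Real experiments would use a proper survival library and held-out data rather than this hand-built example.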

Problem

Research questions and friction points this paper is trying to address.

survival analysis

interpretability

survival trees

feature construction

predictive performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Genetic Programming

Survival Analysis

Interpretable Machine Learning