🤖 AI Summary
Existing tabular foundation models cannot natively model the censored data that arises in survival analysis, which often leads to biased predictions. This work proposes SurvPFN, the first successful extension of Prior-data Fitted Networks (PFNs) to survival analysis. SurvPFN is pretrained on millions of synthetic survival tasks and frames the problem as continuous-time distributional regression that explicitly accounts for censoring. It requires no task-specific architecture, feature engineering, or dataset-specific fine-tuning. Its synthetic training data uses Weibull-distributed event times and a non-informative censoring mechanism, and it is trained with a novel censored negative log-likelihood loss. Evaluated on real-world datasets from SurvSet, SurvPFN matches or surpasses established classical and deep learning baselines in survival prediction.
📝 Abstract
Tabular foundation models (TFMs) have made rapid progress in standard classification and regression, but time-to-event survival prediction tasks have remained largely untouched. Unlike standard regression models, survival prediction models must account for censored data. Standard TFMs cannot natively handle censored data, which leads to biased and inaccurate predictions and makes them unsuitable for real-world applications. To overcome this fundamental limitation, we propose \texttt{SurvPFN}, a prior-data fitted network (PFN) for survival prediction tasks. We pretrain \texttt{SurvPFN} on millions of synthetic survival prediction tasks to learn survival via distributional regression that accounts for censored data. \texttt{SurvPFN} works by (1) generating data with Weibull event times and a non-informative censoring mechanism; (2) integrating a censored-event indicator; and (3) minimizing a censored negative log-likelihood. On SurvSet, a collection of real-world survival tasks, \texttt{SurvPFN} is highly competitive with classical and deep survival baselines without per-dataset fitting, a survival-specific architecture, or feature engineering. We show that survival can be treated as a continuous-time distributional regression problem with a censored loss, unlocking the power of PFNs for time-to-event predictions.
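
To make steps (1) and (3) concrete, the following is a minimal sketch of one synthetic task and a censored negative log-likelihood. It is illustrative only and not the \texttt{SurvPFN} implementation: the function names, the exponential censoring distribution, all parameter values, and the choice of a parametric Weibull likelihood are assumptions made for the example. The abstract does not specify how the model's predicted distribution is parameterized.

```python
import numpy as np


def simulate_survival(n, shape=1.5, scale=10.0, censor_scale=15.0, seed=0):
    """Draw one synthetic survival task (hypothetical helper, assumed values).

    Event times are Weibull(shape, scale). Censoring times are drawn
    independently of the event times (non-informative censoring); the
    exponential censoring distribution is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    event_time = scale * rng.weibull(shape, size=n)
    censor_time = rng.exponential(censor_scale, size=n)
    observed_time = np.minimum(event_time, censor_time)
    # event = 1 if the event was observed, 0 if the sample was censored
    event = (event_time <= censor_time).astype(float)
    return observed_time, event


def weibull_censored_nll(t, event, shape, scale):
    """Mean censored negative log-likelihood under a Weibull model.

    Observed events contribute the log-density log f(t); censored samples
    contribute the log-survival log S(t) = -(t / scale) ** shape.
    """
    z = (t / scale) ** shape
    log_pdf = np.log(shape / scale) + (shape - 1.0) * np.log(t / scale) - z
    log_surv = -z
    return -np.mean(event * log_pdf + (1.0 - event) * log_surv)


if __name__ == "__main__":
    t, e = simulate_survival(5000)
    print("fraction of observed events:", e.mean())
    print("NLL at generating parameters:", weibull_censored_nll(t, e, 1.5, 10.0))
    print("NLL at a mismatched scale:", weibull_censored_nll(t, e, 1.5, 30.0))
```

The key design point is that a censored sample is not discarded and is not treated as an event at its censoring time. It contributes only the probability of surviving past the observed time, which is what keeps the loss from being biased toward short event times.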