Digitally enriching a screening population for pancreatic cancer using routine blood-based measures and clinical histories

📅 2026-05-28
🤖 AI Summary
Pancreatic cancer has no viable screening program, and most patients are diagnosed too late for curative treatment. This study trains a custom Transformer-based neural network with multi-head attention on longitudinal sequences of coded diagnoses and blood test values to predict an individual's risk of pancreatic cancer 1 to 3 years before diagnosis and to risk-stratify populations for targeted screening. A Bayesian update for population prevalence makes the estimated risks transportable across settings. In leave-one-site-out external validation, mean AUCs were 0.837, 0.797, and 0.760 for prediction 1, 2, and 3 years before diagnosis, respectively, and the estimated risks were well calibrated (calibration slope 1.08; Brier score 0.025). A screening threshold of >3.3% 1-year risk gave a diagnostic odds ratio of 18.2, laying the groundwork for population-level digital enrichment of pancreatic cancer screening.
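The diagnostic odds ratio quoted in the summary is the odds of a positive screen among people who develop the disease divided by the odds of a positive screen among those who do not. The snippet below is a minimal sketch of that standard definition, not code from the study; the function names and the example counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def diagnostic_odds_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """Odds of a positive screen among cases divided by the odds among controls."""
    if min(tp, fp, fn, tn) <= 0:
        # A zero cell makes the ratio zero or undefined; corrections for that
        # case are outside the scope of this sketch.
        raise ValueError("all four cells must be positive for a finite ratio")
    return (tp * tn) / (fp * fn)


def dor_from_rates(sensitivity: float, specificity: float) -> float:
    """Same quantity expressed through sensitivity and specificity."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly in (0, 1)")
    return (sensitivity / (1.0 - sensitivity)) * (specificity / (1.0 - specificity))


# Hypothetical counts for illustration only; these are NOT results from the study.
example_dor = diagnostic_odds_ratio(tp=90, fp=10, fn=10, tn=90)
```

Both forms give the same value for the same confusion matrix, so a reported diagnostic odds ratio can be cross-checked against reported sensitivity and specificity when those are available.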
📝 Abstract
Earlier detection of pancreatic cancer is key to enabling wider access to curative treatment and reducing cancer deaths; however, screening is presently not viable. Latent indicators of pathology are evident in an individual's disease and blood test trajectories and may predict the development of pancreatic cancer. Longitudinal sequences of coded diagnoses and blood test values accrued by patients throughout their clinical interactions were used to train a custom Transformer-based neural network with a multi-head attention mechanism to predict risk of pancreatic cancer with a multi-year lead time and risk-stratify populations for targeted screening. The cohort comprised 6,017 adults with pancreatic cancer and 177,081 controls (overall median age 75, 45% female) with a median of 12 years (interquartile range 6.9-16.2) of medical history prior to pancreatic cancer diagnosis. External validation via leave-one-site-out, out-of-sample testing predicting pancreatic cancer 1, 2, and 3 years prior to diagnosis demonstrated mean area under the receiver operating characteristic curve of 0.837 (95% confidence interval 0.827-0.848), 0.797 (95% confidence interval 0.782-0.813), and 0.760 (95% confidence interval 0.745-0.776), respectively. Estimated pancreatic cancer risks were well calibrated (calibration plot slope 1.08, intercept -0.077; Brier score 0.025), and a Bayesian population pancreatic cancer prevalence update allows estimated cancer risk outputs to be transportable across settings. At testing, a screening threshold of >3.3% risk of pancreatic cancer within 1 year offered a diagnostic odds ratio of 18.2. Our work therefore lays the foundation for a first population-level digital enrichment tool to widen access to curative-intent management of pancreatic cancer.
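The abstract does not spell out how the Bayesian prevalence update is computed. One common way to carry a predicted risk to a setting with a different prevalence is the prior-shift adjustment, which rescales the predicted odds by the ratio of target to training prior odds. The sketch below illustrates that general technique under stated assumptions and may differ from the authors' method: the function name is hypothetical, the training prevalence is assumed to equal the cohort case fraction given in the abstract, and the target prevalence is a made-up value.

```python
def adjust_risk_for_prevalence(
    risk: float, train_prevalence: float, target_prevalence: float
) -> float:
    """Rescale a predicted risk from the training prevalence to a target prevalence.

    Standard prior-shift adjustment: multiply the predicted odds by the ratio of
    target prior odds to training prior odds, then convert back to a probability.
    """
    for name, value in (
        ("risk", risk),
        ("train_prevalence", train_prevalence),
        ("target_prevalence", target_prevalence),
    ):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    def odds(p: float) -> float:
        return p / (1.0 - p)

    adjusted_odds = odds(risk) * odds(target_prevalence) / odds(train_prevalence)
    return adjusted_odds / (1.0 + adjusted_odds)


# Cohort counts from the abstract: 6,017 cases and 177,081 controls.
# ASSUMPTION: the model's effective training prevalence equals this case fraction.
COHORT_PREVALENCE = 6017 / (6017 + 177081)

# HYPOTHETICAL target prevalence for a deployment population (not from the study).
TARGET_PREVALENCE = 0.001

adjusted = adjust_risk_for_prevalence(0.10, COHORT_PREVALENCE, TARGET_PREVALENCE)
```

Because the adjustment only shifts the odds by a constant factor, it preserves the ranking of individuals by risk and changes only the absolute risk values against which a screening threshold would be applied.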
Problem

Research questions and friction points this paper is trying to address.

pancreatic cancer
early detection
screening
risk stratification
digital enrichment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based neural network
longitudinal clinical data
pancreatic cancer early detection
digital enrichment
risk stratification
Chris Varghese
Department of Surgery, Mayo Clinic, Rochester, MN, USA
Leo Y. Li-Han
Department of Surgery, Mayo Clinic, Rochester, MN, USA
Richa Bisht
Department of Surgery, Mayo Clinic, Rochester, MN, USA
Ellen Larson
Department of Surgery, Mayo Clinic, Rochester, MN, USA
Frank Lee
Department of Surgery, Mayo Clinic, Rochester, MN, USA
Ryan M. Carr
Department of Surgery, Mayo Clinic, Rochester, MN, USA
Tanios S. Bekaii-Saab
Department of Hematology and Oncology, Mayo Clinic, Phoenix, AZ, USA
Shounak Majumder
Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
John D. Halamka
Mayo Clinic Platform, Mayo Clinic, Rochester, Minnesota
Mark Truty
Department of Surgery, Mayo Clinic, Rochester, MN, USA
Ajit H. Goenka
Department of Radiology, Mayo Clinic, Rochester, Minnesota
Hojjat Salehinejad
Mayo Clinic | University of Toronto
Machine Learning, Statistical Signal Processing, Wireless Sensing, AI in Healthcare
Cornelius A. Thiels
Department of Surgery, Mayo Clinic, Rochester, MN, USA