🤖 AI Summary
Large language models struggle to model structured longitudinal electronic health records (EHRs), while EHR foundation models learn predictive patient representations but lack interpretable language-based reasoning. To address this gap, this work proposes ChatHealthAI, a multimodal reasoning framework that aligns a pretrained EHR foundation model with a frozen large language model. A task-aware resampler maps structured patient representations into the language model’s semantic space, and refined clinical event descriptions are supplied alongside them, enabling clinically grounded reasoning in natural language. On three prediction tasks from the EHRSHOT benchmark, the method preserves competitive predictive performance while improving the quality and interpretability of its reasoning.
📝 Abstract
Large language models (LLMs) exhibit strong natural-language reasoning abilities for clinical decision support, but struggle to effectively model structured longitudinal electronic health records (EHRs). In contrast, EHR foundation models can learn predictive patient representations, yet lack interpretable language-based reasoning. To bridge this gap, we propose ChatHealthAI, a multimodal reasoning framework that aligns structured EHR representations from a pretrained EHR foundation model with the semantic space of a frozen LLM through a task-aware resampler. By integrating longitudinal patient representations with refined clinical event descriptions, ChatHealthAI enables clinically grounded natural-language reasoning while maintaining accurate patient prediction. We evaluated ChatHealthAI on three clinical predictive tasks from the EHRSHOT benchmark. Results show that ChatHealthAI improves reasoning quality and interpretability while preserving competitive predictive performance. These findings highlight the potential of integrating EHR foundation models with pretrained LLMs for interpretable clinical prediction.
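The abstract does not describe the internals of the task-aware resampler, so the following is only a minimal sketch of one plausible design, not the authors' implementation. It assumes a Perceiver-style cross-attention layer: a small set of latent queries, shifted by a task embedding, attends over the per-event embeddings produced by the EHR foundation model, and a linear projection maps the pooled result into the LLM's embedding dimension. Every name (`resample`, `latent_queries`, `projection`), the additive task conditioning, the toy dimensions, and the random weights standing in for trained parameters are hypothetical. The code uses plain Python so that it runs without dependencies; a real system would use a tensor library.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Stand-in for learned parameters (hypothetical; real weights are trained)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def resample(event_embeddings, task_embedding, latent_queries, projection):
    """Map T event embeddings to K soft tokens in the LLM embedding space.

    event_embeddings: T x d_ehr, assumed output of the EHR foundation model
    task_embedding:   d_ehr vector identifying the prediction task (assumed)
    latent_queries:   K x d_ehr learned queries (assumed)
    projection:       d_ehr x d_llm linear map into the LLM space (assumed)
    """
    d = len(task_embedding)
    # Condition each latent query on the task (additive conditioning is an assumption).
    queries = [[q + t for q, t in zip(row, task_embedding)] for row in latent_queries]
    # Scaled dot-product scores between queries and events: K x T.
    keys_t = [list(col) for col in zip(*event_embeddings)]
    scores = matmul(queries, keys_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    # Attention-weighted sum of event embeddings: K x d_ehr.
    pooled = matmul(weights, event_embeddings)
    # Project into the LLM embedding dimension: K x d_llm.
    return matmul(pooled, projection), weights


rng = random.Random(0)
T, D_EHR, D_LLM, K = 6, 8, 16, 4  # toy sizes chosen only for illustration
events = random_matrix(T, D_EHR, rng, scale=1.0)
task = random_matrix(1, D_EHR, rng)[0]
latents = random_matrix(K, D_EHR, rng)
proj = random_matrix(D_EHR, D_LLM, rng)

soft_tokens, attn = resample(events, task, latents, proj)
print(len(soft_tokens), len(soft_tokens[0]))  # prints: 4 16
```

Under these assumptions, the `K` output rows would act as soft tokens placed in front of the embedded text prompt (for example, the clinical event descriptions), so the frozen LLM receives a fixed-length summary of the patient record regardless of how many events it contains. Only the resampler's parameters would need training in such a setup.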