MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabular Prediction

📅 2026-02-10

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This work addresses the disconnect between existing large language model (LLM)-based feature engineering approaches for clinical tabular data and the downstream predictive models that consume the features. Those models must cope with class imbalance, heterogeneous features, and the stringent interpretability requirements of healthcare data. To bridge this gap, the authors propose MedFeat, which they present as the first framework to integrate model-awareness and interpretability-driven mechanisms into LLM-assisted feature engineering. MedFeat iteratively generates new features by combining feature importance signals with real-time model feedback, so that the features align with the inductive biases of the downstream model and remain interpretable. Evaluated across multiple real-world clinical prediction tasks, MedFeat significantly outperforms current methods, achieving an average performance improvement exceeding 10%, while maintaining compatibility with diverse model architectures.

📝 Abstract

In healthcare tabular predictions, classical models with feature engineering often outperform neural approaches. Recent advances in Large Language Models enable the integration of domain knowledge into feature engineering, offering a promising direction. However, existing approaches typically rely on a broad search over predefined transformations, overlooking downstream model characteristics and feature importance signals. We present MedFeat, a feedback-driven and model-aware feature engineering framework that leverages LLM reasoning with domain knowledge and provides feature explanations based on SHAP values while tracking successful and failed proposals to guide feature discovery. By incorporating model awareness, MedFeat prioritizes informative signals that are difficult for the downstream model to learn directly due to its characteristics. Across a broad range of clinical prediction tasks, MedFeat achieves stable improvements over various baselines and discovers clinically meaningful features that generalize under distribution shift, demonstrating robustness across years and from ICU cohorts to general hospitalized patients, thereby offering insights into real-world deployment. Code required to reproduce our experiments will be released, subject to dataset agreements and institutional policies.
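The feedback loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `propose` stub stands in for the LLM, a one-feature threshold classifier (`stump_accuracy`) stands in for the downstream model, the toy data and the names `Proposal`, `History`, and `feedback_loop` are hypothetical, and SHAP-based explanations are omitted. The sketch only shows how a record of accepted and rejected proposals can steer the next proposal.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Row = Dict[str, float]


@dataclass
class Proposal:
    name: str
    fn: Callable[[Row], float]  # computes the new feature from one row


@dataclass
class History:
    accepted: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)


def stump_accuracy(rows: List[Row], labels: List[int]) -> float:
    """Toy downstream model: best accuracy of any single-feature threshold rule."""
    best = 0.0
    for feat in rows[0]:
        values = [r[feat] for r in rows]
        for t in set(values):
            # Rule "predict 1 if value <= t"; the flipped rule scores n - hits.
            hits = sum((v <= t) == bool(y) for v, y in zip(values, labels))
            best = max(best, max(hits, len(rows) - hits) / len(rows))
    return best


# Hypothetical candidates; in an LLM-driven setup these would be generated.
CANDIDATES = [
    Proposal("weight_plus_height", lambda r: r["weight"] + r["height"]),
    Proposal("bmi", lambda r: r["weight"] / r["height"] ** 2),
]


def propose(history: History) -> Optional[Proposal]:
    """Stub proposer: returns the next candidate that has not been tried yet."""
    tried = set(history.accepted) | set(history.rejected)
    for cand in CANDIDATES:
        if cand.name not in tried:
            return cand
    return None


def feedback_loop(rows, labels, propose, score, rounds=3, min_gain=1e-6):
    """Keep a proposed feature only if it improves the downstream score."""
    history = History()
    best = score(rows, labels)
    for _ in range(rounds):
        proposal = propose(history)
        if proposal is None:
            break
        trial = [{**r, proposal.name: proposal.fn(r)} for r in rows]
        new_score = score(trial, labels)
        if new_score - best > min_gain:
            rows, best = trial, new_score
            history.accepted.append(proposal.name)
        else:
            history.rejected.append(proposal.name)
    return rows, best, history


# Toy data: the label is 1 when weight / height**2 is at least 25.
rows = [
    {"weight": 60.0, "height": 1.50},
    {"weight": 60.0, "height": 1.80},
    {"weight": 90.0, "height": 1.80},
    {"weight": 90.0, "height": 2.00},
    {"weight": 75.0, "height": 1.60},
    {"weight": 75.0, "height": 1.90},
]
labels = [1, 0, 1, 0, 1, 0]

rows_out, best, history = feedback_loop(rows, labels, propose, stump_accuracy)
print(history.accepted, history.rejected, best)
```

In this toy setup the threshold rule cannot express a ratio of two columns, so the additive candidate is rejected while the ratio is accepted. This loosely mirrors the idea of prioritizing signals that the downstream model cannot learn directly.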

Problem

Research questions and friction points this paper is trying to address.

clinical tabular prediction

feature engineering

model-awareness

explainability

large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

model-aware feature engineering

explainability-driven

LLM-guided feature discovery