P-Transformer: A Prompt-based Multimodal Transformer Architecture For Medical Tabular Data

📅 2023-03-30
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
Existing medical tabular models often neglect unstructured clinical text and fail to fully exploit semantic information embedded in structured fields, limiting their predictive performance in electronic health record (EHR) applications. To address this, we propose the first prompt-driven multimodal tabular Transformer framework. Our method introduces a medical-knowledge-guided table cell semantic alignment mechanism to jointly encode numerical values, categorical labels, and free-text entries. It employs a two-stage Transformer architecture integrating BioBERT pre-trained representations, learnable medical prompt templates, hierarchical cell embeddings, and table-level modeling. Evaluated on three clinical prediction tasks across two real-world EHR datasets, our model achieves state-of-the-art performance: up to 10.9% reduction in RMSE and 11.0% reduction in MAE; improvements of 1.6% in balanced accuracy (BACC) and 0.8% in AUROC—significantly outperforming existing approaches.
📝 Abstract
Medical tabular data, abundant in Electronic Health Records (EHRs), is a valuable resource for diverse medical tasks such as risk prediction. While deep learning approaches, particularly transformer-based models, have shown remarkable performance in tabular data prediction, there are still problems remaining for existing work to be effectively adapted into medical domain, such as ignoring unstructured free-texts and underutilizing the textual information in structured data. To address these issues, we propose PTransformer, a underline{P}rompt-based multimodal underline{Transformer} architecture designed specifically for medical tabular data. This framework consists of two critical components: a tabular cell embedding generator and a tabular transformer. The former efficiently encodes diverse modalities from both structured and unstructured tabular data into a harmonized language semantic space with the help of pre-trained sentence encoder and medical prompts. The latter integrates cell representations to generate patient embeddings for various medical tasks. In comprehensive experiments on two real-world datasets for three medical tasks, PTransformer demonstrated the improvements with 10.9%/11.0% on RMSE/MAE, 0.5%/2.2% on RMSE/MAE, and 1.6%/0.8% on BACC/AUROC compared to state-of-the-art (SOTA) baselines in predictability.
Problem

Research questions and friction points this paper is trying to address.

Improves medical tabular data prediction accuracy
Integrates unstructured free-texts with structured data
Utilizes medical prompts for better textual information encoding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompt-based multimodal Transformer for medical data
Integrates structured and unstructured tabular data
Uses pre-trained encoder and medical prompts
🔎 Similar Papers
2024-01-02IEEE International Conference on Bioinformatics and BiomedicineCitations: 0
Y
Y. Ruan
Saw Swee Hock School of Public Health, National University of Singapore, Singapore
Xiang Lan
Xiang Lan
NC state University
AI4SE
D
Daniel J. Tan
Institute of Data Science, National University of Singapore, Singapore
H
H. Abdullah
Department of Anaesthesiology, Singapore General Hospital, Singapore
M
Mengling Feng
Department of Anaesthesiology, Singapore General Hospital, Singapore