🤖 AI Summary
Existing DDI datasets predominantly rely on textual data, neglecting multimodal biomedical information that reflects pharmacodynamic mechanisms. To address this, we introduce MUDI, the first large-scale pharmacodynamics-oriented multimodal DDI dataset, comprising over 310,000 drug pairs and integrating four modalities: pharmacological text, chemical formulas, molecular graphs, and microscopy/schematic images. We annotate three interaction types (synergistic, antagonistic, and novel effects) and enforce a strict zero-shot drug-pair split to rigorously evaluate real-world generalization. We propose a multimodal fusion framework combining graph neural networks, text encoders, image CNNs, and a cross-modal alignment module. Experiments demonstrate that intermediate-layer fusion significantly outperforms late fusion, achieving 72.4% accuracy on unseen drug pairs. All data, annotations, code, and baseline models are publicly released, establishing the first unified multimodal DDI benchmark and enabling a new paradigm for AI-driven safe medication use.
📝 Abstract
Understanding the interaction between different drugs (drug-drug interaction or DDI) is critical for ensuring patient safety and optimizing therapeutic outcomes. Existing DDI datasets primarily focus on textual information, overlooking multimodal data that reflect complex drug mechanisms. In this paper, we (1) introduce MUDI, a large-scale Multimodal biomedical dataset for Understanding pharmacodynamic Drug-drug Interactions, and (2) benchmark learning methods to study it. In brief, MUDI provides a comprehensive multimodal representation of drugs by combining pharmacological text, chemical formulas, molecular structure graphs, and images across 310,532 annotated drug pairs labeled as Synergism, Antagonism, or New Effect. Crucially, to effectively evaluate machine-learning-based generalization, the MUDI test set consists only of drug pairs unseen during training. We evaluate benchmark models using both late fusion voting and intermediate fusion strategies. All data, annotations, evaluation scripts, and baselines are released under an open research license.
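The two fusion strategies compared above can be sketched in a few lines. This is a minimal illustrative toy with random weights, not the paper's actual architecture: the embedding dimensions, classifier heads, and variable names are all assumptions made for clarity. Late fusion lets each modality's own classifier predict and then combines predictions by voting; intermediate fusion concatenates the per-modality embeddings mid-network and classifies the joint representation with a single head.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality embeddings for one drug pair
# (text encoder, molecular-graph GNN, image CNN outputs).
text_emb = rng.normal(size=8)
graph_emb = rng.normal(size=8)
image_emb = rng.normal(size=8)
embeddings = [text_emb, graph_emb, image_emb]

n_classes = 3  # Synergism, Antagonism, New Effect

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# --- Late fusion (voting): one classifier head per modality,
# then a majority vote over the per-modality predictions.
heads = [rng.normal(size=(n_classes, 8)) for _ in embeddings]
votes = [int(np.argmax(softmax(W @ e))) for W, e in zip(heads, embeddings)]
late_pred = max(set(votes), key=votes.count)

# --- Intermediate fusion: concatenate embeddings mid-network and
# classify the joint representation with a single shared head.
joint = np.concatenate(embeddings)            # shape (24,)
W_joint = rng.normal(size=(n_classes, joint.size))
inter_pred = int(np.argmax(softmax(W_joint @ joint)))

print("late fusion class:", late_pred)
print("intermediate fusion class:", inter_pred)
```

In a real model the random weight matrices would be learned encoders and classifier layers; the key structural difference shown here is simply *where* the modalities are combined, before or after classification.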