Dual-Modality Representation Learning for Molecular Property Prediction

📅 2025-01-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the limited representational capacity of single-modality models in molecular property prediction—particularly for drug-related properties—this paper proposes MolCross, a bimodal fusion framework that jointly leverages complementary molecular graph structures and SMILES sequences. Methodologically, we introduce a novel Dual-Modality Cross-Attention (DMCA) mechanism that enables dynamic, learnable cross-modal interactions between a Graph Neural Network (GNN) and a Transformer, avoiding naïve concatenation or static fusion strategies. Evaluated on eight standard molecular property prediction benchmarks—including both classification and regression tasks—MolCross consistently outperforms unimodal baselines and existing multimodal approaches, achieving state-of-the-art performance. These results empirically validate that fine-grained semantic alignment across modalities is critical for enhancing predictive accuracy in molecular property modeling.

📝 Abstract
Molecular property prediction has attracted substantial attention recently. Accurate prediction of drug properties relies heavily on effective molecular representations. The structures of chemical compounds are commonly represented as graphs or SMILES sequences. Recent advances in learning drug properties commonly employ Graph Neural Networks (GNNs) based on the graph representation. For the SMILES representation, Transformer-based architectures have been adopted by treating each SMILES string as a sequence of tokens. Because each representation has its own advantages and disadvantages, combining both representations in learning drug properties is a promising direction. We propose a method named Dual-Modality Cross-Attention (DMCA) that can effectively combine the strengths of two representations by employing the cross-attention mechanism. DMCA was evaluated across eight datasets including both classification and regression tasks. Results show that our method achieves the best overall performance, highlighting its effectiveness in leveraging the complementary information from both graph and SMILES modalities.
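No implementation is given on this page, so the following is a minimal NumPy sketch of the bidirectional cross-attention fusion the abstract describes: embeddings from one modality act as queries over embeddings from the other, and the two attended outputs are pooled and concatenated into a single molecular representation. The dimensions, random projection matrices, and pooling choice are all illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k, seed=0):
    """One direction of cross-modal attention: embeddings from one modality
    (queries) attend over embeddings from the other (keys_values).
    Random projections stand in for learned weight matrices."""
    rng = np.random.default_rng(seed)
    Wq = rng.standard_normal((queries.shape[-1], d_k)) / np.sqrt(queries.shape[-1])
    Wk = rng.standard_normal((keys_values.shape[-1], d_k)) / np.sqrt(keys_values.shape[-1])
    Wv = rng.standard_normal((keys_values.shape[-1], d_k)) / np.sqrt(keys_values.shape[-1])
    Q, K, V = queries @ Wq, keys_values @ Wk, keys_values @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (n_q, n_kv) attention weights
    return attn @ V                         # queries enriched with the other modality

# Toy inputs: 5 atom embeddings from a GNN, 12 token embeddings from a Transformer.
graph_emb = np.random.default_rng(1).standard_normal((5, 32))
smiles_emb = np.random.default_rng(2).standard_normal((12, 32))

# Bidirectional fusion: each modality attends over the other, then pool and concat.
g2s = cross_attention(graph_emb, smiles_emb, d_k=16)  # graph queries SMILES
s2g = cross_attention(smiles_emb, graph_emb, d_k=16)  # SMILES queries graph
fused = np.concatenate([g2s.mean(axis=0), s2g.mean(axis=0)])  # (32,) molecule vector
```

The dynamic attention weights are what distinguish this from naïve concatenation of the two unimodal embeddings: each atom (or token) selects which parts of the other modality are relevant, rather than fusing fixed pooled vectors.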
Problem

Research questions and friction points this paper is trying to address.

Molecular Property Prediction
Graph Representation
SMILES Strings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Modality Cross-Attention (DMCA)
Graph Neural Networks and Transformer Integration
Molecular Property Prediction
Anyin Zhao
Case Western Reserve University, Cleveland OH 44106, USA
Zuquan Chen
Case Western Reserve University, Cleveland OH 44106, USA
Zhengyu Fang
Case Western Reserve University
Machine Learning · Deep Learning · Gen AI · Time-Series · AI for Science
Xiaoge Zhang
The Hong Kong Polytechnic University
Artificial Intelligence · Risk and Reliability · Data Science · Uncertainty Quantification
Jing Li
Case Western Reserve University, Cleveland OH 44106, USA