Dual-Modality Representation Learning for Molecular Property Prediction

📅 2025-01-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the limited representational capacity of single-modality models in molecular property prediction—particularly for drug-related properties—this paper proposes MolCross, a bimodal fusion framework that jointly leverages complementary molecular graph structures and SMILES sequences. Methodologically, we introduce a novel Dual-Modality Cross-Attention (DMCA) mechanism that enables dynamic, learnable cross-modal interactions between a Graph Neural Network (GNN) and a Transformer, avoiding naïve concatenation or static fusion strategies. Evaluated on eight standard molecular property prediction benchmarks—including both classification and regression tasks—MolCross consistently outperforms unimodal baselines and existing multimodal approaches, achieving state-of-the-art performance. These results empirically validate that fine-grained semantic alignment across modalities is critical for enhancing predictive accuracy in molecular property modeling.

📝 Abstract
Molecular property prediction has attracted substantial attention recently. Accurate prediction of drug properties relies heavily on effective molecular representations. The structures of chemical compounds are commonly represented as graphs or SMILES sequences. Recent advances in learning drug properties commonly employ Graph Neural Networks (GNNs) based on the graph representation. For the SMILES representation, Transformer-based architectures have been adopted by treating each SMILES string as a sequence of tokens. Because each representation has its own advantages and disadvantages, combining both representations in learning drug properties is a promising direction. We propose a method named Dual-Modality Cross-Attention (DMCA) that can effectively combine the strengths of two representations by employing the cross-attention mechanism. DMCA was evaluated across eight datasets including both classification and regression tasks. Results show that our method achieves the best overall performance, highlighting its effectiveness in leveraging the complementary information from both graph and SMILES modalities.
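No implementation is given on this page, so the following is a minimal NumPy sketch of the bidirectional cross-attention fusion the abstract describes: embeddings from one modality act as queries over embeddings from the other, and the two attended outputs are pooled and concatenated into a single molecular representation. The dimensions, random projection matrices, and pooling choice are all illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k, seed=0):
    """One direction of cross-modal attention: embeddings from one modality
    (queries) attend over embeddings from the other (keys_values).
    Random projections stand in for learned weight matrices."""
    rng = np.random.default_rng(seed)
    Wq = rng.standard_normal((queries.shape[-1], d_k)) / np.sqrt(queries.shape[-1])
    Wk = rng.standard_normal((keys_values.shape[-1], d_k)) / np.sqrt(keys_values.shape[-1])
    Wv = rng.standard_normal((keys_values.shape[-1], d_k)) / np.sqrt(keys_values.shape[-1])
    Q, K, V = queries @ Wq, keys_values @ Wk, keys_values @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (n_q, n_kv) attention weights
    return attn @ V                         # queries enriched with the other modality

# Toy inputs: 5 atom embeddings from a GNN, 12 token embeddings from a Transformer.
graph_emb = np.random.default_rng(1).standard_normal((5, 32))
smiles_emb = np.random.default_rng(2).standard_normal((12, 32))

# Bidirectional fusion: each modality attends over the other, then pool and concat.
g2s = cross_attention(graph_emb, smiles_emb, d_k=16)  # graph queries SMILES
s2g = cross_attention(smiles_emb, graph_emb, d_k=16)  # SMILES queries graph
fused = np.concatenate([g2s.mean(axis=0), s2g.mean(axis=0)])  # (32,) molecule vector
```

The dynamic attention weights are what distinguish this from naïve concatenation of the two unimodal embeddings: each atom (or token) selects which parts of the other modality are relevant, rather than fusing fixed pooled vectors.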
Problem

Research questions and friction points this paper is trying to address.

Molecular Property Prediction
Graph Representation
SMILES Strings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Modality Cross-Attention (DMCA)
Graph Neural Networks and Transformer Integration
Molecular Property Prediction
Anyin Zhao
Case Western Reserve University, Cleveland OH 44106, USA
Zuquan Chen
Case Western Reserve University, Cleveland OH 44106, USA
Zhengyu Fang
Case Western Reserve University
Machine Learning · Deep Learning · Gen AI · Time-Series · AI for Science
Xiaoge Zhang
The Hong Kong Polytechnic University
Artificial Intelligence · Risk and Reliability · Data Science · Uncertainty Quantification
Jing Li
Case Western Reserve University, Cleveland OH 44106, USA