🤖 AI Summary
Existing pre-trained 2D graph encoders, though powerful, neglect submolecular structural knowledge (e.g., atoms, bonds), while mainstream molecular pre-training methods rely on task-specific designs that hinder flexible integration of diverse domain knowledge. To address this, we propose MolGA—a framework that, **without modifying the pre-trained encoder**, dynamically fuses topological representations with heterogeneous molecular knowledge (e.g., functional groups, pharmacophores, geometric constraints) at a fine granularity via a **molecular structure alignment strategy** and an **instance-level conditional adaptation mechanism**. MolGA preserves the encoder's generality while enhancing interpretability through knowledge-guided representation refinement. Evaluated on 11 molecular property prediction and biomedical benchmark datasets, MolGA consistently outperforms state-of-the-art baselines. It establishes an efficient, knowledge-enhanced paradigm for downstream molecular adaptation.
📝 Abstract
Molecular graph representation learning is widely used in chemical and biomedical research. While pre-trained 2D graph encoders have demonstrated strong performance, they overlook the rich molecular domain knowledge associated with submolecular instances (atoms and bonds). Although molecular pre-training approaches incorporate such knowledge into their pre-training objectives, they typically employ designs tailored to a specific type of knowledge, lacking the flexibility to integrate the diverse knowledge present in molecules. Hence, reusing widely available and well-validated pre-trained 2D encoders, while incorporating molecular domain knowledge during downstream adaptation, offers a more practical alternative. In this work, we propose MolGA, which adapts pre-trained 2D graph encoders to downstream molecular applications by flexibly incorporating diverse molecular domain knowledge. First, we propose a molecular alignment strategy that bridges the gap between pre-trained topological representations and domain-knowledge representations. Second, we introduce a conditional adaptation mechanism that generates instance-specific tokens to enable fine-grained integration of molecular domain knowledge for downstream tasks. Finally, we conduct extensive experiments on eleven public datasets, demonstrating the effectiveness of MolGA.
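To make the instance-level conditional adaptation idea concrete, the sketch below shows one plausible reading: a molecule's frozen-encoder output conditions a softmax mixture over a shared token bank, yielding an instance-specific token that gates the fusion with a domain-knowledge embedding. All names, shapes, and the fusion rule are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def conditional_adaptation(h_topo, h_know, W_cond, token_bank):
    """Hypothetical sketch: generate an instance-specific token by
    conditioning a shared learnable token bank on the molecule's
    frozen-encoder representation, then fuse topological and
    domain-knowledge views. Not the paper's actual API."""
    # Condition vector derived from the frozen 2D-encoder output
    cond = np.tanh(h_topo @ W_cond)                  # shape (d,)
    # Instance-specific mixture weights over the token bank (softmax)
    scores = token_bank @ cond                       # shape (num_tokens,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    instance_token = weights @ token_bank            # shape (d,)
    # Token-gated fusion of the two representation views (assumed form)
    return h_topo + instance_token * h_know          # shape (d,)

d, num_tokens = 8, 4
h_topo = rng.normal(size=d)            # frozen 2D-encoder output
h_know = rng.normal(size=d)            # e.g., functional-group embedding
W_cond = rng.normal(size=(d, d))       # conditioning projection (learned)
token_bank = rng.normal(size=(num_tokens, d))  # shared learnable tokens

fused = conditional_adaptation(h_topo, h_know, W_cond, token_bank)
print(fused.shape)  # (8,)
```

Because the encoder stays frozen, only `W_cond` and `token_bank` would be trained downstream, which matches the summary's claim of adapting without modifying the pre-trained encoder.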