🤖 AI Summary
In molecular representation learning, deep graph neural networks (GNNs) suffer from over-smoothing, which causes node features to collapse; existing projection mechanisms such as cross-attention struggle to jointly preserve the global topology captured by deep layers and the fine-grained structural details captured by shallow layers. To address this, HSA-Net is proposed: a hierarchical structure-aware framework that combines a cross-attention projector with a structure-aware Graph-Mamba projector, introduces a hierarchical adaptive projector to retain multi-scale topological information, and adds a source-aware fusion module to dynamically balance global and local features. This design resolves the global–local trade-off and improves feature discriminability. Experiments on molecular captioning and molecular property prediction benchmarks show that HSA-Net consistently outperforms state-of-the-art models, supporting its representation quality and generalization capability.
📝 Abstract
Molecular representation learning, a cornerstone for downstream tasks such as molecular captioning and molecular property prediction, relies heavily on Graph Neural Networks (GNNs). However, GNNs suffer from the over-smoothing problem, where node-level features collapse in deep GNN layers. While existing feature projection methods based on cross-attention have been introduced to mitigate this issue, they still perform poorly on deep-layer features. This motivated our exploration of Mamba as an alternative projector, given its ability to handle complex sequences. However, we observe that while Mamba excels at preserving global topological information from deep layers, it neglects fine-grained details in shallow layers; Mamba and cross-attention thus exhibit a global–local trade-off. To resolve this trade-off, we propose the Hierarchical and Structure-Aware Network (HSA-Net), a novel framework whose two modules enable hierarchical feature projection and fusion. First, a Hierarchical Adaptive Projector (HAP) module is introduced to process features from different graph layers. It learns to dynamically switch between a cross-attention projector for shallow layers and a structure-aware Graph-Mamba projector for deep layers, producing high-quality, multi-level features. Second, to adaptively merge these multi-level features, we design a Source-Aware Fusion (SAF) module, which flexibly selects fusion experts based on the characteristics of the aggregated features, ensuring a precise and effective final representation fusion. Extensive experiments demonstrate that our HSA-Net framework quantitatively and qualitatively outperforms current state-of-the-art (SOTA) methods.
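The abstract's two mechanisms can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the projectors below are toy placeholders, and the gating function, expert set, and scoring weights (`gate_w`, `expert_ws`) are all hypothetical. The sketch only shows the control flow the abstract describes: a per-layer soft switch between a "local" and a "global" projector (HAP), followed by a softmax-weighted mixture of fusion experts conditioned on the pooled features (SAF).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy stand-ins for the two projectors named in the abstract.
def cross_attention_projector(h):
    # Placeholder "local" projector: keeps per-node detail untouched.
    return h.copy()

def graph_mamba_projector(h):
    # Placeholder "global" projector: mixes in graph-level context
    # via a mean-pooled residual.
    return h + h.mean(axis=0, keepdims=True)

def hap(layer_feats, gate_w):
    """Hierarchical Adaptive Projector (sketch): for each GNN layer l,
    a depth-dependent gate blends the cross-attention projector (shallow)
    with the Graph-Mamba projector (deep)."""
    outs = []
    for l, h in enumerate(layer_feats):
        g = 1.0 / (1.0 + np.exp(-gate_w * l))  # deeper layer -> gate nearer 1
        outs.append((1 - g) * cross_attention_projector(h)
                    + g * graph_mamba_projector(h))
    return outs

def saf(projected, expert_ws):
    """Source-Aware Fusion (sketch): score fusion experts from pooled
    per-source features, then combine expert outputs by softmax weights."""
    pooled = np.stack([p.mean(axis=0) for p in projected])   # (L, d)
    logits = pooled @ expert_ws                              # (L, E), hypothetical scoring
    weights = softmax(logits.mean(axis=0))                   # (E,)
    experts = [pooled.mean(axis=0), pooled.max(axis=0)]      # two toy fusion experts
    return sum(w * e for w, e in zip(weights, experts))      # (d,)
```

In the actual model the gate and expert weights would be learned end to end; the fixed sigmoid-of-depth gate here merely mimics the shallow-to-deep handoff the abstract describes.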