DGSAN: Dual-Graph Spatiotemporal Attention Network for Pulmonary Nodule Malignancy Prediction

📅 2025-12-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the low efficiency of fusing multi-modal (CT, clinical, follow-up) and multi-temporal imaging data for predicting pulmonary nodule malignancy. We propose a spatiotemporal-aware cross-modal fusion framework. Methodologically, we design a global-local feature encoder, construct intra- and inter-modal dual graph structures, and introduce a hierarchical cross-modal graph fusion mechanism coupled with a spatiotemporal attention module—overcoming limitations of conventional concatenation and single-layer attention. Key contributions include: (1) the first dual-graph modeling paradigm for multi-modal, multi-temporal nodule analysis; (2) the NLST-cmst dataset—the first publicly available multi-modal, multi-temporal pulmonary nodule benchmark; and (3) state-of-the-art performance on NLST-cmst and its CSTL-derived subsets, achieving significant improvements in classification accuracy while maintaining efficient inference.

📝 Abstract
Lung cancer continues to be the leading cause of cancer-related deaths globally. Early detection and diagnosis of pulmonary nodules are essential for improving patient survival rates. Although previous research has integrated multimodal and multi-temporal information, outperforming single-modality and single-time-point approaches, existing fusion methods are limited to inefficient vector concatenation and simple mutual attention, highlighting the need for more effective multimodal information fusion. To address these challenges, we introduce the Dual-Graph Spatiotemporal Attention Network (DGSAN), which leverages temporal variations and multimodal data to enhance prediction accuracy. Our methodology involves a Global-Local Feature Encoder that better captures the local, global, and fused characteristics of pulmonary nodules. Additionally, a Dual-Graph Construction method organizes multimodal features into inter-modal and intra-modal graphs. Furthermore, a Hierarchical Cross-Modal Graph Fusion Module is introduced to refine feature integration. We also compiled a novel multimodal dataset, NLST-cmst, as a comprehensive resource for related research. Extensive experiments on both the NLST-cmst and curated CSTL-derived datasets demonstrate that DGSAN significantly outperforms state-of-the-art methods in classifying pulmonary nodules while maintaining exceptional computational efficiency.
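The abstract's Dual-Graph Construction (intra-modal and inter-modal graphs over multimodal, multi-temporal features) is not spelled out on this page. As a rough, hypothetical sketch only: the toy shapes (3 modalities, 2 time points, 8-dim embeddings) and cosine-similarity adjacencies below are my assumptions, not the paper's actual construction.

```python
import numpy as np

def cosine_adjacency(feats):
    """Dense adjacency from pairwise cosine similarity (self-loops kept)."""
    norm = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
    return norm @ norm.T

# Toy stand-ins: 3 modalities (e.g. CT, clinical, follow-up features),
# 2 time points, one 8-dim embedding per (modality, time point) pair.
rng = np.random.default_rng(0)
feats = rng.standard_normal((3, 2, 8))  # (modality, time, dim)

# Intra-modal graphs: one per modality, nodes are its time points.
intra_graphs = [cosine_adjacency(feats[m]) for m in range(feats.shape[0])]

# Inter-modal graphs: one per time point, nodes are the modalities.
inter_graphs = [cosine_adjacency(feats[:, t]) for t in range(feats.shape[1])]

print(intra_graphs[0].shape, inter_graphs[0].shape)  # (2, 2) (3, 3)
```

The point of the dual structure is that temporal evolution (intra-modal edges) and cross-modality agreement (inter-modal edges) are modeled as two separate graphs rather than flattened into one concatenated vector.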
Problem

Research questions and friction points this paper is trying to address.

Existing fusion of multimodal, multi-temporal data relies on inefficient vector concatenation and simple mutual attention
Pulmonary nodule malignancy must be predicted accurately to enable early lung cancer detection
Classification accuracy needs to improve without sacrificing computational efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Graph Spatiotemporal Attention Network (DGSAN): the first dual-graph modeling paradigm for multi-modal, multi-temporal nodule analysis
Global-Local Feature Encoder captures local, global, and fused nodule characteristics
Hierarchical Cross-Modal Graph Fusion Module with spatiotemporal attention refines feature integration
NLST-cmst dataset: the first publicly available multi-modal, multi-temporal pulmonary nodule benchmark
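The hierarchical cross-modal fusion is only named here, not specified. As a rough illustration under assumed shapes, and not the authors' actual architecture: a two-stage scheme might first run attention-weighted message passing within each modality (across time points), then across modalities (at each time point), before pooling into one fused vector. The scaled dot-product scoring and mean-pool readout are my placeholder choices.

```python
import numpy as np

def row_softmax(scores):
    """Numerically stable softmax over the last axis."""
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_pass(node_feats):
    """One attention round: scaled dot-product scores between nodes,
    row-softmaxed, then features aggregated from all neighbours."""
    d = node_feats.shape[-1]
    scores = node_feats @ node_feats.T / np.sqrt(d)
    return row_softmax(scores) @ node_feats

rng = np.random.default_rng(1)
n_mod, n_time, dim = 3, 2, 8
feats = rng.standard_normal((n_mod, n_time, dim))

# Stage 1 (intra-modal): message passing across time points of each modality.
stage1 = np.stack([attention_pass(feats[m]) for m in range(n_mod)])

# Stage 2 (inter-modal): message passing across modalities per time point.
stage2 = np.stack([attention_pass(stage1[:, t]) for t in range(n_time)],
                  axis=1)

# Readout: mean-pool all nodes into one fused vector for a classifier head.
fused = stage2.mean(axis=(0, 1))
print(fused.shape)  # (8,)
```

Running the stages hierarchically, rather than concatenating all features at once, is what distinguishes this style of fusion from the plain vector concatenation the paper criticizes.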
👥 Authors
Xiao Yu (Hangzhou Dianzi University, Hangzhou, China)
Zhaojie Fang (Hangzhou Dianzi University, Hangzhou, China; The Chinese University of Hong Kong, Shenzhen, Shenzhen, China)
Guanyu Zhou (Wuhan University of Technology) · Artificial Intelligence · Machine Learning · Deep Learning
Yin Shen (Hangzhou Dianzi University, Hangzhou, China)
Huoling Luo (Shenzhen Institute of Information Technology) · surgical navigation
Ye Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Hangzhou Institute of Advanced Technology, Hangzhou, China)
Ahmed Elazab (PhD, Biomedical Engineering) · Medical Image Analysis · Computer-aided Detection and Diagnosis · Machine & Deep Learning
Xiang Wan (Shenzhen Research Institute of Big Data) · Bioinformatics · Data Mining · Big Data Analysis
Ruiquan Ge (Hangzhou Dianzi University) · Artificial Intelligence · Bioinformatics · Health Information · Image Processing · AI for Life
Changmiao Wang (Shenzhen Research Institute of Big Data, Shenzhen, China)