EZhouNet: A framework based on graph neural network and anchor interval for respiratory sound event detection

📅 2025-09-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing respiratory sound event detection methods suffer from three key limitations: (1) reliance on frame-level predictions followed by post-processing, hindering precise boundary modeling; (2) adoption of fixed-length input segments, limiting adaptability to variable-length clinical recordings; and (3) neglect of positional information within the respiratory cycle, which is critical for distinguishing pathological patterns. To address these, we propose an end-to-end framework integrating Graph Neural Networks (GNNs) with anchor-based temporal intervals. The GNN captures fine-grained inter-frame temporal dependencies; learnable anchors directly regress event onset and offset boundaries; and respiratory-phase positional features are explicitly encoded to enhance anomaly discrimination. The architecture natively supports variable-length inputs, improving clinical deployability. Evaluated on SPRSound 2024 and HF Lung V1, our method achieves significant improvements in event detection F1-score and temporal localization accuracy (mAP +8.3%), validating the effectiveness of explicit boundary regression and position-aware modeling.
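The summary describes learnable anchors that regress event onset and offset boundaries. A minimal sketch of how anchor-based interval decoding can work, using a common (center shift, log-length scale) parameterization — an assumption for illustration, not the paper's exact scheme:

```python
import numpy as np

def decode_anchor_intervals(anchor_centers, anchor_lengths, deltas):
    """Decode (onset, offset) boundaries from anchors plus regressed deltas.

    Each anchor is a candidate interval (center, length) on the time axis.
    The network is assumed to predict, per anchor, a center shift (in units
    of anchor length) and a log length scale -- a standard anchor-regression
    parameterization, used here only as an illustrative stand-in.
    """
    d_center, d_log_len = deltas[:, 0], deltas[:, 1]
    centers = anchor_centers + d_center * anchor_lengths
    lengths = anchor_lengths * np.exp(d_log_len)
    onsets = centers - lengths / 2.0
    offsets = centers + lengths / 2.0
    return np.stack([onsets, offsets], axis=1)

# Two 1-second anchors at t = 1.0 s and t = 3.0 s with small regressed shifts.
anchors_c = np.array([1.0, 3.0])
anchors_l = np.array([1.0, 1.0])
deltas = np.array([[0.1, 0.0],          # shift first anchor right by 0.1 s
                   [0.0, np.log(1.5)]]) # stretch second anchor to 1.5 s
intervals = decode_anchor_intervals(anchors_c, anchors_l, deltas)
# → [[0.6, 1.6], [2.25, 3.75]]
```

Because boundaries are regressed directly from anchors, no frame-level thresholding or post-processing pass is needed to produce event intervals.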

📝 Abstract
Auscultation is a key method for early diagnosis of respiratory and pulmonary diseases, relying on skilled healthcare professionals. However, the process is often subjective, with variability between experts. As a result, numerous deep learning-based automatic classification methods have emerged, most of which focus on respiratory sound classification. In contrast, research on respiratory sound event detection remains limited. Existing sound event detection methods typically rely on frame-level predictions followed by post-processing to generate event-level outputs, making interval boundaries challenging to learn directly. Furthermore, many approaches can only handle fixed-length audio, limiting their applicability to variable-length respiratory sounds. Additionally, the impact of respiratory sound location information on detection performance has not been extensively explored. To address these issues, we propose a graph neural network-based framework with anchor intervals, capable of handling variable-length audio and providing more precise temporal localization for abnormal respiratory sound events. Our method improves both the flexibility and applicability of respiratory sound detection. Experiments on the SPRSound 2024 and HF Lung V1 datasets demonstrate the effectiveness of the proposed approach, and incorporating respiratory position information enhances the discrimination between abnormal sounds.
Problem

Research questions and friction points this paper is trying to address.

Detecting respiratory sound events with precise temporal localization
Handling variable-length audio inputs for respiratory sounds
Incorporating respiratory position information to improve detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph neural network with anchor intervals
Handles variable-length respiratory audio
Incorporates respiratory position information to enhance abnormal-sound discrimination
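One reason a GNN handles variable-length audio natively is that the graph is built per clip, with frames as nodes and edges to temporal neighbours, so any number of frames is accepted. A minimal sketch under that assumption (the `context` window size is illustrative, not the paper's setting):

```python
import numpy as np

def frame_chain_graph(num_frames, context=2):
    """Build a COO-style edge index linking each frame to its neighbours.

    Frames are graph nodes; each frame i is connected to frames within
    `context` steps on either side. Since edges are generated from the clip's
    own frame count, clips of any length yield a valid graph -- one way a GNN
    can support variable-length recordings without fixed-size padding.
    """
    edges = [(i, j)
             for i in range(num_frames)
             for j in range(max(0, i - context), min(num_frames, i + context + 1))
             if i != j]
    return np.array(edges).T  # shape (2, num_edges)

# The same construction works for a short and a long clip alike.
short_clip = frame_chain_graph(5)
long_clip = frame_chain_graph(500)
```

In practice, such an edge index could be consumed by message-passing layers (e.g. in PyTorch Geometric, which expects exactly this `(2, num_edges)` layout), while per-frame features, including respiratory-phase position encodings, serve as node attributes.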
Yun Chu
Qiuhao Wang
Enze Zhou
Qian Liu
Gang Zheng