AuscultaBase: A Foundational Step Towards AI-Powered Body Sound Diagnostics

📅 2024-11-12
🏛️ arXiv.org
📈 Citations: 0 · Influential: 0
🤖 AI Summary
Current auscultation-based diagnosis suffers from substantial inter-observer variability and from the poor generalizability of AI models in resource-constrained settings, hindering early disease detection. To address these challenges, we introduce the first integrated foundational framework for cardiorespiratory and bowel sound analysis, comprising (1) AuscultaBase-Corpus, a large-scale, multi-source acoustic corpus totaling 322.4 hours; (2) AuscultaBase-Model, a contrastive-learning-based universal body-sound foundation model; and (3) AuscultaBase-Bench, a unified evaluation benchmark covering 16 diagnostic subtasks. Leveraging multi-source data fusion, self-supervised contrastive pretraining, and cross-domain transfer evaluation, the framework outperforms existing open-source acoustic pretrained models on 12 of 16 tasks. It markedly improves cardiac, respiratory, and bowel sound classification, anomaly detection, and lesion localization, advancing objective, standardized phonocardiographic and phonopneumographic analysis.

📝 Abstract
Auscultation of internal body sounds is essential for diagnosing a range of health conditions, yet its effectiveness is often limited by clinicians' expertise and the acoustic constraints of human hearing, restricting its use across various clinical scenarios. To address these challenges, we introduce AuscultaBase, a foundational framework aimed at advancing body sound diagnostics through innovative data integration and contrastive learning techniques. Our contributions include the following: First, we compile AuscultaBase-Corpus, a large-scale, multi-source body sound database encompassing 11 datasets with 40,317 audio recordings and totaling 322.4 hours of heart, lung, and bowel sounds. Second, we develop AuscultaBase-Model, a foundational diagnostic model for body sounds, utilizing contrastive learning on the compiled corpus. Third, we establish AuscultaBase-Bench, a comprehensive benchmark containing 16 sub-tasks, assessing the performance of various open-source acoustic pre-trained models. Evaluation results indicate that our model outperforms all other open-source models in 12 out of 16 tasks, demonstrating the efficacy of our approach in advancing diagnostic capabilities for body sound analysis.
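
To make the pretraining recipe concrete, the following is a minimal sketch of SimCLR-style contrastive learning on body-sound spectrograms. The encoder architecture, augmentations, batch size, and temperature here are illustrative assumptions; the abstract does not specify AuscultaBase-Model's exact configuration.

```python
# Minimal InfoNCE contrastive-pretraining sketch for body-sound clips.
# All module shapes and hyperparameters are illustrative assumptions,
# not the authors' actual AuscultaBase-Model configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioEncoder(nn.Module):
    """Toy CNN encoder mapping log-mel spectrograms to unit-norm embeddings."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.conv(x).flatten(1)            # x: (batch, 1, n_mels, time)
        return F.normalize(self.proj(h), dim=-1)

def info_nce_loss(z1, z2, temperature: float = 0.1):
    """Symmetric InfoNCE: the two views of each clip are positives,
    every other clip in the batch is a negative."""
    logits = z1 @ z2.t() / temperature         # (batch, batch) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

encoder = AudioEncoder()
optimizer = torch.optim.AdamW(encoder.parameters(), lr=1e-4)

# One training step on a dummy batch of two augmented views per recording.
view1 = torch.randn(16, 1, 64, 256)            # e.g., time-masked view
view2 = torch.randn(16, 1, 64, 256)            # e.g., noise-augmented view
loss = info_nce_loss(encoder(view1), encoder(view2))
loss.backward()
optimizer.step()
```

In this setup, two augmented views of the same recording form a positive pair while every other clip in the batch acts as a negative, pushing the encoder toward embeddings that are invariant to recording noise yet discriminative across clips, which is the property the downstream diagnostic tasks rely on.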
Problem

Research questions and friction points this paper is trying to address.

Improves accuracy in auscultation-based disease diagnostics
Reduces variability in body sound interpretation by clinicians
Overcomes limitations of non-representative AI training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised and contrastive learning techniques
Large-scale multi-source data integration
Robust feature representations for sound analysis (see the linear-probe sketch after this list)
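
As referenced above, one common way to exploit robust pretrained representations on a benchmark subtask is a linear probe: the encoder stays frozen and only a lightweight classifier is trained. The sketch below is hypothetical; the task, class count, and data are placeholders, and the stand-in encoder represents a pretrained model such as the one sketched under the abstract, not AuscultaBase-Bench's actual protocol.

```python
# Hypothetical linear-probe evaluation on one subtask: freeze a pretrained
# encoder and train only a small classifier on its embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

embed_dim, num_classes = 128, 4        # e.g., a 4-class heart-sound subtask

# Stand-in for a pretrained spectrogram encoder (weights assumed loaded).
encoder = nn.Sequential(nn.Flatten(), nn.Linear(1 * 64 * 256, embed_dim))
encoder.eval()
for p in encoder.parameters():
    p.requires_grad_(False)            # representations stay frozen

probe = nn.Linear(embed_dim, num_classes)
optimizer = torch.optim.AdamW(probe.parameters(), lr=1e-3)

specs = torch.randn(16, 1, 64, 256)    # dummy labeled log-mel spectrograms
labels = torch.randint(0, num_classes, (16,))

with torch.no_grad():
    feats = encoder(specs)             # frozen embeddings: (16, embed_dim)
loss = F.cross_entropy(probe(feats), labels)
loss.backward()
optimizer.step()
```

Because only the probe receives gradients, this kind of evaluation isolates how much task-relevant structure the frozen representation already carries, which keeps per-task comparisons between pretrained models on equal footing.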
👥 Authors
Pingjie Wang
Shanghai Jiao Tong University
Model Compression · Inference Acceleration
Zihan Zhao
Shanghai Jiao Tong University
NLP
Liudan Zhao
Xinhua Hospital Affiliated To Shanghai Jiao Tong University School of Medicine
Miao He
Tongji University
Xin Sun
Xinhua Hospital Affiliated To Shanghai Jiao Tong University School of Medicine
Ya Zhang
Shanghai Jiao Tong University
Machine Learning · Computer Vision · Medical Imaging
Kun Sun
Xinhua Hospital Affiliated To Shanghai Jiao Tong University School of Medicine
Yanfeng Wang
Shanghai Jiao Tong University
Yu Wang
Shanghai Jiao Tong University, Shanghai Artificial Intelligence Laboratory