🤖 AI Summary
To address the challenge of tuberculosis (TB) screening in resource-limited settings, this paper proposes DeepGB-TB, a lightweight, interpretable multimodal AI method that performs real-time, offline TB risk assessment on mobile devices using only cough audio recordings and basic demographic data. Methodologically, it introduces a cross-modal bidirectional cross-attention (CM-BCA) module that fuses audio features extracted by a 1D CNN with demographic features modeled by gradient-boosted decision trees, and a TB risk-balanced loss (TRBL) that penalizes false negatives more heavily in order to reduce missed high-risk cases. Evaluated on real-world data from 1,105 participants across seven countries, the model achieves an AUROC of 0.903 and an F1-score of 0.851, which the authors report as a new state of the art. The framework also provides clinically interpretable outputs (e.g., feature-level risk attribution), supporting transparency and deployment in primary healthcare settings.
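The summary does not give the formula for TRBL, so the following is only a minimal sketch of the general idea it describes: a binary cross-entropy in which errors on TB-positive cases cost more than errors on negatives. The function name `fn_weighted_bce` and the `fn_weight` parameter are hypothetical and chosen for illustration; this is a generic false-negative-weighted loss, not the paper's TRBL.

```python
import math


def fn_weighted_bce(probs, labels, fn_weight=3.0, eps=1e-7):
    """Mean binary cross-entropy with the positive-class term scaled by fn_weight.

    Illustrative stand-in only: the actual TRBL formulation is not given in
    the summary, and fn_weight is a hypothetical knob.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        # Positive (TB) cases contribute fn_weight times the usual penalty,
        # so a confident miss costs more than an equally confident false alarm.
        total += -(fn_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# A TB-positive case scored at 0.1 (a likely miss) versus
# a TB-negative case scored at 0.9 (a likely false alarm).
missed_case = fn_weighted_bce([0.1], [1])
false_alarm = fn_weighted_bce([0.9], [0])
print(missed_case > false_alarm)
```

With `fn_weight=1.0` the two errors are penalized equally, which recovers ordinary binary cross-entropy; larger values shift the model toward higher sensitivity at the cost of more false alarms.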
📝 Abstract
Large-scale tuberculosis (TB) screening is limited by the high cost and operational complexity of traditional diagnostics, creating a need for artificial-intelligence solutions. We propose DeepGB-TB, a non-invasive system that instantly assigns TB risk scores using only cough audio and basic demographic data. The model couples a lightweight one-dimensional convolutional neural network for audio processing with a gradient-boosted decision tree for tabular features. Its principal innovation is a Cross-Modal Bidirectional Cross-Attention module (CM-BCA) that iteratively exchanges salient cues between modalities, emulating the way clinicians integrate symptoms and risk factors. To meet the clinical priority of minimizing missed cases, we design a Tuberculosis Risk-Balanced Loss (TRBL) that places stronger penalties on false-negative predictions, thereby reducing high-risk misclassifications. DeepGB-TB is evaluated on a diverse dataset of 1,105 patients collected across seven countries, achieving an AUROC of 0.903 and an F1-score of 0.851, representing a new state of the art. Its computational efficiency enables real-time, offline inference directly on common mobile devices, making it ideal for low-resource settings. Importantly, the system produces clinically validated explanations that promote trust and adoption by frontline health workers. By coupling AI innovation with public-health requirements for speed, affordability, and reliability, DeepGB-TB offers a tool for advancing global TB control.
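As a rough illustration of what "iteratively exchanges salient cues between modalities" can mean, the sketch below runs plain scaled dot-product attention in both directions for a fixed number of rounds. It makes several assumptions not stated in the abstract: both modalities are already embedded as token lists of the same width, there are no learned projections, and each round applies a simple residual update. The names `attend`, `bidirectional_cross_attention`, and `rounds` are hypothetical, and the actual CM-BCA design may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Scaled dot-product attention where the keys also serve as values."""
    d = len(keys_values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(d)])
    return out


def bidirectional_cross_attention(audio, tabular, rounds=2):
    """Let each modality attend to the other, with residual updates per round."""
    for _ in range(rounds):
        # Both contexts are computed from the pre-update states,
        # so the exchange within a round is symmetric.
        audio_ctx = attend(audio, tabular)
        tabular_ctx = attend(tabular, audio)
        audio = [[a + c for a, c in zip(row, ctx)]
                 for row, ctx in zip(audio, audio_ctx)]
        tabular = [[t + c for t, c in zip(row, ctx)]
                   for row, ctx in zip(tabular, tabular_ctx)]
    return audio, tabular


# Toy inputs: 3 audio tokens and 2 demographic tokens, all of width 4.
audio_tokens = [[0.1, 0.2, 0.0, 0.5], [0.4, 0.1, 0.3, 0.0], [0.0, 0.0, 0.2, 0.1]]
tabular_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.5, 0.0]]
fused_audio, fused_tabular = bidirectional_cross_attention(audio_tokens, tabular_tokens)
print(len(fused_audio), len(fused_tabular))
```

Each modality keeps its own token count, and only the content of the tokens changes, so a downstream classifier could pool either stream or concatenate both.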