An Integrated Deep Learning Framework Leveraging NASNet and Vision Transformer with MixProcessing for Accurate and Precise Diagnosis of Lung Diseases

📅 2025-02-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses five-class differential diagnosis from chest radiographs, covering pneumonia, tuberculosis, COVID-19, lung cancer, and normal cases. Methodologically, we propose a compact dual-backbone deep learning framework, NASNet-ViT, that combines NASNet’s local texture representation with ViT’s global contextual modeling. We further introduce a joint preprocessing strategy, MixProcessing, which combines wavelet transform, adaptive histogram equalization (AHE), and morphological filtering to enhance lesion contrast and structural robustness, and we train feature fusion and classification end to end. Evaluated on public chest X-ray datasets, the model achieves 98.9% accuracy, 0.990 sensitivity, 0.987 specificity, and a 0.989 F1-score, with a model size of 25.6 MB and a reported computational time of 12.4 seconds (the figure given in the abstract), outperforming models including MixNet-LD and D-ResNet.
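The dual-backbone fusion described above can be pictured with a minimal sketch. It assumes each backbone has already produced one feature vector per image, and that fusion is plain concatenation followed by a single linear layer and softmax. The summary does not specify the actual fusion operator, feature dimensions, or classifier head, so `fuse_and_classify`, the dimensions, and the random weights below are all hypothetical.

```python
import numpy as np

# Class names taken from the abstract; their order here is arbitrary.
CLASSES = ["lung cancer", "COVID-19", "pneumonia", "TB", "normal"]


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_and_classify(cnn_feats, vit_feats, weights, bias):
    """Concatenate both feature vectors, then apply one linear layer + softmax.

    Assumption: concatenation is used for fusion; the paper may differ.
    """
    fused = np.concatenate([cnn_feats, vit_feats], axis=-1)
    return softmax(fused @ weights + bias)


# Stand-in features: random vectors instead of real NASNet / ViT outputs.
rng = np.random.default_rng(0)
batch, cnn_dim, vit_dim = 4, 1056, 768  # hypothetical feature sizes
cnn_feats = rng.normal(size=(batch, cnn_dim))
vit_feats = rng.normal(size=(batch, vit_dim))

# Untrained, randomly initialised classifier head (illustration only).
weights = rng.normal(scale=0.01, size=(cnn_dim + vit_dim, len(CLASSES)))
bias = np.zeros(len(CLASSES))

probs = fuse_and_classify(cnn_feats, vit_feats, weights, bias)
predicted = [CLASSES[i] for i in probs.argmax(axis=1)]
```

Because the weights are random, `predicted` carries no diagnostic meaning; the sketch only shows how two feature streams can feed a single five-way classifier.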

📝 Abstract

The lungs are the essential organs of respiration, responsible for the exchange of oxygen and carbon dioxide that sustains human life. However, several lung diseases, including pneumonia, tuberculosis (TB), COVID-19, and lung cancer, pose serious health challenges and demand early and precise diagnosis. This study proposes a new deep learning framework called NASNet-ViT, which combines the convolutional capability of NASNet with the global attention mechanism of the Vision Transformer (ViT). The proposed model classifies lung conditions into five classes: lung cancer, COVID-19, pneumonia, TB, and normal. To improve diagnostic accuracy, a multi-faceted preprocessing strategy called MixProcessing is used, which combines wavelet transform, adaptive histogram equalization, and morphological filtering. The NASNet-ViT model achieves state-of-the-art performance, with an accuracy of 98.9%, a sensitivity of 0.99, an F1-score of 0.989, and a specificity of 0.987, outperforming other state-of-the-art architectures such as MixNet-LD, D-ResNet, MobileNet, and ResNet50. The model's efficiency is further emphasized by its compact size of 25.6 MB and a low computational time of 12.4 seconds, making it suitable for real-time, clinically constrained environments. These results reflect the strong capability of NASNet-ViT to extract meaningful features and recognize various types of lung disease with very high accuracy. This work contributes to medical image analysis by providing a robust and scalable solution for lung disease diagnostics.
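As a rough illustration of the three MixProcessing ingredients named in the abstract, the sketch below chains a wavelet step, a histogram-equalization step, and a morphological step using NumPy only. Everything specific is an assumption: the abstract does not give the wavelet family, the order of operations, or any parameters. Here a single-level Haar transform with soft thresholding stands in for the wavelet stage, independent per-tile equalization (no clip limit, no interpolation) stands in for adaptive histogram equalization, and a 3x3 grayscale opening stands in for morphological filtering. All function names are hypothetical.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform (averaging form); x needs even height/width."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0
    d = (x[0::2, :] - x[1::2, :]) / 2.0
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    h, w = ll.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((2 * h, 2 * w))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x


def tile_equalize(img, tiles=4, bins=256):
    """Crude adaptive equalization: equalize each tile of a [0, 1] image on its own."""
    out = np.empty_like(img)
    ys = np.linspace(0, img.shape[0], tiles + 1, dtype=int)
    xs = np.linspace(0, img.shape[1], tiles + 1, dtype=int)
    for i in range(tiles):
        for j in range(tiles):
            block = img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            hist, _ = np.histogram(block, bins=bins, range=(0.0, 1.0))
            cdf = hist.cumsum() / block.size
            idx = np.clip((block * bins).astype(int), 0, bins - 1)
            out[ys[i]:ys[i + 1], xs[j]:xs[j + 1]] = cdf[idx]
    return out


def _neighbors(img, k=3):
    """Stack the k*k shifted copies of img (edge-padded) along a new first axis."""
    p = np.pad(img, k // 2, mode="edge")
    h, w = img.shape
    return np.stack([p[dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)])


def opening(img, k=3):
    """Grayscale opening: erosion (local min) followed by dilation (local max)."""
    eroded = _neighbors(img, k).min(axis=0)
    return _neighbors(eroded, k).max(axis=0)


def mix_processing(img, threshold=0.02, tiles=4):
    """Assumed order: wavelet denoise -> tile equalization -> opening."""
    ll, lh, hl, hh = haar_dwt2(img)
    soft = lambda c: np.sign(c) * np.maximum(np.abs(c) - threshold, 0.0)
    denoised = haar_idwt2(ll, soft(lh), soft(hl), soft(hh))
    equalized = tile_equalize(np.clip(denoised, 0.0, 1.0), tiles)
    return opening(equalized)


# Synthetic 64x64 image in [0, 1] standing in for a chest X-ray.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
processed = mix_processing(image)
```

A production pipeline would more likely rely on dedicated wavelet and image-processing libraries, with a clip-limited, interpolated form of adaptive equalization; the NumPy-only version is kept here so that each step stays visible.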

Problem

Research questions and friction points this paper is trying to address.

Develops NASNet-ViT for accurate lung disease diagnosis.

Classifies lung conditions into five categories effectively.

Enhances diagnostic accuracy with MixProcessing preprocessing strategy.

Innovation

Methods, ideas, or system contributions that make the work stand out.

NASNet-ViT combines NASNet and Vision Transformer

MixProcessing combines wavelet transform, adaptive histogram equalization, and morphological filtering

Achieves 98.9% accuracy in lung disease diagnosis
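For readers unfamiliar with how accuracy, sensitivity, specificity, and F1-score relate in a five-class setting, the following sketch derives them from a confusion matrix using one-vs-rest counts. The matrix is invented for illustration and does not reproduce the paper's results; the page also does not say whether the reported figures are macro-averaged, so the averaging here is an assumption, and `per_class_metrics` is a hypothetical helper.

```python
import numpy as np


def per_class_metrics(cm):
    """Return accuracy plus per-class sensitivity, specificity, and F1.

    cm[i, j] counts samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # missed samples of each true class
    fp = cm.sum(axis=0) - tp          # other classes predicted as this class
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn)      # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = tp.sum() / cm.sum()
    return accuracy, sensitivity, specificity, f1


# Invented counts for five classes (50 samples each); not the paper's data.
toy_cm = [
    [48, 1, 1, 0, 0],
    [0, 49, 1, 0, 0],
    [1, 0, 48, 1, 0],
    [0, 0, 1, 49, 0],
    [0, 0, 0, 0, 50],
]

accuracy, sensitivity, specificity, f1 = per_class_metrics(toy_cm)

# Macro averages (assumed averaging scheme) give one number per metric.
macro_sensitivity = sensitivity.mean()
macro_specificity = specificity.mean()
macro_f1 = f1.mean()
```

Reporting per-class values alongside the averages helps reveal whether a high overall accuracy hides weaker performance on any single disease class.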

🔎 Similar Papers

Developing a Dual-Stage Vision Transformer Model for Lung Disease Classification