Robust Classification of Oral Cancer with Limited Training Data

📅 2025-10-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Oral cancer mortality remains high in resource-limited settings due to inadequate healthcare infrastructure, scarcity of specialized personnel, and limited availability of high-quality annotated data for early diagnosis. To address these challenges, we propose a lightweight hybrid model integrating convolutional neural networks with Bayesian deep learning, employing variational inference for principled uncertainty quantification. This design significantly enhances classification reliability and generalization under small-sample conditions. The model is trained exclusively on routine smartphone-captured color oral images—requiring no specialized imaging equipment. It achieves 94% accuracy on in-distribution test data and 88% on out-of-distribution, multi-institutional real-world datasets—substantially outperforming conventional CNNs (72.94%). Moreover, it demonstrates superior confidence calibration and robust misclassification detection. Our approach establishes a new paradigm for deployable, trustworthy oral cancer screening in low-resource environments.

📝 Abstract
Oral cancer ranks among the most prevalent cancers globally, with a particularly high mortality rate in regions lacking adequate healthcare access. Early diagnosis is crucial for reducing mortality; however, challenges persist due to limited oral health programs, inadequate infrastructure, and a shortage of healthcare practitioners. Conventional deep learning models, while promising, often rely on point estimates, leading to overconfidence and reduced reliability. Critically, these models require large datasets to mitigate overfitting and ensure generalizability, an unrealistic demand in settings with limited training data. To address these issues, we propose a hybrid model that combines a convolutional neural network (CNN) with Bayesian deep learning for oral cancer classification using small training sets. This approach employs variational inference to enhance reliability through uncertainty quantification. The model was trained on photographic color images captured by smartphones and evaluated on three distinct test datasets. The proposed method achieved 94% accuracy on a test dataset with a distribution similar to that of the training data, comparable to traditional CNN performance. Notably, on real-world photographic data whose acquisition conditions and variability differ from the training dataset, the proposed model demonstrated superior generalizability, achieving 88% accuracy across diverse datasets compared to 72.94% for traditional CNNs, despite being trained on a smaller dataset. Confidence analysis revealed that the model exhibits low uncertainty (high confidence) for correctly classified samples and high uncertainty (low confidence) for misclassified samples. These results underscore the effectiveness of Bayesian inference in data-scarce environments, enhancing early oral cancer diagnosis by improving model reliability and generalizability.
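The abstract's core mechanism, variational inference with Monte Carlo sampling for predictive uncertainty, can be sketched in miniature. The following is an illustrative numpy-only sketch of a mean-field Gaussian posterior over the weights of a binary classifier, not the authors' CNN implementation; `sample_weights`, `predictive`, and all parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_weights(mu, rho, rng):
    # Reparameterization trick: w = mu + softplus(rho) * eps, eps ~ N(0, I).
    # softplus keeps the posterior standard deviation positive.
    sigma = np.log1p(np.exp(rho))
    return mu + sigma * rng.standard_normal(mu.shape)

def predictive(x, mu, rho, n_samples=100, rng=rng):
    """Monte Carlo estimate of the Bayesian predictive distribution.

    Averages sigmoid outputs over posterior weight samples, then scores
    uncertainty with binary predictive entropy (high entropy = low confidence).
    """
    probs = []
    for _ in range(n_samples):
        w = sample_weights(mu, rho, rng)
        logit = x @ w
        probs.append(1.0 / (1.0 + np.exp(-logit)))
    p = np.mean(probs, axis=0)
    eps = 1e-12
    entropy = -(p * np.log(p + eps) + (1 - p) * np.log(1 - p + eps))
    return p, entropy
```

In the paper's setting the same idea is applied to a CNN's variational layers: each forward pass draws a new weight sample, and the spread of the resulting predictions quantifies how much the model should be trusted on a given image.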
Problem

Research questions and friction points this paper is trying to address.

Classifying oral cancer using limited training data with Bayesian deep learning
Improving reliability and generalizability in data-scarce medical diagnosis settings
Addressing overconfidence in conventional deep learning models through uncertainty quantification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid CNN-Bayesian model for oral cancer classification
Uses variational inference for uncertainty quantification
Achieves high accuracy with limited training data
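The reported confidence behavior (low uncertainty on correct predictions, high uncertainty on misclassifications) suggests a natural screening workflow: defer high-uncertainty cases to a clinician. A minimal sketch of such a triage rule, assuming MC-averaged probabilities as input; the `triage` function and its threshold are illustrative, not taken from the paper:

```python
import numpy as np

def triage(probs, threshold=0.5):
    """Flag low-confidence binary predictions for expert review.

    probs: MC-averaged malignancy probabilities in [0, 1].
    Uses binary predictive entropy (in nats) as the uncertainty score;
    cases whose entropy exceeds the threshold are deferred.
    """
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1 - 1e-12)
    entropy = -(p * np.log(p) + (1 - p) * np.log1p(-p))
    labels = (p >= 0.5).astype(int)       # 1 = suspected malignant
    needs_review = entropy > threshold    # True = defer to a clinician
    return labels, needs_review
```

A confident call near 0.98 or 0.03 passes through, while an ambiguous 0.52 is flagged; this is the practical payoff of calibrated uncertainty in a low-resource screening pipeline.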
Akshay Bhagwan Sonawane
The University of Texas at Dallas, Richardson, TX 75080, USA
Lena D. Swamikannan
The University of Texas at Dallas, Richardson, TX 75080, USA
Lakshman Tamil
Professor of Electrical Engineering, University of Texas at Dallas
Research interests: AI in Medicine, ECG, Sleep apnea/Quality, COPD/Asthma, CHF