Prostate Cancer Classification Using Multimodal Feature Fusion and Explainable AI

📅 2025-07-28
🤖 AI Summary
Prostate cancer diagnosis suffers from insufficient multimodal data fusion and limited model interpretability, particularly for intermediate-risk cases. To address both, this paper proposes a BERT–Random Forest feature-level fusion framework: a lightweight BERT variant encodes clinical text, a Random Forest models laboratory test values, and the features from both modalities are concatenated so that each can complement the other. SHAP provides both global and local interpretability analysis. Evaluated on the PLCO-NIH dataset, the method achieves 98% accuracy, 99% AUC, and an 89% F1-score, and raises recall for Stage II and III cancers to 90%, significantly outperforming unimodal baselines. Key contributions include: (i) the first integration of a lightweight BERT architecture with tree-based models for multimodal prostate cancer diagnosis; (ii) empirical validation of strong complementarity between clinical text and numerical laboratory data; and (iii) high predictive accuracy, low computational overhead, and clinically meaningful interpretability achieved together.
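As a rough illustration of what feature-level fusion by concatenation means here, the sketch below joins a per-patient text embedding with that patient's laboratory values into a single vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `fuse_features` and `fuse_batch`, the 4-dimensional toy embeddings, and the two lab columns are all hypothetical, and no BERT model is actually run.

```python
from typing import List, Sequence


def fuse_features(text_embedding: Sequence[float],
                  lab_values: Sequence[float]) -> List[float]:
    """Feature-level fusion: concatenate both modalities into one flat vector."""
    return list(text_embedding) + list(lab_values)


def fuse_batch(text_embeddings: Sequence[Sequence[float]],
               lab_rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Fuse row by row; both inputs must describe the same patients in the same order."""
    if len(text_embeddings) != len(lab_rows):
        raise ValueError("both modalities must cover the same number of patients")
    return [fuse_features(t, l) for t, l in zip(text_embeddings, lab_rows)]


# Toy stand-ins (assumption): a real BERT variant would emit far more dimensions.
text_embeddings = [[0.12, -0.40, 0.88, 0.05],
                   [0.31, 0.22, -0.17, 0.64]]
# Hypothetical lab columns, chosen only for illustration.
lab_rows = [[6.8, 41.0],
            [2.1, 35.5]]

fused = fuse_batch(text_embeddings, lab_rows)
```

The fused rows could then be passed to a tree-based classifier such as scikit-learn's `RandomForestClassifier`; whether the paper scales or selects features before concatenation is not stated in this summary.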

📝 Abstract
Prostate cancer, the second most prevalent male malignancy, requires advanced diagnostic tools. We propose an explainable AI system that combines BERT (for textual clinical notes) and Random Forest (for numerical lab data) through a novel multimodal fusion strategy, achieving superior classification performance on the PLCO-NIH dataset (98% accuracy, 99% AUC). While multimodal fusion is established, our work demonstrates that a simple yet interpretable BERT+RF pipeline delivers clinically significant improvements, particularly for intermediate cancer stages (Class 2/3 recall: 0.900 combined vs. 0.824 numerical-only and 0.725 text-only). SHAP analysis provides transparent feature importance rankings, while ablation studies demonstrate the complementary value of textual features. This accessible approach offers hospitals a balance of high performance (F1 = 89%), computational efficiency, and clinical interpretability, addressing critical needs in prostate cancer diagnostics.
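The Class 2/3 recall comparison above is a per-class metric, which the following sketch computes from true and predicted labels. It is illustrative only: the labels are made-up toy data, the helper names are hypothetical, and pooling the two classes into one recall figure is an assumption, since the abstract does not say how the combined value was obtained.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def per_class_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """Recall per class: correctly predicted samples / true samples of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in sorted(support)}


def pooled_recall(y_true: Sequence[int], y_pred: Sequence[int],
                  classes: Iterable[int]) -> float:
    """Recall over all samples whose true label falls in `classes` (assumed pooling)."""
    wanted = set(classes)
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t in wanted]
    if not pairs:
        raise ValueError("no samples with the requested true classes")
    return sum(t == p for t, p in pairs) / len(pairs)


# Toy labels (not from the paper): classes 1-4 standing in for cancer stages.
y_true = [1, 1, 2, 2, 2, 3, 3, 3, 3, 4]
y_pred = [1, 1, 2, 2, 3, 3, 3, 3, 2, 4]

recalls = per_class_recall(y_true, y_pred)
stage_2_3 = pooled_recall(y_true, y_pred, {2, 3})
```

Running the same computation on predictions from the text-only, numerical-only, and fused models would give a comparison of the kind quoted in the abstract.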
Problem

Research questions and friction points this paper is trying to address.

Improving prostate cancer classification using multimodal AI fusion
Enhancing interpretability in cancer diagnostics with explainable AI
Balancing performance and efficiency in clinical cancer detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal fusion of BERT and Random Forest
Explainable AI with SHAP feature analysis
High-accuracy prostate cancer classification
Asma Sadia Khan
Chittagong University of Engineering & Technology, University Road, Chittagong, 4349, Bangladesh.
Fariba Tasnia Khan
Southern University Bangladesh, Mehedibag Road, Chittagong, 4210, Bangladesh.
Tanjim Mahmud
Rangamati Science and Technology University, Rangamati Hill Tracts, Rangamati, 4500, Bangladesh; Kitami Institute of Technology, 165 Koen-cho, Kitami, 090-8507, Japan.
Salman Karim Khan
Chittagong Medical College, KB Fazlul Kader Road, Chattogram, 4203, Bangladesh.
Rishita Chakma
Rangamati Science and Technology University, Rangamati Hill Tracts, Rangamati, 4500, Bangladesh.
Nahed Sharmen
Kitami Institute of Technology, 165 Koen-cho, Kitami, 090-8507, Japan.
Mohammad Shahadat Hossain
Professor of Computer Science and Engineering, University of Chittagong | World top 2% Scientist
Artificial Intelligence, Expert Systems, Soft Computing, Pervasive Computing, GIS
Karl Andersson
Professor, Luleå University of Technology
Cybersecurity