Hierarchical Section Matching Prediction (HSMP) BERT for Fine-Grained Extraction of Structured Data from Hebrew Free-Text Radiology Reports in Crohn's Disease

📅 2025-09-03
🤖 AI Summary
Identifying sparse multi-organ findings in Hebrew-language radiology reports for Crohn’s disease is difficult, and structured information extraction tends to perform poorly in low-resource languages. To address this, the paper proposes HSMP-BERT, a hierarchical prompt-based model that combines hierarchical multi-label classification with fine-grained matching of organ–pathology pairs. Evaluated on 24 organ–finding combinations, HSMP-BERT achieves a mean F1 of 0.83 and a mean Cohen’s κ of 0.65, outperforming both the SMP zero-shot baseline (F1 0.49) and standard fine-tuning (F1 0.30), while hierarchical inference reduces runtime by 5.1×. Performance is reported with several complementary metrics, including F1, Cohen’s κ, and AUC. Applied to the full set of 9,683 reports, the model reveals clinical associations (e.g., among ileal wall thickening, stenosis, and pre-stenotic dilatation) and age- and sex-specific trends in inflammatory findings. The work offers a scalable basis for population-level studies of Crohn’s disease in low-resource language settings.

📝 Abstract
Extracting structured clinical information from radiology reports is challenging, especially in low-resource languages. The challenge is pronounced in Crohn's disease, where multi-organ findings are sparsely represented. We developed Hierarchical Structured Matching Prediction BERT (HSMP-BERT), a prompt-based model for extraction from Hebrew radiology text. In an administrative database study, we analyzed 9,683 reports from Crohn's patients imaged 2010-2023 across Israeli providers. A subset of 512 reports was radiologist-annotated for findings across six gastrointestinal organs and 15 pathologies, yielding 90 structured labels per subject. The annotated reports were divided with a multilabel-stratified split (66% train+validation; 33% test) that preserved label prevalence. Performance was evaluated with accuracy, F1, Cohen's $\kappa$, AUC, PPV, NPV, and recall. On 24 organ-finding combinations with $>$15 positives, HSMP-BERT achieved mean F1 0.83$\pm$0.08 and $\kappa$ 0.65$\pm$0.17, outperforming the SMP zero-shot baseline (F1 0.49$\pm$0.07, $\kappa$ 0.06$\pm$0.07) and standard fine-tuning (F1 0.30$\pm$0.27, $\kappa$ 0.27$\pm$0.34; paired t-test $p < 10^{-7}$). Hierarchical inference cuts runtime 5.1$\times$ vs. traditional inference. Applied to all reports, it revealed associations among ileal wall thickening, stenosis, and pre-stenotic dilatation, plus age- and sex-specific trends in inflammatory findings. HSMP-BERT offers a scalable solution for structured extraction in radiology, enabling population-level analysis of Crohn's disease and demonstrating AI's potential in low-resource settings.
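The per-label metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary ground-truth and predicted labels for a single organ-finding combination and uses the standard positive-class definitions of PPV, NPV, recall, F1, and Cohen's kappa. The function name `binary_label_metrics` is hypothetical, and the paper's exact averaging scheme is not stated in this summary, so this should not be read as the authors' evaluation code.

```python
from typing import Dict, Sequence


def binary_label_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute standard metrics for one binary label (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts for the positive class (label == 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    def safe_div(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, n)
    ppv = safe_div(tp, tp + fp)      # positive predictive value (precision)
    npv = safe_div(tn, tn + fn)      # negative predictive value
    recall = safe_div(tp, tp + fn)   # sensitivity
    f1 = safe_div(2 * ppv * recall, ppv + recall)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = safe_div((tp + fp) * (tp + fn), n * n)
    p_no = safe_div((tn + fn) * (tn + fp), n * n)
    p_chance = p_yes + p_no
    kappa = safe_div(accuracy - p_chance, 1 - p_chance)

    return {
        "accuracy": accuracy,
        "ppv": ppv,
        "npv": npv,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }
```

Because kappa discounts agreement expected by chance, a label with few positives can show a moderate F1 alongside a kappa near zero, which is one reason to report both for sparse findings.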
Problem

Research questions and friction points this paper is trying to address.

Extracting structured data from Hebrew radiology reports
Handling sparse multi-organ findings in Crohn's disease
Overcoming challenges in low-resource language processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

HSMP-BERT model for Hebrew radiology text extraction
Prompt-based hierarchical structured matching prediction approach
Multilabel-stratified evaluation in which HSMP-BERT outperforms zero-shot and standard fine-tuning baselines on F1 and Cohen's kappa
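One way hierarchical inference could reduce the number of model queries is sketched below. This is an assumption made for illustration, not the authors' confirmed procedure: the sketch first asks an organ-level question and only queries individual pathologies for organs that pass that gate. The matcher is a toy keyword check standing in for a prompt-based model, and all names (`CountingMatcher`, `flat_inference`, `hierarchical_inference`, the organ and pathology lists) are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Labels = Dict[Tuple[str, str], bool]
Matcher = Callable[..., bool]


class CountingMatcher:
    """Toy stand-in for a prompt-based matcher; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, report: str, organ: str, pathology: Optional[str] = None) -> bool:
        self.calls += 1
        text = report.lower()
        if organ not in text:
            return False
        # With no pathology given, this is the organ-level gate query.
        return pathology is None or pathology in text


def flat_inference(report: str, organs: List[str], pathologies: List[str],
                   match: Matcher) -> Labels:
    """Query every organ-pathology pair independently."""
    return {(o, p): match(report, o, p) for o in organs for p in pathologies}


def hierarchical_inference(report: str, organs: List[str], pathologies: List[str],
                           match: Matcher) -> Labels:
    """Query pathologies only for organs that pass the organ-level gate."""
    labels: Labels = {}
    for organ in organs:
        organ_positive = match(report, organ)
        for pathology in pathologies:
            # Skip the pair-level query entirely when the organ gate is negative.
            labels[(organ, pathology)] = organ_positive and match(report, organ, pathology)
    return labels


# Hypothetical label space: 6 organs x 15 pathologies = 90 labels.
ORGANS = ["ileum", "jejunum", "colon", "rectum", "stomach", "duodenum"]
PATHOLOGIES = ["wall thickening", "stenosis"] + [f"pathology_{i}" for i in range(13)]

if __name__ == "__main__":
    report = "Ileum: wall thickening with stenosis."
    flat_matcher, hier_matcher = CountingMatcher(), CountingMatcher()
    flat = flat_inference(report, ORGANS, PATHOLOGIES, flat_matcher)
    hier = hierarchical_inference(report, ORGANS, PATHOLOGIES, hier_matcher)
    print(flat == hier, flat_matcher.calls, hier_matcher.calls)
```

In this toy setting the saving depends on how many organs are mentioned in a report, so the call counts here are illustrative only and are not meant to reproduce the 5.1x runtime reduction reported in the abstract.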
Zvi Badash
Faculty of Data and Decision Sciences, Technion – Israel Institute of Technology, Haifa, Israel
Hadas Ben-Atya
Faculty of Biomedical Engineering, Technion – Israel Institute of Technology, Haifa, Israel
Naama Gavrielov
Faculty of Biomedical Engineering, Technion – Israel Institute of Technology, Haifa, Israel
Liam Hazan
Faculty of Data and Decision Sciences, Technion – Israel Institute of Technology, Haifa, Israel
Gili Focht
Juliet Keidan Institute of Pediatric Gastroenterology, Hepatology and Nutrition, The Eisenberg R&D Authority, Shaare Zedek Medical Center, The Hebrew University School of Medicine, Jerusalem, Israel
Ruth Cytter-Kuint
Pediatric Radiology Unit, Radiology Department, The Eisenberg R&D Authority, Shaare Zedek Medical Center, The Hebrew University of Jerusalem, Jerusalem, Israel
Talar Hagopian
Pediatric Radiology Unit, Radiology Department, The Eisenberg R&D Authority, Shaare Zedek Medical Center, The Hebrew University of Jerusalem, Jerusalem, Israel
Dan Turner
Juliet Keidan Institute of Pediatric Gastroenterology, Hepatology and Nutrition, The Eisenberg R&D Authority, Shaare Zedek Medical Center, The Hebrew University School of Medicine, Jerusalem, Israel
Moti Freiman
Biomedical Engineering, Technion - Israel Institute of Technology
Medical Imaging, Medical Image Analysis, Quantitative Imaging