Organ-Aware Attention Improves CT Triage and Classification

📅 2026-01-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of triaging and classifying high-volume CT imaging, which is hindered by the complexity of 3D anatomical structures, variability in scanning protocols, and noisy report annotations—factors that limit existing vision-language models' ability to balance performance and interpretability. To overcome these limitations, the authors propose ORACLE-CT, a framework featuring an encoder-agnostic, organ-aware head that integrates organ-mask attention with lightweight volumetric and Hounsfield Unit (HU) scalar features. This design enables accurate classification and the generation of localized evidence under a unified evaluation protocol. Evaluated on the CT-RATE chest dataset (AUROC 0.86) and the abdominal MERLIN dataset encompassing 30 pathological findings (AUROC 0.85), ORACLE-CT outperforms both supervised and zero-shot vision-language models, achieving state-of-the-art performance in cross-anatomical-region CT classification.
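The "lightweight volumetric and Hounsfield Unit (HU) scalar features" mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation; it simply assumes each organ contributes two scalars (a volume normalized across organs and a mean HU value), computed from a CT volume and binary organ masks. The function name and the normalization choice are hypothetical.

```python
import numpy as np

def organ_scalar_features(ct_hu, masks, spacing):
    """Per-organ (normalized volume, mean HU) scalars -- a hypothetical sketch.

    ct_hu:   (D, H, W) CT volume in Hounsfield Units
    masks:   (K, D, H, W) binary organ masks
    spacing: (3,) voxel size in mm, used to get physical volume
    """
    voxel_vol = float(np.prod(spacing))          # mm^3 per voxel
    feats = []
    for m in masks:
        vox = ct_hu[m > 0]                       # HU values inside this organ
        volume = vox.size * voxel_vol            # physical organ volume
        mean_hu = vox.mean() if vox.size else 0.0
        feats.append([volume, mean_hu])
    f = np.asarray(feats, dtype=float)
    f[:, 0] /= f[:, 0].sum()                     # normalize volumes across organs
    return f                                     # (K, 2)
```

Usage: called once per study, this yields a (K, 2) scalar table that a lightweight fusion layer can concatenate with pooled organ features.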

📝 Abstract
There is an urgent need for triage and classification of high-volume medical imaging modalities such as computed tomography (CT), which can improve patient care and mitigate radiologist burnout. Study-level CT triage requires calibrated predictions with localized evidence; however, off-the-shelf Vision Language Models (VLMs) struggle with 3D anatomy, protocol shifts, and noisy report supervision. This study used the two largest publicly available chest CT datasets: CT-RATE and RADCHEST-CT (held-out external test set). Our carefully tuned supervised baseline (instantiated as a simple Global Average Pooling head) establishes a new supervised state of the art, surpassing all reported linear-probe VLMs. Building on this baseline, we present ORACLE-CT, an encoder-agnostic, organ-aware head that pairs Organ-Masked Attention (mask-restricted, per-organ pooling that yields spatial evidence) with Organ-Scalar Fusion (lightweight fusion of normalized volume and mean-HU cues). In the chest setting, the ORACLE-CT masked-attention model achieves AUROC 0.86 on CT-RATE; in the abdomen setting, on MERLIN (30 findings), our supervised baseline exceeds a reproduced zero-shot VLM baseline obtained by running publicly released weights through our pipeline, and adding masked attention plus scalar fusion further improves performance to AUROC 0.85. Together, these results deliver state-of-the-art supervised classification performance across both chest and abdomen CT under a unified evaluation protocol. The source code is available at https://github.com/lavsendahal/oracle-ct.
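The two head components named in the abstract can be sketched as follows. This is a simplified illustration, not the released implementation: it assumes voxel features are flattened to a (C, N) matrix, attention scores come from a single learned vector `w_attn` (a hypothetical parametrization), and Organ-Scalar Fusion is plain concatenation of the per-organ scalars with the pooled descriptor.

```python
import numpy as np

def organ_masked_attention(feats, masks, w_attn):
    """Mask-restricted, per-organ attention pooling -- illustrative only.

    feats:  (C, N) voxel features (N = flattened D*H*W)
    masks:  (K, N) binary organ masks
    w_attn: (C,) attention scoring vector (assumed parametrization)
    """
    scores = w_attn @ feats                      # (N,) voxel attention logits
    pooled = []
    for m in masks:
        s = np.where(m > 0, scores, -np.inf)     # restrict attention to organ voxels
        a = np.exp(s - s.max())
        a = a / a.sum()                          # softmax over this organ only
        pooled.append(feats @ a)                 # (C,) organ descriptor
    return np.stack(pooled)                      # (K, C)

def organ_scalar_fusion(pooled, volumes, mean_hu):
    """Concatenate normalized-volume and mean-HU cues per organ (sketch)."""
    scalars = np.stack(
        [volumes / volumes.sum(),                # volume share across organs
         (mean_hu + 1000.0) / 2000.0],           # HU roughly rescaled to [0, 1]
        axis=1)
    return np.concatenate([pooled, scalars], axis=1)  # (K, C + 2)
```

Because attention logits outside the organ mask are set to -inf before the softmax, each organ's descriptor depends only on its own voxels, which is what yields spatially localized evidence; the fused (K, C+2) matrix would then feed a per-finding classifier.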
Problem

Research questions and friction points this paper is trying to address.

CT triage
medical image classification
3D anatomy
noisy report supervision
protocol shifts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Organ-Aware Attention
CT Triage
Organ-Masked Attention
Organ-Scalar Fusion
Vision Language Models