🤖 AI Summary
Medical literature frequently lacks level of evidence (LoE) annotations, which limits the effectiveness of evidence-based retrieval. To address this, we propose the first automatic seven-level LoE classification framework for the full MEDLINE corpus (over 26 million articles). Our method integrates pretrained language models, domain-specific fine-tuning on biomedical text, and LoE-knowledge-guided feature enhancement to enable fine-grained LoE identification. Crucially, we introduce the first formalization of LoE as a computable filtering dimension in information retrieval, enabling principled integration of the evidence hierarchy into search systems. Evaluated on the TREC Precision Medicine benchmark, our LoE classifier achieves significantly higher accuracy than baseline methods; incorporating LoE-aware filtering improves retrieval performance by 18.3%–24.7% in nDCG@10 and P@5, demonstrating that evidence-driven retrieval substantially enhances both result relevance and reliability.
📝 Abstract
In this paper, we present a new approach to improving the relevance and reliability of medical information retrieval that builds on the concept of Level of Evidence (LoE). The LoE framework categorizes medical publications into seven distinct levels based on the underlying empirical evidence. Despite the LoE framework's relevance to medical research and evidence-based practice, only a few medical publications explicitly state their LoE. We therefore develop a classification model that automatically assigns LoE to medical publications, and we use it to classify over 26 million documents in the MEDLINE database into LoE classes. Subsequent retrieval experiments on the TREC Precision Medicine datasets show substantial improvements in retrieval relevance when LoE is used as a search filter.
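To make the idea of using LoE as a search filter concrete, the following is a minimal sketch, not the authors' implementation. It assumes each retrieved document already carries a predicted LoE label, and that level 1 denotes the strongest evidence and level 7 the weakest (the abstract does not state the ordering). The names `Doc`, `filter_by_loe`, and `precision_at_k`, as well as the toy ranking, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Doc:
    pmid: str     # document identifier (toy values below, not real PMIDs)
    score: float  # retrieval score from any underlying ranker
    loe: int      # predicted LoE; ASSUMPTION: 1 = strongest, 7 = weakest


def filter_by_loe(ranked: List[Doc], max_level: int) -> List[Doc]:
    """Keep documents whose LoE is max_level or stronger, preserving rank order."""
    return [d for d in ranked if d.loe <= max_level]


def precision_at_k(ranked: List[Doc], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k positions filled by relevant documents."""
    hits = sum(1 for d in ranked[:k] if d.pmid in relevant)
    return hits / k


# Toy ranking, already sorted by descending score (fabricated for illustration only).
ranked = [
    Doc("a", 0.9, 6),
    Doc("b", 0.8, 1),
    Doc("c", 0.7, 2),
    Doc("d", 0.6, 7),
    Doc("e", 0.5, 3),
]
relevant = {"b", "c", "e"}

filtered = filter_by_loe(ranked, max_level=3)

print(precision_at_k(ranked, relevant, k=2))    # 0.5
print(precision_at_k(filtered, relevant, k=2))  # 1.0
```

In this toy case the filter removes the two low-evidence documents, so the relevant high-evidence documents move up the list. Whether a hard filter like this or a softer re-weighting is used in the paper is not specified in the abstract.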