Towards Explainable Doctor Recommendation with Large Language Models

📅 2025-03-04
🤖 AI Summary
To address the weak domain expertise and poor interpretability of existing doctor recommendation systems, this paper proposes a zero-shot, explainable doctor ranking framework for online healthcare. The method is a pointwise ranking approach built on large language models (LLMs): it scores each doctor's relevance to a specific disease-treatment pair without task-specific training. The authors also construct DrRank, a new expertise-driven doctor ranking dataset, and analyze fairness from three perspectives: disease type, patient gender, and geographical region. Three human experts verify the interpretability of the framework's output. On DrRank, the method improves NDCG@10 by +5.45 over the strongest cross-encoder baseline while keeping latency affordable.

📝 Abstract
The advent of internet medicine provides patients with unprecedented convenience in searching for and communicating with doctors relevant to their diseases and desired treatments online. However, current doctor recommendation systems fail to fully ensure the professionalism and interpretability of their recommendations. In this work, we formulate doctor recommendation as a ranking task and develop a large language model (LLM)-based pointwise ranking framework. Our framework ranks doctors according to their relevance to specific disease-treatment pairs in a zero-shot setting. The advantage of our framework lies in its ability to generate precise and explainable doctor ranking results. Additionally, we construct DrRank, a new expertise-driven doctor ranking dataset comprising over 38 disease-treatment pairs. Experimental results on the DrRank dataset demonstrate that our framework significantly outperforms the strongest cross-encoder baseline, achieving a gain of +5.45 in NDCG@10 while maintaining affordable latency. Furthermore, we present a fairness analysis of our framework from three perspectives: disease type, patient gender, and geographical region. Finally, the interpretability of our framework is verified by three human experts, providing further evidence of its reliability for doctor recommendation.
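For readers unfamiliar with the NDCG@10 metric cited above, the following is a minimal sketch of how it is commonly computed. It assumes graded relevance labels and a linear gain; the paper's exact gain function and label scale are not stated in the abstract, so the function names and the label values here are illustrative only.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k items (linear gain; an assumption)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """NDCG@k: DCG of the predicted order divided by DCG of the ideal order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of doctors in the order a ranker returned them (made-up values).
predicted_order = [3, 2, 3, 0, 1]
score = ndcg_at_k(predicted_order, k=10)
```

A ranking that already places the most relevant doctors first yields 1.0, and any misordering lowers the score. If scores are reported on a 0 to 100 scale, as the +5.45 figure suggests, the value above would be multiplied by 100.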
Problem

Research questions and friction points this paper is trying to address.

Improves professionalism in doctor recommendation systems
Enhances interpretability of doctor ranking results
Addresses fairness across diseases, gender, and regions
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based pointwise ranking framework
Zero-shot disease-treatment relevance ranking
Explainable and precise doctor recommendations
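The pointwise ranking idea listed above can be sketched as follows: each doctor is scored independently against one disease-treatment pair, and the doctors are then sorted by score. This is only an illustration under stated assumptions, not the authors' implementation. The prompt wording, the 0 to 3 scale, and the names `build_prompt`, `toy_score`, and `rank_doctors` are all hypothetical, and `toy_score` is a keyword-matching stand-in for a real LLM call.

```python
from typing import Callable, Dict, List, Tuple


def build_prompt(disease: str, treatment: str, profile: str) -> str:
    """Build one prompt per doctor (hypothetical wording, not the paper's prompt)."""
    return (
        f"Disease: {disease}\n"
        f"Treatment: {treatment}\n"
        f"Doctor profile: {profile}\n"
        "Rate this doctor's relevance from 0 to 3 and explain briefly."
    )


def toy_score(prompt: str) -> float:
    """Stand-in for an LLM: counts whether disease and treatment appear in the profile."""
    fields = dict(line.split(": ", 1) for line in prompt.splitlines() if ": " in line)
    profile = fields["Doctor profile"].lower()
    return float(
        (fields["Disease"].lower() in profile) + (fields["Treatment"].lower() in profile)
    )


def rank_doctors(
    disease: str,
    treatment: str,
    profiles: Dict[str, str],
    score_fn: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Pointwise ranking: score every doctor independently, then sort descending."""
    scored = [
        (doctor_id, score_fn(build_prompt(disease, treatment, profile)))
        for doctor_id, profile in profiles.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up profiles for illustration only.
profiles = {
    "dr_a": "Specialises in lung cancer; performs thoracoscopic surgery.",
    "dr_b": "General cardiology clinic.",
    "dr_c": "Treats lung cancer with chemotherapy.",
}
ranking = rank_doctors("lung cancer", "thoracoscopic surgery", profiles, toy_score)
```

Because each doctor is scored in isolation, the calls can run in parallel and a per-doctor explanation can be kept alongside each score. In a real system `score_fn` would send the prompt to an LLM and parse the returned rating.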
Ziyang Zeng
Beijing University of Posts and Telecommunications
Information Retrieval, Large Language Model, Reinforcement Learning
Dongyuan Li
Beijing University of Posts and Telecommunications, Beijing, China
Yuqing Yang
Beijing University of Posts and Telecommunications, Beijing, China