🤖 AI Summary
Open-source large language models (LLMs) and AI foundation models (AIFMs) exhibit suboptimal performance and low clinical credibility in personalized prescription generation for healthcare.
Method: We survey state-of-the-art open-source healthcare LLMs and AIFMs and introduce a taxonomy that categorizes their utility across healthcare tasks; run a case study on personalized prescriptions that compares open-source and proprietary models with and without retrieval-augmented generation (RAG); and have an expert clinician subjectively assess the generated prescriptions.
Contribution/Results: Our findings suggest that open LLMs, although less refined, can achieve performance comparable to proprietary models in prescription generation when paired with grounding techniques such as RAG. A subjective assessment by an expert clinician highlights the clinical significance of LLM-empowered personalized prescriptions. We also discuss ethical considerations and the risks of misusing powerful LLMs and AIFMs, and stress the need for cautious, responsible implementation in healthcare.
📝 Abstract
In response to the success of proprietary Large Language Models (LLMs) such as OpenAI's GPT-4, there is growing interest in developing open, non-proprietary LLMs and AI foundation models (AIFMs) for transparent use in academic, scientific, and non-commercial applications. Despite their inability to match the refined functionalities of their proprietary counterparts, open models hold immense potential to revolutionize healthcare applications. In this paper, we examine the prospects of open-source LLMs and AIFMs for developing healthcare applications and make two key contributions. First, we present a comprehensive survey of the current state-of-the-art open-source healthcare LLMs and AIFMs and introduce a taxonomy of these open AIFMs, categorizing their utility across various healthcare tasks. Second, to evaluate the general-purpose applications of open LLMs in healthcare, we present a case study on personalized prescriptions. This task is particularly significant due to its critical role in delivering tailored, patient-specific medications that can greatly improve treatment outcomes. In addition, we compare the performance of open-source models with proprietary models in settings with and without Retrieval-Augmented Generation (RAG). Our findings suggest that, although less refined, open LLMs can achieve performance comparable to proprietary models when paired with grounding techniques such as RAG. Furthermore, to highlight the clinical significance of LLM-empowered personalized prescriptions, we perform a subjective assessment by an expert clinician. We also elaborate on ethical considerations and potential risks associated with the misuse of powerful LLMs and AIFMs, highlighting the need for cautious and responsible implementation in healthcare.