AI Summary
In medical question answering, large language models (LLMs) struggle to reliably capture implicit semantic relationships among biomedical concepts, limiting their domain-specific reasoning capabilities. To address this, we propose a lightweight, parameter-efficient knowledge graph embedding (KGE) fusion mechanism: leveraging frozen backbone medical LLMs (BioMistral-7B/MediTron-7B), we introduce a trainable mapping network that dynamically injects structured KGEs (e.g., TransE or RotatE) into the LLM without fine-tuning the LLM or the graph encoder, and with robustness to encoder choice. This work provides the first empirical evidence that LLMs can directly parse and benefit from structured biomedical KGEs. Evaluated on four medical multiple-choice benchmarks, our method achieves average accuracy improvements of +6.7% and +9.9% over the respective baselines, significantly outperforming existing specialized medical LLMs.
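The core mechanism described above can be illustrated with a minimal sketch: a small trainable map projects a frozen graph embedding into the LLM's hidden space, where it can be injected as a soft token. All names and dimensions here are hypothetical (a single linear layer in NumPy stands in for MEG's actual mapping network; the frozen LLM and graph encoder are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)

KGE_DIM = 256   # dimension of the graph embeddings (e.g., TransE); illustrative value
LLM_DIM = 4096  # hidden size of a 7B LLM such as BioMistral-7B

# Output of a frozen graph encoder: one pretrained embedding per biomedical concept.
concept_embedding = rng.standard_normal(KGE_DIM)

# Trainable mapping network (here a single linear layer for illustration).
# Only these parameters would be updated; the LLM and the KGEs stay frozen.
W = rng.standard_normal((KGE_DIM, LLM_DIM)) * 0.02
b = np.zeros(LLM_DIM)

def map_to_llm_space(kge: np.ndarray) -> np.ndarray:
    """Project a knowledge graph embedding into the LLM's hidden space."""
    return kge @ W + b

# The mapped vector can then be prepended as a "soft token" to the question's
# token embeddings before the frozen LLM processes the sequence.
soft_token = map_to_llm_space(concept_embedding)
print(soft_token.shape)
```

Because only `W` and `b` (and in practice a slightly deeper network) are trained, the approach stays parameter-efficient and agnostic to which graph encoder produced the embeddings.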
Abstract
Question answering is a natural language understanding task that involves reasoning over both explicit context and unstated relevant domain knowledge. Despite the high cost of training, large language models (LLMs), the backbone of most modern question-answering systems, still struggle to reliably capture the nuanced relationships between concepts that are crucial for reasoning in specialized fields like medicine. In this work, we present MEG, a parameter-efficient approach for medical knowledge-augmented LLMs. MEG uses a lightweight mapping network to incorporate knowledge graph embeddings into the LLM, enabling it to leverage external knowledge in a cost-effective way. We evaluate our method on four popular medical multiple-choice datasets and show that LLMs i) can effectively interpret knowledge graph embeddings and ii) gain significant advantages from the factual grounding these embeddings provide. MEG attains average accuracy gains of +6.7% and +9.9% over specialized models like BioMistral-7B and MediTron-7B, respectively. Finally, we show that MEG's performance remains robust to the choice of graph encoder.