🤖 AI Summary
To address the low accuracy of medical question answering (QA) in low-resource languages, this paper proposes a word-level cross-lingual knowledge alignment method that integrates English medical knowledge graphs (KGs) into large language models (LLMs) at low cost, enabling precise cross-lingual reasoning. Our key contributions are threefold: (1) a novel translation-driven word-level KG alignment mechanism; (2) a lightweight retrieval framework integrating KG embeddings, retrieval-augmented generation (RAG), and multi-perspective semantic ranking; and (3) a cache-optimized strategy enabling millisecond-scale response times. Evaluated on medical QA benchmarks across Chinese, Japanese, Korean, and Swahili, our approach achieves up to a 33.89% absolute accuracy improvement over baselines with an average retrieval latency of only 0.0009 seconds, significantly outperforming existing cross-lingual medical QA methods.
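The core alignment idea can be illustrated with a minimal sketch: each medical term in a non-English question is translated at the word level and matched against entities in an English-centric medical KG, whose facts are then handed to the LLM as context. The translator, the toy KG, and the function names below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed interfaces, not the paper's code) of translation-driven
# word-level alignment: translate query words to English, then collect matching
# facts from an English-centric medical KG.

from typing import Dict, List, Set


def translate_word(word: str, src_lang: str) -> str:
    """Hypothetical word-level translator (e.g., a bilingual dictionary or MT call)."""
    bilingual_dict = {("糖尿病", "zh"): "diabetes", ("高血压", "zh"): "hypertension"}
    return bilingual_dict.get((word, src_lang), word)


# Toy English-centric medical KG: entity -> related triples rendered as text.
MEDICAL_KG: Dict[str, List[str]] = {
    "diabetes": ["diabetes -- treated_by --> metformin",
                 "diabetes -- risk_factor --> obesity"],
    "hypertension": ["hypertension -- treated_by --> ACE inhibitors"],
}


def align_query_to_kg(query_words: List[str], src_lang: str) -> Set[str]:
    """Translate each query word to English and gather the matching KG facts."""
    facts: Set[str] = set()
    for word in query_words:
        english_term = translate_word(word, src_lang).lower()
        facts.update(MEDICAL_KG.get(english_term, []))
    return facts


print(align_query_to_kg(["糖尿病", "治疗"], "zh"))
# {'diabetes -- treated_by --> metformin', 'diabetes -- risk_factor --> obesity'}
```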
📝 Abstract
Large Language Models (LLMs) have shown remarkable progress in medical question answering (QA), yet their effectiveness remains predominantly limited to English due to imbalanced multilingual training data and scarce medical resources for low-resource languages. To address this critical language gap in medical QA, we propose Multilingual Knowledge Graph-based Retrieval Ranking (MKG-Rank), a knowledge graph-enhanced framework that enables English-centric LLMs to perform multilingual medical QA. Through a word-level translation mechanism, our framework efficiently integrates comprehensive English-centric medical knowledge graphs into LLM reasoning at low cost, mitigating cross-lingual semantic distortion and achieving precise medical QA across language barriers. To enhance efficiency, we introduce caching and multi-angle ranking strategies that optimize the retrieval process, significantly reducing response times and prioritizing relevant medical knowledge. Extensive evaluations on multilingual medical QA benchmarks across Chinese, Japanese, Korean, and Swahili demonstrate that MKG-Rank consistently outperforms zero-shot LLMs, achieving accuracy gains of up to 33.89% while maintaining an average retrieval time of only 0.0009 seconds.
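As a rough illustration of the efficiency side, the sketch below pairs a cache over KG lookups with a simple multi-angle score (embedding similarity plus keyword overlap) to rank retrieved facts. The embedding function, the scoring weights, and the cache policy are assumptions made for the example; the paper's actual ranking signals and caching scheme may differ.

```python
# Minimal sketch, under assumed interfaces, of cached retrieval plus
# multi-angle ranking. `embed` is a toy stand-in for a real encoder.

from functools import lru_cache
from typing import List, Tuple


def embed(text: str) -> List[float]:
    """Hypothetical embedding function; a real encoder would go here."""
    return [float(sum(ord(c) for c in text) % 97)]  # toy stand-in


def cosine(a: List[float], b: List[float]) -> float:
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den if den else 0.0


def keyword_overlap(query: str, fact: str) -> float:
    q, f = set(query.lower().split()), set(fact.lower().split())
    return len(q & f) / max(len(q), 1)


@lru_cache(maxsize=4096)  # repeated entity lookups hit the cache, not the KG
def retrieve_facts(entity: str) -> Tuple[str, ...]:
    # Placeholder for the (slow) KG retrieval step.
    kg = {"diabetes": ("diabetes is treated by metformin",
                       "obesity is a risk factor for diabetes")}
    return kg.get(entity, ())


def rank_facts(query: str, entities: List[str], top_k: int = 3) -> List[str]:
    """Score each candidate fact from two angles and keep the top-k."""
    q_vec = embed(query)
    scored = []
    for entity in entities:
        for fact in retrieve_facts(entity):
            score = 0.5 * cosine(q_vec, embed(fact)) + 0.5 * keyword_overlap(query, fact)
            scored.append((score, fact))
    return [fact for _, fact in sorted(scored, reverse=True)[:top_k]]


print(rank_facts("how is diabetes treated", ["diabetes"]))
```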