🤖 AI Summary
Large language models (LLMs) struggle to identify “serendipitous discoveries”—unexpected yet scientifically valuable answers—in scientific knowledge graphs, particularly in drug repurposing. Method: We formalize the “serendipity-aware” knowledge graph question answering (KGQA) task, propose SerenQA—a unified framework integrating LLMs, knowledge graph retrieval, subgraph reasoning, and serendipity quantification—and introduce a principled evaluation metric balancing relevance, novelty, and surprise. We release an expert-annotated benchmark dataset and a three-stage evaluation protocol. Contribution/Results: Experiments reveal that while state-of-the-art LLMs excel at factual retrieval, they exhibit significant limitations in detecting truly unexpected, translationally viable drug-repurposing hypotheses. SerenQA establishes a reproducible benchmark and a novel paradigm for evaluating and advancing LLMs’ scientific insight capabilities—especially their capacity for serendipitous discovery in structured biomedical knowledge.
📝 Abstract
Large Language Models (LLMs) have greatly advanced knowledge graph question answering (KGQA), yet existing systems are typically optimized for returning highly relevant but predictable answers. A missing yet desirable capability is using LLMs to suggest surprising and novel ("serendipitous") answers. In this paper, we formally define the serendipity-aware KGQA task and propose the SerenQA framework to evaluate LLMs' ability to uncover unexpected insights in scientific KGQA tasks. SerenQA includes a rigorous serendipity metric based on relevance, novelty, and surprise, along with an expert-annotated benchmark derived from the Clinical Knowledge Graph, focused on drug repurposing. Additionally, it features a structured evaluation pipeline encompassing three subtasks: knowledge retrieval, subgraph reasoning, and serendipity exploration. Our experiments reveal that while state-of-the-art LLMs perform well on retrieval, they still struggle to identify genuinely surprising and valuable discoveries, underscoring significant room for future improvement. Our curated resources and extended version are released at: https://cwru-db-group.github.io/serenQA.