🤖 AI Summary
The surge in COVID-19 literature has exposed limitations of conventional retrieval systems, which rely on shallow syntactic parsing tools and struggle to identify latent semantic relationships in unlabeled text, thereby constraining knowledge discovery. To address this, we propose the first LLM-driven implicit relation extraction method specifically designed for COVID-19 literature. Our approach deeply integrates large language models into the retrieval pipeline and couples them with the Covrelex-SE framework to build a semantics-enhanced retrieval system. Crucially, it operates without manual annotation, automatically uncovering deep entity-level associations—such as “drug–target–pathway”—beyond surface-level syntactic patterns. Experimental results demonstrate significant improvements in both precision and recall, markedly enhancing the relevance and conceptual depth of retrieved documents. This advancement delivers more accurate, insightful, and knowledge-rich literature support to researchers.
📝 Abstract
Since the onset of the COVID-19 pandemic, a large number of publications related to the disease have been released. Given this massive volume, an efficient retrieval system is necessary to provide researchers with useful information when an unexpected pandemic, such as COVID-19, strikes. In this work, we present a method that helps a retrieval system, the Covrelex-SE system, return higher-quality search results. We exploit the power of large language models (LLMs) to extract hidden relationships inside unlabeled publications that cannot be found by the parsing tools the system currently uses, thereby giving the system more useful information during the retrieval process.
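The abstract describes using LLMs to surface entity relations that syntactic parsers miss. A minimal offline sketch of prompt-based triple extraction might look like the following; the prompt wording, the `extract_triples` helper, and the stubbed `fake_llm` reply are all illustrative assumptions, not the paper's actual pipeline:

```python
import json

# Hypothetical prompt asking an LLM for (subject, relation, object) triples.
PROMPT_TEMPLATE = (
    "Extract all (subject, relation, object) triples from the sentence "
    "below. Reply with a JSON list of 3-element lists.\n\nSentence: {sentence}"
)

def extract_triples(sentence, call_llm):
    """Ask an LLM for relation triples and parse its JSON reply."""
    reply = call_llm(PROMPT_TEMPLATE.format(sentence=sentence))
    return [tuple(t) for t in json.loads(reply)]

# Offline stand-in for a chat-completion API, so the example runs
# without network access; a real system would call an actual model here.
def fake_llm(prompt):
    return '[["remdesivir", "inhibits", "RdRp"]]'

triples = extract_triples(
    "Remdesivir inhibits the viral RdRp polymerase.", fake_llm
)
print(triples)  # [('remdesivir', 'inhibits', 'RdRp')]
```

Triples extracted this way could then be indexed alongside the parser-derived relations, which is one plausible way such LLM output could feed a semantics-enhanced retrieval pipeline.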