ReCellTy: Domain-specific knowledge graph retrieval-augmented LLMs workflow for single-cell annotation

📅 2025-04-24
🏛️ bioRxiv
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Current single-cell annotation methods suffer from low automation and misalignment with human annotation reasoning, and general-purpose large language models (LLMs) perform suboptimally on the task. Method: We propose a knowledge-driven, graph-augmented LLM framework. It constructs a cell-type–marker knowledge graph, retrieves entities from that graph using differential genes (graph-based retrieval-augmented generation, RAG), and runs a multi-task workflow to optimize the annotation process. Contribution/Results: The work couples a domain-specific knowledge graph with LLMs so that inference follows the reasoning of manual annotation. Across 11 tissue types, the method improves human evaluation scores by up to 0.21 and semantic similarity by 6.1% compared with general-purpose LLMs.

📝 Abstract
To enable precise and fully automated cell type annotation with large language models (LLMs), we developed a graph-structured feature–marker database to retrieve entities linked to differential genes for cell reconstruction. We further designed a multi-task workflow to optimize the annotation process. Compared to general-purpose LLMs, our method improves human evaluation scores by up to 0.21 and semantic similarity by 6.1% across 11 tissue types, while more closely aligning with the cognitive logic of manual annotation.
Problem

Research questions and friction points this paper is trying to address.

Enabling precise, fully automated cell type annotation with LLMs
Developing a graph-structured feature–marker database for retrieving entities linked to differential genes
Optimizing the annotation workflow to match the cognitive logic of manual annotation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-structured feature–marker database for retrieval
Multi-task workflow that optimizes the annotation process
Improves human evaluation scores by up to 0.21 and semantic similarity by 6.1% over general-purpose LLMs
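
The retrieval idea in the first item, looking up graph entities linked to a cluster's differential genes, can be illustrated with a small sketch. This is not the authors' implementation: the `MARKER_GRAPH` contents, the function name `retrieve_candidates`, and the ranking by number of matching markers are all assumptions made for illustration, and the real database schema is not described on this page.

```python
from collections import defaultdict

# Hypothetical toy graph: each marker gene points to the cell-type
# entities it is linked to. The real ReCellTy database is not shown here.
MARKER_GRAPH = {
    "CD3E": ["T cell"],
    "CD8A": ["T cell", "CD8+ T cell"],
    "MS4A1": ["B cell"],
    "CD79A": ["B cell"],
    "LYZ": ["Monocyte"],
}


def retrieve_candidates(differential_genes, graph=MARKER_GRAPH, top_k=3):
    """Return up to top_k (entity, supporting_genes) pairs.

    Entities are ranked by how many differential genes link to them
    (an assumed scoring rule), with ties broken alphabetically.
    """
    hits = defaultdict(list)
    for gene in differential_genes:
        symbol = gene.upper()  # normalize gene symbols before lookup
        for entity in graph.get(symbol, ()):
            hits[entity].append(symbol)
    ranked = sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    for entity, genes in retrieve_candidates(["cd3e", "CD8A", "MS4A1"]):
        print(f"{entity}: supported by {', '.join(genes)}")
```

In a retrieval-augmented setup, the returned entities and their supporting genes would be placed into the LLM prompt as context, so the model annotates the cluster from retrieved marker evidence rather than from its general knowledge alone.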
Dezheng Han
School of Control Science and Engineering, Shandong University, Jinan, 250061, China
Yibin Jia
Department of Radiation Oncology, Qilu Hospital of Shandong University, Jinan, 250012, China
Ruxiao Chen
Johns Hopkins University, PhD student
Wenjie Han
School of Control Science and Engineering, Shandong University, Jinan, 250061, China
Shuaishuai Guo
Shandong University, Professor
Task-oriented communication, 6G communication, AI agent, AI as a service, AI for Science
Jianbo Wang
Department of Radiation Oncology, Qilu Hospital of Shandong University, Jinan, 250012, China