🤖 AI Summary
Existing drug-target interaction (DTI) prediction methods struggle with the complexity of biological systems and offer too little interpretability for clinical use. This paper introduces DrugAgent, a coordinator-based multi-agent large language model system for DTI prediction. It integrates knowledge graph embeddings, machine learning–based prediction, and biomedical literature retrieval, and combines the Chain-of-Thought (CoT) and ReAct reasoning frameworks so that each prediction is transparent, traceable, and grounded in multiple heterogeneous data sources. Key contributions include: (1) a coordinator-driven multi-agent architecture applied to DTI prediction; and (2) a human-readable reasoning chain that combines evidence from multiple sources, which supports clinical decision-making and regulatory compliance. On a kinase inhibitor dataset, the system achieves an F1 score of 0.514, a 45% relative improvement over the non-reasoning GPT-4o mini baseline (0.355). Ablation studies indicate that the AI agent contributes the largest performance gain, followed by the KG agent and the search agent.
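The 45% figure is a relative gain, not a difference in F1 points. As a quick check, the short Python sketch below recomputes it from the two F1 scores reported in the abstract (0.514 for the system, 0.355 for the baseline). The helper name `relative_gain` is ours, introduced only for illustration.

```python
def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


# F1 scores as reported in the abstract.
gain = relative_gain(0.514, 0.355)
print(f"{gain:.1%}")  # prints 44.8%, which rounds to the reported 45%
```

The absolute difference is 0.159 F1 points; dividing by the baseline score gives the relative figure.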
📝 Abstract
Advancements in large language models (LLMs) allow them to address diverse questions using human-like interfaces. Still, limitations in their training prevent them from answering accurately in scenarios that could benefit from multiple perspectives. Multi-agent systems resolve such questions collaboratively, which improves the consistency and reliability of the results. While drug-target interaction (DTI) prediction is important for drug discovery, existing approaches face challenges due to complex biological systems and the lack of interpretability needed for clinical applications. DrugAgent is a multi-agent LLM system for DTI prediction that combines multiple specialized perspectives with transparent reasoning. Our system adapts and extends existing multi-agent frameworks by (1) applying a coordinator-based architecture to the DTI domain, (2) integrating domain-specific data sources, including ML predictions, knowledge graphs, and literature evidence, and (3) incorporating the Chain-of-Thought (CoT) and ReAct (Reason+Act) frameworks for transparent DTI reasoning. We conducted comprehensive experiments using a kinase inhibitor dataset, where our multi-agent LLM method outperformed the non-reasoning multi-agent model (GPT-4o mini) by 45% in relative F1 score (0.514 vs. 0.355). Through ablation studies, we measured the contribution of each agent: the AI agent was the most impactful, followed by the knowledge graph (KG) agent and the search agent. Most importantly, our approach provides detailed, human-interpretable reasoning for each prediction by combining evidence from multiple sources. This is a critical feature for biomedical applications, where understanding the rationale behind predictions is essential for clinical decision-making and regulatory compliance. Code is available at https://anonymous.4open.science/r/DrugAgent-B2EA.
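To make the coordinator pattern concrete, here is a minimal Python sketch of a coordinator that queries an AI agent, a KG agent, and a search agent, then records a readable trace. It is an illustration under stated assumptions, not the DrugAgent implementation: the agent functions are stubs returning fixed scores, the `Evidence` type and `coordinate` function are hypothetical names, and the weighted-mean aggregation and its weights are assumptions made here for simplicity. The actual system reasons with CoT and ReAct prompting rather than a fixed formula.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Evidence:
    source: str     # which agent produced this evidence
    score: float    # agent's interaction score in [0, 1]
    rationale: str  # human-readable justification


# An agent maps a (drug, target) pair to one piece of evidence.
Agent = Callable[[str, str], Evidence]


def ai_agent(drug: str, target: str) -> Evidence:
    # Stub standing in for an ML model's predicted interaction probability.
    return Evidence("AI", 0.80, f"stub ML prediction for {drug}-{target}")


def kg_agent(drug: str, target: str) -> Evidence:
    # Stub standing in for a knowledge-graph lookup or embedding similarity.
    return Evidence("KG", 0.60, f"stub knowledge-graph evidence for {drug}-{target}")


def search_agent(drug: str, target: str) -> Evidence:
    # Stub standing in for biomedical literature retrieval.
    return Evidence("Search", 0.40, f"stub literature evidence for {drug}-{target}")


def coordinate(
    drug: str,
    target: str,
    agents: Dict[str, Tuple[Agent, float]],
    threshold: float = 0.5,
) -> Tuple[bool, float, List[str]]:
    """Query every agent and combine scores with a weighted mean (assumed rule)."""
    trace: List[str] = []
    weighted_sum = 0.0
    total_weight = 0.0
    for _name, (agent, weight) in agents.items():
        ev = agent(drug, target)
        # Keep each agent's rationale so the final decision stays traceable.
        trace.append(f"[{ev.source}] score={ev.score:.2f}: {ev.rationale}")
        weighted_sum += weight * ev.score
        total_weight += weight
    combined = weighted_sum / total_weight
    interacts = combined >= threshold
    trace.append(f"[Coordinator] combined={combined:.2f} -> interacts={interacts}")
    return interacts, combined, trace


# Hypothetical weights chosen only for this example.
AGENTS: Dict[str, Tuple[Agent, float]] = {
    "ai": (ai_agent, 0.5),
    "kg": (kg_agent, 0.3),
    "search": (search_agent, 0.2),
}

if __name__ == "__main__":
    label, score, steps = coordinate("imatinib", "ABL1", AGENTS)
    for step in steps:
        print(step)
```

Each agent returns a rationale alongside its score, so the coordinator's output doubles as an evidence trail. An ablation in this sketch would amount to removing one entry from `AGENTS` and re-running the evaluation.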