A Multi-modal Agentic Co-pilot for Evidence Grounded Computational Pathology

2026-06-06
AI Summary
This study addresses the limited integration of artificial intelligence in pathology with evidence-based medicine, particularly the prevailing reliance on unimodal text lacking traceable evidence. The authors introduce, for the first time, a hierarchical evidence framework into computational pathology, constructing the most comprehensive multimodal pathology evidence corpus and a hypergraph-based knowledge engine to date. They propose a multi-agent collaborative reasoning framework that enables interpretable and traceable diagnostic inference from textual queries to specific regions in whole-slide images (WSIs). By integrating multimodal large language models, evidence retrieval, and WSI understanding, the method significantly outperforms existing approaches across more than 200,000 real-world cases. User studies demonstrate that the system effectively enhances pathologists' diagnostic accuracy and decision confidence.
πŸ“ Abstract
Pathology is the cornerstone of modern medicine, where accurate decision-making relies heavily on evidence-based practices. While artificial intelligence (AI) has the potential to transform clinical workflows, the intersection of AI and evidence-based medicine remains under-explored, with early attempts restricted to text-only general medicine. In this work, we present PathPocket, a multimodal agentic AI co-pilot designed specifically for evidence-grounded pathology. We construct the most comprehensive pathology evidence corpus to date, encompassing approximately 110,472 public and authorized documents structured across a rigorous hierarchy of evidence, from clinical guidelines to expert opinion. From this meticulously graded foundation, we build a large-scale multimodal pathology hypergraph containing over 4.55 million entities and 7.10 million relations. Serving as a robust knowledge engine, this hypergraph provides traceable evidence for a collaborative multi-agent reasoning framework integrating input understanding, evidence retrieval, filtering, and diagnosis generation. This enables PathPocket to resolve a wide spectrum of clinical tasks, ranging from text-only queries to complex multimodal diagnostics involving regions of interest (ROIs) and gigapixel whole-slide images (WSIs). We rigorously evaluate the system on a multidimensional benchmark of over 200,000 real-world cases, where it significantly outperforms existing state-of-the-art methods. Crucially, extensive user studies demonstrate that PathPocket substantially improves the diagnostic accuracy and confidence of pathologists. By directly grounding pathology interpretations in verifiable literature, PathPocket offers a practical and scalable solution for the future of evidence-grounded computational pathology.
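To make the idea of an evidence-graded hypergraph more concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes each hyperedge links an arbitrary set of entities, carries an evidence grade, and records its source document so that retrieved evidence stays traceable. The class names, the `retrieve` function, and the two intermediate grades between guideline and expert opinion are all hypothetical; the abstract only names the two endpoints of the hierarchy.

```python
from dataclasses import dataclass
from enum import IntEnum


class EvidenceLevel(IntEnum):
    # Hypothetical grading; a lower value means stronger evidence.
    # Only GUIDELINE and EXPERT_OPINION are named in the abstract.
    GUIDELINE = 1
    SYSTEMATIC_REVIEW = 2
    COHORT_STUDY = 3
    EXPERT_OPINION = 4


@dataclass(frozen=True)
class Hyperedge:
    entities: frozenset   # a hyperedge may link any number of entities
    relation: str
    level: EvidenceLevel
    source_id: str        # document the relation was extracted from


def retrieve(edges, query_entities, max_level=EvidenceLevel.EXPERT_OPINION):
    """Return hyperedges touching the query, strongest evidence first."""
    query = frozenset(query_entities)
    # Filtering step: keep edges that share an entity with the query
    # and whose grade is at least as strong as max_level.
    hits = [e for e in edges if e.entities & query and e.level <= max_level]
    # Ranking step: by evidence grade, then by number of query entities covered.
    return sorted(hits, key=lambda e: (e.level, -len(e.entities & query)))
```

Calling `retrieve(edges, ["finding_A", "finding_B"])` on a list of such hyperedges returns the guideline-backed relations before the expert-opinion ones, and each result keeps its `source_id` so that a downstream diagnosis step could cite the originating document.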
Problem

Research questions and friction points this paper is trying to address.

evidence-based medicine
computational pathology
multimodal AI
whole-slide images
clinical decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal agentic co-pilot
evidence-grounded pathology
pathology hypergraph
whole-slide image (WSI) analysis
multi-agent reasoning
Zhe Xu
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China
Zhengyu Zhang
Department of Pathology, Nanfang Hospital, Southern Medical University, Guangzhou, China; Department of Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
Zhiyuan Cai
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China
Jiahao Xu
Nanyang Technological University
LLM Efficient Reasoning, NMT, Audio Translation, Sentence Embeddings
Yijie Lin
Department of Information Engineering and Computer Science, Feng Chia University
Steganography, Information Security, Image Processing, Artificial Intelligence
Ziyi Liu
Hong Kong University of Science and Technology
Spatial, Temporal, Graph, Database
Junlin Hou
HKUST | Fudan University
Computer Vision, Medical Image Analysis, Label-efficient Deep Learning, eXplainable AI
Hongyi Wang
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China
Yuxiang Nie
Hong Kong University of Science and Technology
Natural Language Processing, Multi-modal Learning, Medical Image Analysis
Ling Liang
pku.edu.cn
Yihui Wang
PhD student in CSE, HKUST
Computer Vision, Medical Image Analysis, Computational Pathology
Yingxue Xu
The Hong Kong University of Science and Technology
Multimodal Learning, Survival Analysis, Computational Pathology
Ronald Cheong Kin Chan
Department of Anatomical and Cellular Pathology, Chinese University of Hong Kong, Hong Kong, China
Li Liang
The University of Western Australia
3D Point Cloud Processing, 3D Semantic Scene Completion, 3D Semantic Scene Generation
Hao Chen
Assistant Professor, The Hong Kong University of Science and Technology
Large Model, Computational Pathology, Medical Image Analysis, Multimodal, AI for Science