Extending AI for Research to the Humanities: A Multi-Agent Framework for Evidence-Grounded Scholarship

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing AI research agents: they focus predominantly on scientific and engineering domains and struggle to support the interpretive, evidence-grounded reasoning that the humanities require. To bridge this gap, the authors propose SPIRE (Scholarly-Primitives-Inspired Research Engine), a multi-agent framework built on Scholarly Primitives theory. SPIRE casts recurring scholarly operations as cooperating agent roles: source discovery, evidence annotation, comparison, provenance checking, sampling, citation binding, and argumentative synthesis. These roles operate over a multi-scale close-reading substrate of passages, intra-context graph communities, and cross-context semantic clusters. On a peer-reviewed-paper benchmark covering classical Chinese and Greco-Roman Latin scholarship, SPIRE recovers cited primary-source evidence more reliably than Naive LLM, Text RAG, and GraphRAG baselines, and receives higher blind-judge scores on answer accuracy, depth, coverage, and evidence quality. Ablations indicate that both the scholarly-operation agents and the close-reading retrieval contribute to evidence-grounded essays.

📝 Abstract

LLM-based research agents have advanced rapidly in science and engineering, where research is organized around executable experiments, code, and quantitative signals. Humanities scholarship, however, requires a different mode of reasoning: interpretive, evidence-grounded argument over primary sources, where scholarly value depends on faithful quotation, verifiable provenance, and close reading. Existing research agents remain largely optimized for execution and retrieval, not evidence-grounded interpretive reasoning. To address this gap, we introduce SPIRE (Scholarly-Primitives-Inspired Research Engine), a multi-agent framework for evidence-grounded humanities scholarship. Drawing on Scholarly Primitives theory, SPIRE casts recurring humanities operations as cooperating agent roles (source discovery, evidence annotation, comparison, provenance checking, sampling, citation binding, and argumentative synthesis) over a multi-scale close-reading substrate of passages, intra-context graph communities, and cross-context semantic clusters. On a peer-reviewed-paper benchmark over classical Chinese and Greco-Roman Latin scholarship, SPIRE recovers cited primary-source evidence more reliably than Naive LLM, Text RAG, and GraphRAG, and receives higher blind-judge scores on answer accuracy, depth, coverage, and evidence quality. Ablations show that both the scholarly-operation agents and close-reading retrieval contribute to evidence-grounded essays. Code, data catalogues, and reproduction scripts are released at https://github.com/YatingPan/SPIRE.
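
To make "faithful quotation" and "recovers cited primary-source evidence" concrete, the following is a minimal Python sketch of two checks that such a pipeline might run. It is not taken from the SPIRE codebase: the names `Passage`, `Citation`, `check_citations`, and `evidence_recall`, the passage identifiers, and the set-overlap definition of recall are all assumptions made for illustration.

```python
from dataclasses import dataclass
import unicodedata


@dataclass(frozen=True)
class Passage:
    passage_id: str  # hypothetical identifier scheme, not SPIRE's
    text: str


@dataclass(frozen=True)
class Citation:
    passage_id: str  # passage the essay claims to quote from
    quote: str       # text the essay presents as a verbatim quotation


def normalize(text: str) -> str:
    """Apply Unicode NFC and collapse whitespace so trivial differences do not block a match."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def check_citations(citations, corpus):
    """Return (citation, ok) pairs.

    ok is True only when the quote is non-empty and occurs verbatim
    (after normalization) in the passage it is bound to.
    """
    results = []
    for c in citations:
        passage = corpus.get(c.passage_id)
        quote = normalize(c.quote)
        ok = passage is not None and bool(quote) and quote in normalize(passage.text)
        results.append((c, ok))
    return results


def evidence_recall(retrieved_ids, gold_ids):
    """Fraction of gold (paper-cited) passages that were retrieved.

    Assumed definition for illustration; the paper's metric may differ.
    """
    gold = set(gold_ids)
    if not gold:
        return 0.0
    return len(gold & set(retrieved_ids)) / len(gold)


# Tiny illustrative corpus: one Latin and one classical Chinese passage.
corpus = {
    "caes-bg-1.1": Passage("caes-bg-1.1", "Gallia est omnis divisa in partes tres"),
    "lunyu-1.1": Passage("lunyu-1.1", "學而時習之，不亦說乎"),
}

citations = [
    Citation("caes-bg-1.1", "divisa in partes tres"),      # faithful
    Citation("caes-bg-1.1", "divisa in partes quattuor"),  # misquoted
    Citation("lunyu-1.1", "學而時習之"),                     # faithful
    Citation("missing-id", "anything"),                     # unknown source
]

flags = [ok for _, ok in check_citations(citations, corpus)]
recall = evidence_recall(["caes-bg-1.1", "other"], ["caes-bg-1.1", "lunyu-1.1"])
```

Exact substring matching is deliberately strict: a paraphrase or a silently emended quotation fails the check, which is the behaviour one would want if scholarly value depends on faithful quotation and verifiable provenance. A real system would likely need fuzzier matching for variant editions and orthography.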

Problem

Research questions and friction points this paper is trying to address.

humanities scholarship

evidence-grounded reasoning

interpretive argument

primary sources

scholarly primitives

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent framework

evidence-grounded reasoning

humanities scholarship