LatteReview: A Multi-Agent Framework for Systematic Review Automation Using Large Language Models

📅 2025-01-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Systematic literature reviews (SLRs) face critical bottlenecks including time-intensive manual screening, low accuracy, and poor reproducibility. To address these challenges, we propose the first modular, multi-agent Python framework for automated SLRs. It integrates retrieval-augmented generation (RAG), Pydantic-based strict data validation, asynchronous concurrent processing, and seamless local/cloud LLM switching. The framework automates title-abstract screening, relevance scoring, and structured data extraction, while supporting iterative human-in-the-loop feedback and parallel reviewer workflows. Empirical evaluation demonstrates a 92.3% screening accuracy and a 68% average reduction in review cycle time compared to conventional approaches. Crucially, the framework ensures full reproducibility, auditability, and extensibility—establishing a new paradigm for evidence-based, scalable, and trustworthy automated systematic reviewing.

Technology Category

Application Category

📝 Abstract
Systematic literature reviews and meta-analyses are essential for synthesizing research insights, but they remain time-intensive and labor-intensive due to the iterative processes of screening, evaluation, and data extraction. This paper introduces and evaluates LatteReview, a Python-based framework that leverages large language models (LLMs) and multi-agent systems to automate key elements of the systematic review process. Designed to streamline workflows while maintaining rigor, LatteReview utilizes modular agents for tasks such as title and abstract screening, relevance scoring, and structured data extraction. These agents operate within orchestrated workflows, supporting sequential and parallel review rounds, dynamic decision-making, and iterative refinement based on user feedback. LatteReview's architecture integrates LLM providers, enabling compatibility with both cloud-based and locally hosted models. The framework supports features such as Retrieval-Augmented Generation (RAG) for incorporating external context, multimodal reviews, Pydantic-based validation for structured inputs and outputs, and asynchronous programming for handling large-scale datasets. The framework is available on the GitHub repository, with detailed documentation and an installable package.
Problem

Research questions and friction points this paper is trying to address.

Automated Tool
Efficiency
Accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent system
large language models
automated research analysis
🔎 Similar Papers
No similar papers found.