Multi-LLM Text Summarization

📅 2024-12-20
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
To address the limited quality and robustness of text summarization by individual large language models (LLMs), this paper proposes a multi-LLM collaborative summarization framework built on a two-stage generation–evaluation paradigm: *k* LLMs generate diverse summaries in parallel, after which the best summary is selected via either centralized evaluation (a single LLM scoring all candidates) or decentralized evaluation (*k* LLMs performing cross-evaluation). The paper systematically compares these two multi-model coordination mechanisms for summarization, combining prompt design, cross-evaluation, and candidate selection. Experiments show that the proposed approach outperforms single-LLM baselines by up to 3x on standard metrics, including ROUGE and BERTScore, improving summary accuracy, consistency, and generalization, and supporting the effectiveness and scalability of the multi-LLM collaborative paradigm.
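
To make the two-stage paradigm concrete, here is a minimal Python sketch of the generation stage and the centralized evaluation stage. The `call_llm` stub, the model names, and the prompts are illustrative assumptions, not the paper's actual prompts or API.

```python
# Minimal sketch of the generation-evaluation paradigm (centralized variant).
# call_llm is a hypothetical stand-in for any chat-completion client; the
# prompts and index-based selection format are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor

def call_llm(model: str, prompt: str) -> str:
    """Hypothetical client stub; swap in a real chat-completion API call."""
    return "0"  # canned reply so the sketch runs end to end

def generate_summaries(text: str, models: list[str]) -> list[str]:
    # Stage 1: k LLMs summarize the same text in parallel.
    prompt = f"Summarize the following text concisely:\n\n{text}"
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        return list(pool.map(lambda m: call_llm(m, prompt), models))

def centralized_select(text: str, candidates: list[str], judge: str) -> str:
    # Stage 2 (centralized): a single judge LLM picks the best candidate.
    numbered = "\n".join(f"[{i}] {s}" for i, s in enumerate(candidates))
    prompt = (
        "Given the source text and the candidate summaries, reply with "
        f"the index of the best summary only.\n\nText:\n{text}\n\n"
        f"Candidates:\n{numbered}"
    )
    digits = "".join(ch for ch in call_llm(judge, prompt) if ch.isdigit())
    idx = int(digits) if digits else 0
    return candidates[min(idx, len(candidates) - 1)]

# usage: three hypothetical models generate, one of them judges
doc = "Some long document ..."
best = centralized_select(doc, generate_summaries(doc, ["model-a", "model-b", "model-c"]), judge="model-a")
```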

📝 Abstract
In this work, we propose a multi-LLM summarization framework and investigate two multi-LLM strategies: centralized and decentralized. Our multi-LLM summarization framework has two fundamentally important steps at each round of conversation, generation and evaluation, and these steps differ depending on whether the centralized or the decentralized strategy is used. In both strategies, k different LLMs generate diverse summaries of the text. During evaluation, however, the centralized approach leverages a single LLM to evaluate the summaries and select the best one, whereas the decentralized approach uses all k LLMs. Overall, we find that our multi-LLM summarization approaches significantly outperform baselines that leverage only a single LLM, by up to 3x. These results indicate the effectiveness of multi-LLM approaches for summarization.
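
One plausible reading of the decentralized evaluation step is sketched below: each of the k LLMs votes for the best candidate and the plurality winner is returned. It takes a `call_llm` callable like the hypothetical stub in the earlier sketch; the voting prompt and tie-breaking rule are assumptions, not the paper's exact protocol.

```python
# Sketch of decentralized evaluation: k LLMs cross-evaluate the candidates
# and a plurality vote selects the winner. Prompt wording and tie-breaking
# are illustrative assumptions.
from collections import Counter
from typing import Callable

def decentralized_select(
    text: str,
    candidates: list[str],
    models: list[str],
    call_llm: Callable[[str, str], str],
) -> str:
    numbered = "\n".join(f"[{i}] {s}" for i, s in enumerate(candidates))
    prompt = (
        "Reply with the index of the best summary only.\n\n"
        f"Text:\n{text}\n\nCandidates:\n{numbered}"
    )
    votes = []
    for model in models:  # each of the k LLMs casts one vote
        digits = "".join(ch for ch in call_llm(model, prompt) if ch.isdigit())
        if digits:
            votes.append(min(int(digits), len(candidates) - 1))
    if not votes:  # fall back if no evaluator returned a parsable index
        return candidates[0]
    winner, _ = Counter(votes).most_common(1)[0]  # plurality; first-seen wins ties
    return candidates[winner]

# usage with a dummy client that always votes for candidate 1
best = decentralized_select("doc ...", ["sum A", "sum B"], ["m1", "m2", "m3"], lambda m, p: "1")
```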
Problem

Research questions and friction points this paper is trying to address.

Compares centralized vs. decentralized multi-LLM summarization strategies
Evaluates the performance of k LLMs generating diverse summaries
Demonstrates that multi-LLM approaches outperform single-LLM baselines by up to 3x (see the metric sketch after this list)
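
For illustration, here is a self-contained ROUGE-1 F1 function (unigram overlap) of the kind used to compare a single-LLM baseline summary against a multi-LLM one. The paper's experiments use the standard ROUGE and BERTScore implementations; this simplified scorer is only a stand-in to show what the metric measures.

```python
# Simplified ROUGE-1 F1 (unigram overlap). The paper's experiments use the
# official ROUGE and BERTScore implementations; this version is illustrative.
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# e.g. scoring a candidate summary against one reference -> 0.6
print(rouge1_f1("the cat sat", "the cat sat on the mat"))
```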
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-LLM framework with centralized and decentralized coordination strategies
Diverse candidate summaries generated in parallel by k different LLMs
Best-summary selection by a single evaluator LLM (centralized) or by cross-evaluation among all k LLMs (decentralized)
Jiangnan Fang
University of California, Santa Cruz
Cheng-Tse Liu
University of California, Santa Cruz
Jieun Kim
Associate professor, Hanyang University
UI/UX Design · Inclusive Design · Human-Computer Interaction
Yash Bhedaru
University of California, Santa Cruz
Ethan Liu
University of California, Santa Cruz
Nikhil Singh
University of California, Santa Cruz
Nedim Lipka
Adobe Systems Inc
Big Data Analytics · Machine Learning · Web Mining · Online Advertisement
Puneet Mathur
Adobe Research
Nesreen K. Ahmed
Senior Principal Scientist, Cisco AI Research, Intel Labs, Purdue University
Geometric Deep Learning · Graph Representation Learning · ML for Systems · ML4code
Franck Dernoncourt
NLP/ML Researcher. MIT PhD.
Machine Learning · Neural Networks · Natural Language Processing
Ryan Rossi
Adobe Research
Hanieh Deilamsalehy
Adobe Research