📝 Abstract
Large Language Models (LLMs) increasingly rely on external web content. However, much of this content is not easily digestible by LLMs due to LLM-unfriendly formats and context-length limitations. To address this issue, we propose a method for generating general-purpose, information-dense summaries that act as plain-text repositories of web content. Inspired by Hegel's dialectical method, our approach, denoted Chain of Summaries (CoS), iteratively refines an initial summary (thesis) by identifying its limitations through questioning (antithesis), leading to a general-purpose summary (synthesis) that satisfies current and anticipates future information needs. Experiments on the TriviaQA, TruthfulQA, and SQuAD datasets demonstrate that CoS outperforms zero-shot LLM baselines by up to 66% and specialized summarization methods such as BRIO and PEGASUS by up to 27%. CoS-generated summaries yield higher Q&A performance than the source content while requiring substantially fewer tokens, and are agnostic to the specific downstream LLM. CoS thus represents an appealing option for website maintainers to make their content more accessible to LLMs while retaining possibilities for human oversight.
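The thesis–antithesis–synthesis loop described in the abstract can be sketched as the control flow below. This is a minimal illustration, not the paper's implementation: the `draft`, `critique`, and `revise` callables are hypothetical stand-ins for the LLM prompts that would produce the initial summary, surface unanswered questions, and refine the summary, respectively.

```python
from typing import Callable, List

def chain_of_summaries(
    source: str,
    draft: Callable[[str], str],                   # thesis: initial summary of the source
    critique: Callable[[str, str], List[str]],     # antithesis: questions the summary cannot answer
    revise: Callable[[str, str, List[str]], str],  # synthesis: refine summary to cover the gaps
    max_rounds: int = 3,
) -> str:
    """Iteratively refine a summary until critique finds no unanswered questions
    or the round budget is exhausted."""
    summary = draft(source)
    for _ in range(max_rounds):
        gaps = critique(source, summary)
        if not gaps:          # synthesis reached: no remaining information needs
            break
        summary = revise(source, summary, gaps)
    return summary
```

In a real deployment each callable would wrap an LLM call; the loop structure itself is what makes the summary converge toward covering both explicit and anticipated information needs.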