📝 Abstract
Large Language Models (LLMs) increasingly rely on external web content. However, much of this content is not easily digestible by LLMs due to LLM-unfriendly formats and context-length limitations. To address this issue, we propose a method for generating general-purpose, information-dense summaries that act as plain-text repositories of web content. Inspired by Hegel's dialectical method, our approach, denoted Chain of Summaries (CoS), iteratively refines an initial summary (thesis) by identifying its limitations through questioning (antithesis), leading to a general-purpose summary (synthesis) that satisfies current and anticipates future information needs. Experiments on the TriviaQA, TruthfulQA, and SQuAD datasets demonstrate that CoS outperforms zero-shot LLM baselines by up to 66% and specialized summarization methods such as BRIO and PEGASUS by up to 27%. CoS-generated summaries yield higher Q&A performance than the source content while requiring substantially fewer tokens, and are agnostic to the specific downstream LLM. CoS thus represents an appealing option for website maintainers to make their content more accessible to LLMs while retaining possibilities for human oversight.
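The thesis–antithesis–synthesis loop described in the abstract can be sketched as the control flow below. This is a minimal illustration, not the paper's implementation: the `draft`, `critique`, and `revise` callables are hypothetical stand-ins for the LLM prompts that would produce the initial summary, surface unanswered questions, and refine the summary, respectively.

```python
from typing import Callable, List

def chain_of_summaries(
    source: str,
    draft: Callable[[str], str],                   # thesis: initial summary of the source
    critique: Callable[[str, str], List[str]],     # antithesis: questions the summary cannot answer
    revise: Callable[[str, str, List[str]], str],  # synthesis: refine summary to cover the gaps
    max_rounds: int = 3,
) -> str:
    """Iteratively refine a summary until critique finds no unanswered questions
    or the round budget is exhausted."""
    summary = draft(source)
    for _ in range(max_rounds):
        gaps = critique(source, summary)
        if not gaps:          # synthesis reached: no remaining information needs
            break
        summary = revise(source, summary, gaps)
    return summary
```

In a real deployment each callable would wrap an LLM call; the loop structure itself is what makes the summary converge toward covering both explicit and anticipated information needs.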