NexusSum: Hierarchical LLM Agents for Long-Form Narrative Summarization

📅 2025-05-30
🤖 AI Summary
Existing large language models (LLMs) struggle to accurately model plot progression, character relationships, and thematic coherence in long-form narrative texts (e.g., novels, films, TV series). To address this, the paper proposes a hierarchical multi-agent LLM framework that requires no fine-tuning. The approach introduces a "dialogue-to-description" preprocessing step that normalizes narrative text, combined with hierarchical summarization scheduling and length-controllable generation. By orchestrating multiple LLMs, the framework performs chunk-level optimization and synthesizes summaries across granularities. Evaluated on book, film, and TV-series datasets, the method achieves up to a 30.0% improvement in BERTScore (F1) over state-of-the-art methods and demonstrates strong cross-domain generalization.

📝 Abstract
Summarizing long-form narratives--such as books, movies, and TV scripts--requires capturing intricate plotlines, character interactions, and thematic coherence, a task that remains challenging for existing LLMs. We introduce NexusSum, a multi-agent LLM framework for narrative summarization that processes long-form text through a structured, sequential pipeline--without requiring fine-tuning. Our approach introduces two key innovations: (1) Dialogue-to-Description Transformation: A narrative-specific preprocessing method that standardizes character dialogue and descriptive text into a unified format, improving coherence. (2) Hierarchical Multi-LLM Summarization: A structured summarization pipeline that optimizes chunk processing and controls output length for accurate, high-quality summaries. Our method establishes a new state-of-the-art in narrative summarization, achieving up to a 30.0% improvement in BERTScore (F1) across books, movies, and TV scripts. These results demonstrate the effectiveness of multi-agent LLMs in handling long-form content, offering a scalable approach for structured summarization in diverse storytelling domains.
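The second innovation, hierarchical multi-LLM summarization, follows a map-reduce pattern: split the narrative into chunks, summarize each chunk, merge the partial summaries, and repeat until the result fits a length budget. The sketch below illustrates that control flow only; `llm_summarize` is a hypothetical stand-in (simple truncation) for the actual LLM agent calls, and the function names and word budgets are assumptions, not the paper's API.

```python
def chunk(text, max_words=50):
    """Split text into word-bounded chunks (a crude stand-in for the
    paper's chunk-level optimization)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def llm_summarize(text, target_words):
    """Placeholder for a length-controlled LLM summarization call;
    here it just keeps the first target_words words so the pipeline
    is runnable without a model."""
    return " ".join(text.split()[:target_words])

def hierarchical_summarize(text, chunk_words=50, target_words=40):
    """Map-reduce hierarchical summarization: summarize chunks, merge
    the partials, and repeat until the merged text fits the budget."""
    while len(text.split()) > target_words:
        partials = [llm_summarize(c, target_words)
                    for c in chunk(text, chunk_words)]
        merged = " ".join(partials)
        if len(merged.split()) >= len(text.split()):
            # No progress this round; force one final compression pass.
            return llm_summarize(merged, target_words)
        text = merged
    return text
```

With a real LLM in place of the truncation stub, each reduction round would condense rather than clip, but the scheduling loop and length control work the same way.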
Problem

Research questions and friction points this paper is trying to address.

Summarizing long-form narratives with complex plots and characters
Standardizing dialogue and descriptive text for better coherence
Optimizing chunk processing for accurate, high-quality summaries
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical multi-agent LLM framework
Dialogue-to-description transformation preprocessing
Structured summarization pipeline optimization
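To make the dialogue-to-description idea concrete, the sketch below rewrites screenplay-style dialogue lines ("NAME: line") into descriptive prose so dialogue and narration share one format. The paper performs this transformation with an LLM; this regex version is only a runnable illustration of the target normalization, and the function and pattern names are hypothetical.

```python
import re

# Matches screenplay speaker tags like "JOHN:" or "DR. SMITH:".
DIALOGUE = re.compile(r"^([A-Z][A-Z .'-]*):\s*(.+)$")

def dialogue_to_description(script_lines):
    """Rewrite dialogue lines into descriptive sentences; narration
    passes through unchanged."""
    out = []
    for line in script_lines:
        m = DIALOGUE.match(line.strip())
        if m:
            speaker, speech = m.groups()
            out.append(f'{speaker.title()} says, "{speech}"')
        else:
            out.append(line.strip())
    return out
```

After this pass, the downstream summarizers see a single uniform prose stream instead of interleaved script formatting, which is what the preprocessing step is meant to guarantee.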