From Speech to Summary: A Comprehensive Survey of Speech Summarization

📅 2025-04-10
🤖 AI Summary
Problem: Speech summarization lacks a well-defined task formulation, clear domain boundaries, and systematic survey coverage. Method: This work formally defines the task scope, clarifies its distinctions from and relationships to automatic speech recognition (ASR) and text summarization, traces its technical evolution (from ASR post-processing and cascaded fine-tuning to end-to-end joint modeling), and establishes a cross-task evaluation framework integrating metrics (e.g., ROUGE, BERTScore) and benchmarks (e.g., AMI, ICSI, SummSpeech). Contribution/Results: It presents the first comprehensive, full-stack survey of speech summarization, unifying definitions, methodologies, evaluation protocols, and datasets. This framework provides a principled foundation for algorithm design, benchmark development, and real-world deployment.

📝 Abstract
Speech summarization has become an essential tool for efficiently managing and accessing the growing volume of spoken and audiovisual content. However, despite its increasing importance, speech summarization is still not clearly defined and intersects with several research areas, including speech recognition, text summarization, and specific applications like meeting summarization. This survey examines existing datasets and evaluation methodologies, which are crucial for assessing the effectiveness of summarization approaches, and synthesizes recent developments in the field, highlighting the shift from traditional systems to advanced models such as fine-tuned cascaded architectures and end-to-end solutions.
Problem

Research questions and friction points this paper is trying to address.

Defining unclear boundaries of speech summarization research
Evaluating datasets and methods for speech summarization effectiveness
Exploring advanced models for speech-to-summary conversion
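
ROUGE is one of the metrics named in the AI summary for evaluating summaries. As a rough illustration of what such a score measures, the sketch below computes unigram-overlap ROUGE-1 between a reference summary and a candidate. It is a simplified version under stated assumptions: whitespace tokenization, lowercasing, no stemming or stopword handling. The function name `rouge1` is made up for this example and is not taken from the survey or from any evaluation toolkit.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two texts.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((ref_counts & cand_counts).values())

    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1(
    "the meeting covered budget plans",
    "the meeting covered hiring",
)
print(scores)
```

In this toy pair, three of the four candidate words appear in the five-word reference, so precision is 0.75 and recall is 0.6. Published results normally rely on an established ROUGE implementation rather than a hand-rolled one like this.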
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned cascaded architectures for summarization
End-to-end speech summarization solutions
Integration of speech recognition and text summarization
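
The cascaded design mentioned above chains a speech recognizer into a text summarizer, whereas an end-to-end system maps audio to a summary in one model. The sketch below shows only the composition pattern. `toy_asr` and `toy_summarizer` are hypothetical stand-ins invented for this example (the "audio" is just UTF-8 bytes, and the summarizer keeps the leading sentence); they do not represent any model discussed in the survey.

```python
from typing import Callable


def make_cascade(
    asr: Callable[[bytes], str],
    summarize: Callable[[str], str],
) -> Callable[[bytes], str]:
    """Compose an ASR stage and a text summarizer into one audio-to-summary function."""

    def pipeline(audio: bytes) -> str:
        transcript = asr(audio)       # stage 1: speech -> text
        return summarize(transcript)  # stage 2: text -> summary

    return pipeline


def toy_asr(audio: bytes) -> str:
    # Hypothetical stand-in: pretend the audio bytes already encode the transcript.
    return audio.decode("utf-8")


def toy_summarizer(transcript: str, max_sentences: int = 1) -> str:
    # Hypothetical stand-in: keep the first sentence(s) as an extractive baseline.
    sentences = [s.strip() for s in transcript.split(".") if s.strip()]
    if not sentences:
        return ""
    return ". ".join(sentences[:max_sentences]) + "."


cascade = make_cascade(toy_asr, toy_summarizer)
print(cascade(b"Budget approved. Hiring postponed."))
```

Because the pipeline only depends on the two callables, an end-to-end model would fit the same interface as a single `Callable[[bytes], str]`, which makes it straightforward to compare both designs on the same inputs.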