From Speech to Summary: A Comprehensive Survey of Speech Summarization

📅 2025-04-10

🤖 AI Summary

Problem: Speech summarization lacks a well-defined task formulation, clear domain boundaries, and systematic survey coverage. Method: This work formally defines the task scope, clarifies its distinctions from and relationships to automatic speech recognition (ASR) and text summarization, traces its technical evolution from ASR post-processing and cascaded fine-tuning to end-to-end joint modeling, and establishes a cross-task evaluation framework integrating metrics (e.g., ROUGE, BERTScore) and benchmarks (e.g., AMI, ICSI, SummSpeech). Contribution/Results: It presents the first comprehensive, full-stack survey of speech summarization, unifying definitions, methodologies, evaluation protocols, and datasets. This framework provides a principled foundation for algorithm design, benchmark development, and real-world deployment.
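To make the evaluation side concrete, here is a minimal sketch of ROUGE-1, the unigram-overlap variant of the ROUGE metric named above. It is not taken from the survey. The function name `rouge1`, the lowercase whitespace tokenization, and the example sentences are all illustrative assumptions, and published ROUGE implementations typically add stemming and more careful tokenization.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two texts.

    Assumption: tokens are lowercase whitespace-separated words,
    with no stemming or punctuation handling.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token,
    # so a repeated word is only credited as often as it appears in both.
    overlap = sum((cand & ref).values())

    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical system summary vs. hypothetical reference summary.
scores = rouge1(
    "the team agreed to ship the update",
    "the team decided to ship the update on friday",
)
print(scores)
```

For this pair, 6 of the 7 candidate tokens match the 9-token reference, giving precision 6/7, recall 6/9, and an F1 of 0.75. Because the score only counts surface word overlap, embedding-based metrics such as BERTScore are often reported alongside it.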

📝 Abstract

Speech summarization has become an essential tool for efficiently managing and accessing the growing volume of spoken and audiovisual content. However, despite its increasing importance, speech summarization is still not clearly defined and intersects with several research areas, including speech recognition, text summarization, and specific applications like meeting summarization. This survey examines existing datasets and evaluation methodologies, which are crucial for assessing the effectiveness of summarization approaches. It also synthesizes recent developments in the field, highlighting the shift from traditional systems to advanced models like fine-tuned cascaded architectures and end-to-end solutions.

Problem

Research questions and friction points this paper is trying to address.

Clarifying the ill-defined boundaries of speech summarization research

Examining the datasets and evaluation methodologies used to assess speech summarization approaches

Exploring advanced models for speech-to-summary conversion

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned cascaded architectures for summarization

End-to-end speech summarization solutions

Integration of speech recognition and text summarization
