StutterCut: Uncertainty-Guided Normalised Cut for Dysfluency Segmentation

📅 2025-08-04

🤖 AI Summary

Most existing dysfluency detection methods classify only at the utterance level, which limits their use for precise speech therapy and real-time feedback. This work proposes StutterCut, a semi-supervised framework that formulates dysfluency segmentation as a graph partitioning problem: speech embeddings from overlapping windows serve as graph nodes, and the graph is partitioned with a Normalised Cut. The connections between nodes are refined by a pseudo-oracle classifier trained on weak (utterance-level) labels, whose influence is controlled by an uncertainty measure from Monte Carlo dropout. The authors also extend the weakly labelled FluencyBank dataset with frame-level boundary annotations for four dysfluency types, giving a more realistic benchmark than synthetic datasets. In experiments on real and synthetic data, StutterCut outperforms existing methods, with higher F1 scores and more precise stuttering onset detection.

📝 Abstract

Detecting and segmenting dysfluencies is crucial for effective speech therapy and real-time feedback. However, most methods only classify dysfluencies at the utterance level. We introduce StutterCut, a semi-supervised framework that formulates dysfluency segmentation as a graph partitioning problem, where speech embeddings from overlapping windows are represented as graph nodes. We refine the connections between nodes using a pseudo-oracle classifier trained on weak (utterance-level) labels, with its influence controlled by an uncertainty measure from Monte Carlo dropout. Additionally, we extend the weakly labelled FluencyBank dataset by incorporating frame-level dysfluency boundaries for four dysfluency types. This provides a more realistic benchmark compared to synthetic datasets. Experiments on real and synthetic datasets show that StutterCut outperforms existing methods, achieving higher F1 scores and more precise stuttering onset detection.
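To make the graph formulation concrete, here is a minimal sketch of a two-way normalised cut over window embeddings. It is not the authors' implementation: the cosine affinity, the spectral relaxation (in the style of Shi and Malik), the zero threshold, and the function names `cosine_affinity` and `normalized_cut_bipartition` are all assumptions made for illustration.

```python
import numpy as np


def cosine_affinity(embeddings):
    """Non-negative cosine similarity between window embeddings (assumed affinity)."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    similarity = unit @ unit.T
    # Negative similarities are clipped so every edge weight is >= 0.
    return np.clip(similarity, 0.0, None)


def normalized_cut_bipartition(affinity):
    """Split graph nodes into two groups with the spectral relaxation of Ncut.

    Returns a boolean array with one entry per node (window).
    """
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalised Laplacian: I - D^{-1/2} W D^{-1/2}
    laplacian = (
        np.eye(len(affinity))
        - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    )
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    # The second-smallest eigenvector, rescaled by D^{-1/2}, relaxes the
    # discrete partition indicator; its sign gives the two groups.
    indicator = d_inv_sqrt * eigenvectors[:, 1]
    return indicator > 0
```

In this sketch each row of `embeddings` stands for one overlapping window, so a contiguous run of windows assigned to the same group can be read as a candidate segment. Which group corresponds to dysfluent speech is not determined by the cut itself, since the sign of the eigenvector is arbitrary.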

Problem

Research questions and friction points this paper is trying to address.

Detect and segment speech dysfluencies precisely

Improve dysfluency segmentation using graph partitioning

Enhance dataset with frame-level dysfluency boundaries

Innovation

Methods, ideas, or system contributions that make the work stand out.

Semi-supervised graph partitioning for dysfluency segmentation

Uncertainty-guided pseudo-oracle classifier refinement

Extended FluencyBank with frame-level dysfluency boundaries
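The uncertainty-guided refinement can be sketched as follows. The source does not give the exact uncertainty measure or the blending rule, so everything here is an assumption: `mc_probs` is taken to hold per-window dysfluency probabilities from several stochastic forward passes with dropout left active, uncertainty is the binary entropy of the mean prediction, and the hypothetical `refine_affinity` blends the embedding affinity with classifier agreement in proportion to the classifier's confidence.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy in bits of a Bernoulli probability; 1.0 at p = 0.5."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def refine_affinity(base_affinity, mc_probs):
    """Blend an embedding affinity with classifier agreement (illustrative rule).

    base_affinity: (N, N) edge weights between windows.
    mc_probs:      (T, N) probabilities from T Monte Carlo dropout passes.
    """
    mean_prob = mc_probs.mean(axis=0)
    # Confidence in [0, 1]: high when the averaged prediction is decisive.
    confidence = 1.0 - binary_entropy(mean_prob)
    # Windows with similar predicted probabilities get a strong edge.
    agreement = 1.0 - np.abs(mean_prob[:, None] - mean_prob[None, :])
    # An edge trusts the classifier only if both endpoints are confident.
    trust = np.outer(confidence, confidence)
    return (1.0 - trust) * base_affinity + trust * agreement
```

Under this rule, edges between windows the classifier is unsure about keep their original embedding-based weight, while confident predictions pull edges towards 1 (same predicted class) or 0 (different predicted classes).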
