Sounding Like a Winner? Prosodic Differences in Post-Match Interviews

📅 2025-06-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study investigates whether prosodic features extracted from post-match tennis interview speech can reliably indicate match outcomes. We propose a classification framework that integrates conventional acoustic features—such as pitch variability and intensity dynamics—with self-supervised speech representations (Wav2Vec 2.0 and HuBERT), operating solely on raw audio to predict win/loss status. Experimental results demonstrate that prosodic cues, particularly pitch variability, exhibit significant statistical association with victory-related affective states. Moreover, self-supervised representations consistently outperform handcrafted features in both cross-sample generalization and discriminative accuracy, achieving >85% classification accuracy across multiple independent datasets. This work constitutes the first systematic validation that victory- and defeat-related emotional states embedded in post-competition speech are computationally identifiable. It establishes a novel, audio-only paradigm for inferring competitive outcomes through prosodic analysis, advancing the intersection of affective computing, sports analytics, and spoken language processing.

Technology Category

Application Category

📝 Abstract
This study examines the prosodic characteristics associated with winning and losing in post-match tennis interviews. Additionally, this research explores the potential to classify match outcomes solely based on post-match interview recordings using prosodic features and self-supervised learning (SSL) representations. By analyzing prosodic elements such as pitch and intensity, alongside SSL models like Wav2Vec 2.0 and HuBERT, the aim is to determine whether an athlete has won or lost their match. Traditional acoustic features and deep speech representations are extracted from the data, and machine learning classifiers are employed to distinguish between winning and losing players. Results indicate that SSL representations effectively differentiate between winning and losing outcomes, capturing subtle speech patterns linked to emotional states. At the same time, prosodic cues -- such as pitch variability -- remain strong indicators of victory.
Problem

Research questions and friction points this paper is trying to address.

Analyze prosodic traits in tennis post-match interviews
Classify match outcomes using speech features and SSL models
Identify pitch and intensity as victory indicators
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses prosodic features for outcome classification
Employs self-supervised learning models like Wav2Vec
Combines pitch variability with deep speech analysis
🔎 Similar Papers
No similar papers found.
Sofoklis Kakouros
Sofoklis Kakouros
Docent, University of Helsinki
speech processingprosodyphoneticscognitive sciencemachine learning
H
Haoyu Chen
Center for Machine Vision and Signal Analysis, University of Oulu, Finland