Reference-free Adversarial Sex Obfuscation in Speech

📅 2025-08-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Sex conversion in speech carries privacy risks, and its outputs often retain sex-specific acoustic cues, particularly in reference-free settings. To address this, we propose RASO, a reference-free adversarial sex obfuscation framework: a sex-conditional adversarial learning architecture disentangles linguistic content from sex-related acoustic markers, while explicit regularization aligns fundamental frequency distributions and formant trajectories with sex-neutral characteristics learned from sex-balanced training data. Crucially, the method removes sex cues without requiring target-speaker references, while preserving speech intelligibility and naturalness. Experiments under a semi-informed attack model show that the approach significantly outperforms existing methods, reducing sex identification accuracy by over 40% while achieving a Mean Opinion Score (MOS) of 4.1 for speech quality. The work thus achieves a strong trade-off between privacy protection and high-fidelity speech reconstruction.

📝 Abstract

Sex conversion in speech involves privacy risks from data collection and often leaves residual sex-specific cues in outputs, even when target speaker references are unavailable. We introduce RASO (Reference-free Adversarial Sex Obfuscation). Its innovations include a sex-conditional adversarial learning framework that disentangles linguistic content from sex-related acoustic markers, and explicit regularisation that aligns fundamental frequency distributions and formant trajectories with sex-neutral characteristics learned from sex-balanced training data. RASO preserves linguistic content and, even when assessed under a semi-informed attack model, significantly outperforms a competing approach to sex obfuscation.
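The abstract does not give the form of the fundamental frequency regulariser, so the following is only a minimal sketch of one way such an alignment term could look, not RASO's actual loss. It assumes a simple moment-matching penalty on log-F0 over voiced frames, with the sex-neutral target taken as the equal-weight average of per-group statistics from balanced data. All function names (`log_f0_stats`, `neutral_stats_from_balanced`, `f0_alignment_penalty`) are hypothetical.

```python
import math
from statistics import fmean, pstdev


def log_f0_stats(f0_hz):
    """Return (mean, std) of log-F0 over voiced frames (F0 > 0)."""
    voiced = [math.log(f) for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced frames")
    return fmean(voiced), pstdev(voiced)


def neutral_stats_from_balanced(groups):
    """Assumed 'neutral' target: average the per-group log-F0 statistics
    so that each group contributes equally, regardless of its size."""
    stats = [log_f0_stats(group) for group in groups]
    neutral_mean = fmean([mean for mean, _ in stats])
    neutral_std = fmean([std for _, std in stats])
    return neutral_mean, neutral_std


def f0_alignment_penalty(f0_hz, neutral_mean, neutral_std):
    """Squared distance between an utterance's log-F0 moments and the
    neutral target; zero when both moments match exactly."""
    mean, std = log_f0_stats(f0_hz)
    return (mean - neutral_mean) ** 2 + (std - neutral_std) ** 2


if __name__ == "__main__":
    # Toy F0 tracks in Hz; 0.0 marks an unvoiced frame.
    low = [100.0, 110.0, 0.0, 120.0]
    high = [200.0, 220.0, 0.0, 240.0]
    target = neutral_stats_from_balanced([low, high])
    print(f0_alignment_penalty(low, *target))
    print(f0_alignment_penalty(high, *target))
```

In a training setup, a term of this kind would presumably be added to the reconstruction and adversarial losses with a weighting coefficient. A formant-trajectory term would need an analogous statistic, which this sketch does not attempt.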

Problem

Research questions and friction points this paper is trying to address.

Eliminate residual sex-specific cues in speech conversion

Disentangle linguistic content from sex-related acoustic markers

Align speech features with sex-neutral characteristics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sex-conditional adversarial learning framework

Explicit regularization aligning fundamental frequency distributions and formant trajectories

Sex-neutral characteristics from balanced data
