SegReConcat: A Data Augmentation Method for Voice Anonymization Attack

📅 2025-08-26
🤖 AI Summary
Speech anonymization often retains residual speaker cues, posing privacy risks. To address insufficient de-anonymization attack capability, this paper proposes SegReConcat: a data augmentation method that segments anonymized speech at the word level, rearranges the segments using random or similarity-based strategies, and concatenates the rearranged sequence with the original utterance, so that the attacker can learn source speaker traits from multiple perspectives. This paradigm strengthens speaker feature learning without modifying either the anonymization system or the attacker's architecture. Evaluated within the VoicePrivacy Attacker Challenge 2024 framework, SegReConcat improves de-anonymization performance on five of seven anonymization systems. The approach offers a novel perspective for evaluating speech privacy robustness.

📝 Abstract
Voice anonymization seeks to conceal the identity of the speaker while maintaining the utility of speech data. However, residual speaker cues often persist, posing privacy risks. We propose SegReConcat, a data augmentation method for attacker-side enhancement of automatic speaker verification systems. SegReConcat segments anonymized speech at the word level, rearranges the segments using random or similarity-based strategies to disrupt long-term contextual cues, and concatenates them with the original utterance, allowing an attacker to learn source speaker traits from multiple perspectives. The proposed method was evaluated in the VoicePrivacy Attacker Challenge 2024 framework across seven anonymization systems, and SegReConcat improves de-anonymization on five of them.
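
The random variant of the augmentation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `seg_re_concat`, the representation of audio as a plain list of samples, the `word_bounds` input (assumed to come from some external word aligner), and the choice to place the original utterance before the rearranged copy are all assumptions made for the example.

```python
import random


def seg_re_concat(samples, word_bounds, seed=0):
    """Append a word-shuffled copy of an utterance to the original.

    samples:     sequence of audio samples (anonymized speech).
    word_bounds: list of (start, end) sample indices, one pair per word.
                 Hypothetical input; the source does not say how word
                 boundaries are obtained.
    seed:        seed for reproducible shuffling.
    """
    # Cut the utterance into word-level segments.
    segments = [list(samples[start:end]) for start, end in word_bounds]

    # Random rearrangement strategy: shuffle the segment order.
    rng = random.Random(seed)
    rng.shuffle(segments)

    # Flatten the shuffled segments back into one sample sequence.
    rearranged = [x for seg in segments for x in seg]

    # Concatenate with the original utterance (order is an assumption).
    return list(samples) + rearranged
```

The output is twice as long as the input when the word boundaries cover the whole utterance: the first half is the untouched original, the second half contains the same samples in shuffled word order.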
Problem

Research questions and friction points this paper is trying to address.

Residual speaker cues persist in anonymized speech and pose privacy risks
Attacker-side speaker verification systems need enhancement to exploit those cues
Long-term contextual cues must be disrupted so the attacker can learn source speaker traits
Innovation

Methods, ideas, or system contributions that make the work stand out.

Segments anonymized speech at word level
Rearranges segments using random or similarity strategies
Concatenates the rearranged segments with the original utterance for multi-perspective learning
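
The source does not specify how the similarity-based rearrangement works. As one possible reading, the sketch below orders segments greedily so that each segment is followed by the remaining segment most similar to it. Everything here is an assumption for illustration: the per-segment `embeddings` (hypothetical feature vectors, one per word segment), the use of cosine similarity, and the greedy chaining rule.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_order(embeddings, start=0):
    """Return a segment ordering built by greedy nearest-neighbor chaining.

    embeddings: list of feature vectors, one per word segment
                (hypothetical; not defined in the source).
    start:      index of the segment that begins the chain.
    """
    if not embeddings:
        return []
    order = [start]
    remaining = set(range(len(embeddings))) - {start}
    while remaining:
        last = embeddings[order[-1]]
        # Pick the most similar unused segment; ties go to the lowest index.
        nxt = max(sorted(remaining), key=lambda i: cosine(last, embeddings[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

The returned index list can replace the random shuffle in the augmentation: the segments are reordered according to it and then concatenated with the original utterance.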