Speaker Recognition -- Wavelet Packet Based Multiresolution Feature Extraction Approach

📅 2025-12-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
To improve feature discriminability and noise robustness in text-independent speaker recognition, this paper proposes a hybrid feature extraction approach that combines Mel-frequency cepstral coefficients (MFCCs) with the wavelet packet transform (WPT). The hybrid features pair MFCC's model of human auditory perception with WPT's multi-resolution time-frequency decomposition. A Gaussian mixture model (GMM) serves as the classifier for speaker identification and a hidden Markov model (HMM) for speaker verification. The approach is evaluated on the VoxForge speech corpus and the CSTR US KED TIMIT database, including tests with a noise signal added at several signal-to-noise ratios (SNRs). The authors report improved results on both identification and verification.

📝 Abstract
This paper proposes a novel wavelet packet based feature extraction approach for text-independent speaker recognition. The features are extracted using a combination of Mel Frequency Cepstral Coefficients (MFCC) and the Wavelet Packet Transform (WPT). The hybrid feature technique combines the human-ear simulation offered by MFCC with the multi-resolution property and noise robustness of WPT. To validate the proposed approach for text-independent speaker identification and verification, we use the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM), respectively, as classifiers. The proposed paradigm is tested on the VoxForge speech corpus and the CSTR US KED TIMIT database. It is also evaluated after adding a standard noise signal at different SNR levels to assess noise robustness. Experimental results show that better results are achieved for both speaker identification and speaker verification.
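
The abstract does not say how the wavelet packet stage is built or how its output is combined with MFCC, so the following is only a minimal sketch of the multi-resolution idea, not the authors' pipeline. It assumes a Haar filter, a full three-level packet tree, and log sub-band energies as features; the function names are hypothetical.

```python
import math

def haar_split(x):
    """One Haar analysis step: return the (approximation, detail) halves."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_leaves(frame, level):
    """Full wavelet packet tree: every node is split, not only the approximation.

    Leaves come back in natural (filter-bank) order, which is not strictly
    ordered by frequency.
    """
    if len(frame) % (2 ** level) != 0:
        raise ValueError("frame length must be divisible by 2**level")
    nodes = [list(frame)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes

def subband_log_energies(frame, level=3, eps=1e-12):
    """Assumed feature: log energy of each leaf sub-band (2**level values)."""
    leaves = wavelet_packet_leaves(frame, level)
    return [math.log(sum(c * c for c in leaf) + eps) for leaf in leaves]
```

Because the Haar transform is orthonormal, the leaf energies sum to the energy of the input frame. A cepstral-style feature vector could then be obtained by applying a discrete cosine transform to these log energies, which is one plausible way to pair the sub-band view with an MFCC-like representation.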
Problem

Research questions and friction points this paper is trying to address.

Proposes wavelet packet features for speaker recognition
Combines MFCC and WPT for noise robustness
Tests approach with GMM and HMM classifiers
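
As an illustration of the GMM side of the classifier setup, the sketch below scores a sequence of feature vectors against per-speaker diagonal-covariance GMMs and picks the speaker with the highest average log-likelihood. This is the standard GMM identification rule, not necessarily the exact configuration used in the paper; the model parameters are assumed to be trained already, and all names are hypothetical. HMM-based verification is not covered here.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as [(weight, mean, var), ...]."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    # Log-sum-exp keeps the mixture sum numerically stable.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def identify_speaker(frames, speaker_models):
    """Return (best_speaker, scores) using average per-frame log-likelihood."""
    scores = {
        name: sum(gmm_log_likelihood(f, gmm) for f in frames) / len(frames)
        for name, gmm in speaker_models.items()
    }
    return max(scores, key=scores.get), scores
```

Here `frames` would be the hybrid feature vectors of a test utterance, and `speaker_models` maps each enrolled speaker to that speaker's mixture parameters.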
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines MFCC and wavelet packet transform features
Uses GMM and HMM classifiers for speaker tasks
Evaluates noise robustness with SNR-based testing
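
SNR-based testing of this kind usually means scaling a noise signal so that the mixture reaches a chosen signal-to-noise ratio. The sketch below shows that scaling, using SNR(dB) = 10 log10(P_signal / P_noise). The summary does not name the noise source, so the Gaussian white noise in the usage lines is an assumed stand-in, and the function names are hypothetical.

```python
import math
import random

def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def add_noise_at_snr(signal, noise, snr_db):
    """Mix noise into signal so that the result has the requested SNR in dB."""
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as the signal")
    noise = noise[: len(signal)]
    # Required noise power follows from SNR = 10 * log10(P_signal / P_noise).
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    return [s + scale * n for s, n in zip(signal, noise)]

# Usage: corrupt a synthetic tone at several SNR levels.
random.seed(0)
clean = [math.sin(0.05 * i) for i in range(2000)]
white = [random.gauss(0.0, 1.0) for _ in range(2000)]
noisy_versions = {snr: add_noise_at_snr(clean, white, snr) for snr in (20, 10, 0)}
```

Each entry of `noisy_versions` can then be passed through the same feature extraction and classification steps to see how accuracy degrades as the SNR drops.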