AI Summary
To address overfitting and poor generalization in deep learning-based sEMG gesture recognition caused by scarce training data, this paper proposes a sparse-aware, semantic-guided diffusion augmentation method. The approach incorporates fine-grained semantic representations as conditional inputs, employs Gaussian modeling of the semantic space for controllable sampling, and introduces a sparse-aware strategy to actively explore low-density regions, thereby enhancing the diversity, fidelity, and practical utility of generated samples. Compared with conventional augmentation techniques, our method significantly mitigates overfitting on the Ninapro DB2, DB4, and DB7 datasets, achieving average accuracy improvements of 3.2–5.7% and superior cross-subject and cross-session generalization. Key innovations include: (1) a semantic-guided diffusion modeling framework; (2) a Gaussian-based semantic sampling mechanism; and (3) a sparsity-driven data coverage optimization strategy.
Abstract
Surface electromyography (sEMG)-based gesture recognition plays a critical role in human-machine interaction (HMI), particularly for rehabilitation and prosthetic control. However, sEMG-based systems often suffer from the scarcity of informative training data, leading to overfitting and poor generalization in deep learning models. Data augmentation offers a promising approach to increasing the size and diversity of training data, and faithfulness and diversity are the two factors critical to its effectiveness. However, promoting untargeted diversity can result in redundant samples with limited utility. To address these challenges, we propose a novel diffusion-based data augmentation approach, Sparse-Aware Semantic-Guided Diffusion Augmentation (SASG-DA). To enhance generation faithfulness, we introduce the Semantic Representation Guidance (SRG) mechanism, which leverages fine-grained, task-aware semantic representations as generation conditions. To enable flexible and diverse sample generation, we propose a Gaussian Modeling Semantic Sampling (GMSS) strategy, which models the semantic representation distribution and allows stochastic sampling to produce both faithful and diverse samples. To enhance targeted diversity, we further introduce a Sparse-Aware Semantic Sampling (SASS) strategy to explicitly explore underrepresented regions, improving distribution coverage and sample utility. Extensive experiments on the benchmark sEMG datasets Ninapro DB2, DB4, and DB7 demonstrate that SASG-DA significantly outperforms existing augmentation methods. Overall, our proposed data augmentation approach effectively mitigates overfitting and improves recognition performance and generalization by offering both faithful and diverse samples.
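The two sampling ideas in the abstract (fit a Gaussian to semantic representations, then favor draws that land in underrepresented regions) can be illustrated with a small sketch. This is not the SASG-DA implementation: the abstract does not say how the Gaussian is parameterized or how sparsity is measured, so the diagonal covariance, the k-nearest-neighbor distance used as a sparsity score, and all function names and parameters below are assumptions made for illustration. The diffusion model that would consume the sampled representations as conditions is not shown.

```python
import math
import random


def fit_diagonal_gaussian(vectors):
    """Per-dimension mean and standard deviation (diagonal covariance is an assumption)."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [
        sum((v[d] - mean[d]) ** 2 for v in vectors) / max(n - 1, 1)
        for d in range(dim)
    ]
    return mean, [math.sqrt(x) for x in var]


def knn_distance(point, vectors, k):
    """Mean Euclidean distance from `point` to its k nearest neighbors in `vectors`."""
    dists = sorted(math.dist(point, v) for v in vectors)
    return sum(dists[:k]) / k


def sparse_aware_sample(vectors, n_samples, n_candidates=200, k=3, seed=0):
    """Draw Gaussian candidates, then resample them in favor of sparse regions.

    The k-NN distance stands in for "low density" here (an assumption):
    a candidate far from the observed representations gets a larger weight.
    """
    rng = random.Random(seed)
    mean, std = fit_diagonal_gaussian(vectors)
    # Stochastic sampling from the fitted Gaussian (GMSS-style step).
    candidates = [
        [rng.gauss(m, s) for m, s in zip(mean, std)]
        for _ in range(n_candidates)
    ]
    # Sparsity-weighted selection, with replacement (SASS-style step).
    weights = [knn_distance(c, vectors, k) for c in candidates]
    return rng.choices(candidates, weights=weights, k=n_samples)


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Hypothetical 2-D semantic representations for a single gesture class.
    reps = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(50)]
    for vec in sparse_aware_sample(reps, n_samples=5):
        print([round(x, 3) for x in vec])
```

In this sketch, plain Gaussian sampling alone would concentrate new conditions near the class mean, where real data is already dense. Weighting candidates by their distance to existing representations shifts the selection toward the edges of the class distribution, which is one simple way to read the "targeted diversity" the abstract describes.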