🤖 AI Summary
To address the performance degradation that inter-class similarity, multimodal class distributions, and noise cause in time-series classification (TSC), this paper proposes CDNet, a novel framework integrating contrastive learning with diffusion models. Its core innovation is a cross-sample, cross-class contrastive diffusion mechanism: it leverages reverse diffusion to implicitly generate discriminative positive and negative samples, achieving implicit denoising and comprehensive mode coverage. An uncertainty-weighted composite loss function further improves training robustness. Built on a lightweight CNN backbone, CDNet delivers significant improvements over state-of-the-art methods on the UCR Archive and on synthetic multimodal and noisy datasets. It performs especially well under high inter-class similarity, strong noise contamination, and multimodal settings, validating its effectiveness and generalizability in modeling complex time-series distributions.
📄 Abstract
Deep learning models are widely used for time series classification (TSC) due to their scalability and efficiency. However, their performance degrades under challenging data conditions such as inter-class similarity, multimodal distributions, and noise. To address these limitations, we propose CDNet, a Contrastive Diffusion-based Network that enhances existing classifiers by generating informative positive and negative samples via a learned diffusion process. Unlike traditional diffusion models that denoise individual samples, CDNet learns transitions between samples, both within and across classes, through convolutional approximations of reverse diffusion steps. We introduce a theoretically grounded CNN-based mechanism to enable both denoising and mode coverage, and incorporate an uncertainty-weighted composite loss for robust training. Extensive experiments on the UCR Archive and simulated datasets demonstrate that CDNet significantly improves state-of-the-art (SOTA) deep learning classifiers, particularly under noisy, similar, and multimodal conditions.
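The abstract does not spell out the form of the uncertainty-weighted composite loss. A common formulation for combining multiple objectives (in the style of homoscedastic uncertainty weighting) scales each loss term L_k by exp(-s_k) and adds s_k as a regularizer, where s_k is a learnable log-variance. The sketch below illustrates that formulation only; the paper's exact weighting scheme, the names `uncertainty_weighted_loss`, `losses`, and `log_vars`, and the two-term example are assumptions for illustration.

```python
import numpy as np

def uncertainty_weighted_loss(losses, log_vars):
    """Combine per-objective losses with learnable log-variance weights.

    Each term is scaled by exp(-s_k) and penalized by +s_k, so training
    can automatically down-weight objectives it is more uncertain about.
    Illustrative formulation; the paper's composite loss may differ.
    """
    losses = np.asarray(losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-log_vars) * losses + log_vars))

# Example: a classification loss and a contrastive loss, with neutral
# initial weights (s_k = 0, i.e. exp(-s_k) = 1).
total = uncertainty_weighted_loss([0.9, 0.4], [0.0, 0.0])  # 0.9 + 0.4 = 1.3
```

In this formulation, a larger s_k shrinks the effective weight exp(-s_k) on a noisy objective while the +s_k term keeps the weights from collapsing to zero.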