🤖 AI Summary
Clustering massive unlabeled time-series data from the IoT remains challenging because existing methods struggle to model temporal structure while jointly optimizing representation learning and clustering. To address this, the authors propose the fuzzy cluster-aware contrastive clustering (FCACC) framework. FCACC introduces a novel three-view temporal augmentation strategy and a cluster-aware mechanism that dynamically generates hard negative samples; it is presented as the first method to embed the soft cluster structure of fuzzy C-means into the contrastive learning objective as training proceeds, so that the evolving clustering guides representation learning. The method tightly integrates fuzzy clustering, contrastive learning, multi-view augmentation, and deep representation learning. Evaluated on 40 standard benchmark datasets, FCACC consistently outperforms eight state-of-the-art baselines, achieving average improvements of 5.2% in clustering accuracy and 6.8% in normalized mutual information.
📝 Abstract
The rapid growth of unlabeled time series data, driven by the Internet of Things (IoT), poses significant challenges in uncovering underlying patterns. Traditional unsupervised clustering methods often fail to capture the complex nature of time series data. Recent deep learning-based clustering approaches, while effective, still suffer from insufficient representation learning and weak integration of the clustering objective. To address these issues, we propose a fuzzy cluster-aware contrastive clustering framework (FCACC) that jointly optimizes representation learning and clustering. Our approach introduces a novel three-view data augmentation strategy to enhance feature extraction by leveraging various characteristics of time series data. Additionally, we propose a cluster-aware hard negative sample generation mechanism that dynamically constructs high-quality negative samples using clustering structure information, thereby improving the model's discriminative ability. By leveraging fuzzy clustering, FCACC dynamically generates cluster structures to guide the contrastive learning process, resulting in more accurate clustering. Extensive experiments on 40 benchmark datasets show that FCACC outperforms the eight selected baseline methods, providing an effective solution for unsupervised time series learning.
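The idea of letting fuzzy cluster structure guide contrastive learning can be illustrated with a minimal NumPy sketch. The membership and center updates below are the standard fuzzy C-means formulas. The loss is only an assumption made for illustration: it is an InfoNCE-style objective in which each negative pair is weighted by `1 - u_i · u_j`, so pairs that likely share a cluster are down-weighted. The abstract does not specify FCACC's actual loss, its three augmentation views, or how hard negatives are generated, so the function names, the weighting rule, and the `temperature` parameter here are all hypothetical.

```python
import numpy as np


def fuzzy_memberships(z, centers, m=2.0, eps=1e-12):
    """Standard fuzzy C-means membership update.

    z: (n, d) embeddings, centers: (c, d) cluster centers, m > 1: fuzzifier.
    Returns an (n, c) matrix whose rows sum to 1.
    """
    # Squared Euclidean distance from every embedding to every center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1) + eps
    # u_ik is proportional to d_ik^(-2/(m-1)); on squared distances the
    # exponent becomes -1/(m-1).
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def update_centers(z, u, m=2.0):
    """Standard fuzzy C-means center update: means weighted by u^m."""
    w = u ** m  # (n, c)
    return (w.T @ z) / w.sum(axis=0)[:, None]


def cluster_aware_contrastive_loss(z1, z2, u, temperature=0.5):
    """Hypothetical cluster-aware InfoNCE-style loss (NOT the paper's loss).

    z1, z2: (n, d) embeddings of two augmented views of the same n series.
    u: (n, c) fuzzy memberships. A negative pair (i, j) gets weight
    1 - u_i . u_j, so likely same-cluster pairs contribute less.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = np.exp(z1 @ z2.T / temperature)  # (n, n) exponentiated cosine similarity
    pos = np.diag(sim)                     # each sample against its own other view
    neg_w = 1.0 - u @ u.T                  # in [0, 1] because rows of u sum to 1
    np.fill_diagonal(neg_w, 0.0)           # positives are not negatives
    neg = (neg_w * sim).sum(axis=1)
    return float(np.mean(-np.log(pos / (pos + neg))))
```

In a training loop along these lines, one would alternate between recomputing `u` and the centers from the current embeddings and taking a gradient step on the loss, which is one plausible reading of "dynamically generates cluster structures to guide the contrastive learning process"; a real implementation would use an autodiff framework rather than plain NumPy.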