🤖 AI Summary
This study addresses the limited accuracy of activity intensity classification from wrist-worn accelerometer data. It evaluates ActiNet, a hybrid framework that combines self-supervised deep learning with a hidden Markov model (HMM). Specifically, a modified 18-layer ResNet-V2 architecture is pre-trained in a self-supervised manner on unlabeled data, and its predicted label sequence is then smoothed by an HMM that models transitions between activity states. The extent to which pairing self-supervised models with HMMs improves classification had not previously been established. Evaluated on the CAPTURE-24 dataset with 5-fold stratified group cross-validation, the method achieves a mean macro F1 score of 0.82 and a mean Cohen’s kappa of 0.86, exceeding a random forest + HMM baseline (0.77 and 0.81, respectively), with consistent results across age and sex subgroups. These results support using the framework to extract activity intensity labels in large-scale epidemiological studies.
📝 Abstract
Reliable and accurate human activity recognition (HAR) models for passively collected wrist-accelerometer data are essential in large-scale epidemiological studies that investigate the association between physical activity and health outcomes. While self-supervised learning has generated considerable excitement as a way to improve HAR, the extent to which these models, coupled with hidden Markov models (HMMs), would tangibly improve classification performance remains unknown, as does the effect this may have on the predicted daily activity intensity compositions. Using data from 151 CAPTURE-24 participants, we trained the ActiNet model, a self-supervised, 18-layer, modified ResNet-V2 model, followed by HMM smoothing, to classify labels of activity intensity. The performance of this model, evaluated using 5-fold stratified group cross-validation, was then compared to a baseline random forest (RF) + HMM established in existing literature. Differences in performance and classification outputs were compared across subgroups of age and sex within the CAPTURE-24 population. The ActiNet model distinguished labels of activity intensity with a mean macro F1 score of 0.82 and a mean Cohen's kappa score of 0.86. This exceeded the performance of the RF + HMM, trained and validated on the same dataset, which had mean scores of 0.77 and 0.81, respectively. These results were consistent across subgroups of age and sex, and they encourage the use of ActiNet for extracting activity intensity labels from wrist-accelerometer data in future epidemiological studies.
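To illustrate what HMM smoothing does to a sequence of per-window predictions, the following is a minimal sketch in Python, not the ActiNet implementation. It assumes the classifier outputs class probabilities per window and treats them directly as emission scores; the function name `viterbi_smooth`, the two-state setup, and the transition and prior values are all hypothetical choices made for this example.

```python
import numpy as np


def viterbi_smooth(probs, transition, prior, eps=1e-12):
    """Most likely state sequence under a simple HMM (Viterbi decoding).

    probs:      (T, K) per-window class probabilities from a classifier,
                used here as emission scores (an assumption of this sketch).
    transition: (K, K) matrix, transition[i, j] = P(next = j | current = i).
    prior:      (K,) initial state distribution.
    """
    probs = np.asarray(probs, dtype=float)
    T, K = probs.shape
    log_e = np.log(probs + eps)
    log_t = np.log(np.asarray(transition, dtype=float) + eps)

    # score[k]: log-probability of the best path that ends in state k.
    score = np.log(np.asarray(prior, dtype=float) + eps) + log_e[0]
    back = np.zeros((T, K), dtype=int)

    for t in range(1, T):
        # cand[i, j]: best path ending in state i, then moving to state j.
        cand = score[:, None] + log_t
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(K)] + log_e[t]

    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


# Hypothetical two-state example (0 = sedentary, 1 = active) with one
# isolated, low-confidence "active" window in the middle.
probs = np.array([[0.9, 0.1], [0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.9, 0.1]])
transition = np.array([[0.9, 0.1], [0.1, 0.9]])  # states tend to persist
prior = np.array([0.5, 0.5])

raw = probs.argmax(axis=1)
smoothed = viterbi_smooth(probs, transition, prior)
print("raw:     ", raw.tolist())
print("smoothed:", smoothed.tolist())
```

In this toy sequence the raw argmax labels flip to "active" for a single window, while the decoded path stays "sedentary" throughout, because two unlikely transitions cost more than one weakly contradicting window. How the emission and transition probabilities are estimated in the actual pipeline is not stated in the abstract.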