🤖 AI Summary
To address the user discomfort of wearable sensors and the privacy concerns and degraded low-light and long-range performance of video-based human activity recognition (HAR), this paper proposes a contactless HAR framework built on a 60 GHz frequency-modulated continuous-wave (FMCW) radar. Rather than treating radar feature maps as images, it feeds multi-dimensional feature maps (Range-Doppler, Range-Azimuth, and Range-Elevation) directly into the models as data vectors, preserving the spatial and temporal structure of the data. Machine learning (SVM, MLP) and deep learning (CNN, LSTM, ConvLSTM) models are compared on a novel dataset with seven activity classes. The ConvLSTM model performs best: 90.51% accuracy (F1 = 87.31%) in cross-scene validation and 89.56% accuracy (F1 = 87.15%) in leave-one-person-out cross-validation. The results point to the approach's potential for scalable, non-intrusive, and privacy-preserving activity monitoring.
📝 Abstract
Human activity recognition (HAR) has gained significant attention due to its diverse applications, including ambient assisted living and remote sensing. Wearable sensor-based solutions often suffer from user discomfort and reliability issues, while video-based methods raise privacy concerns and perform poorly in low-light conditions or at long ranges. This study introduces a Frequency-Modulated Continuous Wave (FMCW) radar-based framework for human activity recognition, leveraging a 60 GHz radar and multi-dimensional feature maps. Unlike conventional approaches that process feature maps as images, this study feeds multi-dimensional feature maps (Range-Doppler, Range-Azimuth, and Range-Elevation) as data vectors directly into machine learning (SVM, MLP) and deep learning (CNN, LSTM, ConvLSTM) models, preserving the spatial and temporal structures of the data. These features were extracted from a novel dataset with seven activity classes, and the models were evaluated using two validation approaches: cross-scene validation and leave-one-person-out cross-validation. The ConvLSTM model outperformed the other machine learning and deep learning models, achieving an accuracy of 90.51% and an F1-score of 87.31% on cross-scene validation and an accuracy of 89.56% and an F1-score of 87.15% on leave-one-person-out cross-validation. The results highlight the approach's potential for scalable, non-intrusive, and privacy-preserving activity monitoring in real-world scenarios.
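For readers unfamiliar with leave-one-person-out cross-validation, the following is a minimal sketch of how such folds and the two reported metrics could be computed. It is not the authors' code. The subject identifiers, activity labels, and predictions are made-up toy values, the function names are hypothetical, and the F1-score is assumed here to be macro-averaged, which the abstract does not state.

```python
from collections import defaultdict


def leave_one_person_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i for sid, idxs in by_subject.items() if sid != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging; not stated in the abstract)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): three subjects, two samples each.
subjects = ["p1", "p1", "p2", "p2", "p3", "p3"]
y_true = ["walk", "sit", "walk", "sit", "walk", "fall"]
y_pred = ["walk", "sit", "walk", "walk", "walk", "sit"]

folds = list(leave_one_person_out(subjects))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)

print("accuracy:", round(accuracy(y_true, y_pred), 4))
print("macro F1:", round(macro_f1(y_true, y_pred), 4))
```

In a real evaluation, a model would be retrained on `train_idx` and scored on `test_idx` for every fold, so that no subject's data appears in both sets. Accuracy can exceed macro F1, as in the reported results, when some classes are recognized less reliably than others.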