🤖 AI Summary
Prior work on human activity recognition (HAR) in medical settings emphasizes model architecture while neglecting the interaction between activation functions and optimizers. Method: This study systematically investigates twelve activation–optimizer combinations (ReLU/Sigmoid/Tanh × SGD/Adam/RMSprop/Adagrad) on six medically relevant activity classes, using BiLSTM and ConvLSTM architectures evaluated on subsets of HMDB51 and UCF101. Results: ConvLSTM consistently outperforms BiLSTM on both datasets and, with Adam or RMSprop, achieves up to 99.00% accuracy; BiLSTM approaches 98.00% on UCF101 but drops to approximately 60.00% on HMDB51, indicating limited robustness across datasets. The study offers practical guidance on choosing architecture, activation function, and optimizer together for healthcare-oriented HAR.
📝 Abstract
Human Activity Recognition (HAR) plays a vital role in healthcare, surveillance, and smart environments, where reliable action recognition supports timely decision-making and automation. Although deep learning-based HAR systems are widely adopted, the impact of Activation Functions (AFs) and Model Optimizers (MOs) on performance has not been sufficiently analyzed, particularly regarding how their combinations influence model behavior in practical scenarios. Most existing studies focus on architecture design, while the interaction between AF and MO choices remains relatively unexplored. In this work, we investigate the effect of three commonly used activation functions (ReLU, Sigmoid, and Tanh) combined with four optimization algorithms (SGD, Adam, RMSprop, and Adagrad) using two recurrent deep learning architectures, namely BiLSTM and ConvLSTM. Experiments are conducted on six medically relevant activity classes selected from the HMDB51 and UCF101 datasets, considering their suitability for healthcare-oriented HAR applications. Our experimental results show that ConvLSTM consistently outperforms BiLSTM across both datasets. ConvLSTM, combined with Adam or RMSprop, achieves an accuracy of up to 99.00%, demonstrating strong spatio-temporal learning capabilities and stable performance. While BiLSTM performs reasonably well on UCF101, with accuracy approaching 98.00%, its performance drops to approximately 60.00% on HMDB51, indicating limited robustness across datasets and weaker sensitivity to AF and MO variations. This study provides practical insights for optimizing HAR systems, particularly for real-world healthcare environments where fast and precise activity detection is critical.
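The size of the search space described in the abstract can be made concrete with a short enumeration. The sketch below is illustrative only, not the authors' code: it assumes every activation–optimizer pairing is run for each architecture on each dataset, which the abstract implies but does not state outright. The lowercase identifiers, the `experiment_grid` helper, and the dictionary keys are hypothetical names chosen for this example.

```python
from itertools import product

# Factors named in the abstract; the lowercase spellings are an assumption
# chosen for illustration, not identifiers taken from the paper.
ACTIVATIONS = ["relu", "sigmoid", "tanh"]
OPTIMIZERS = ["sgd", "adam", "rmsprop", "adagrad"]
ARCHITECTURES = ["BiLSTM", "ConvLSTM"]
DATASETS = ["HMDB51", "UCF101"]


def experiment_grid():
    """Return one config per (architecture, dataset, activation, optimizer).

    Assumes a full factorial design (hypothetical; the abstract does not
    confirm that every cell was actually trained).
    """
    return [
        {"architecture": arch, "dataset": ds, "activation": af, "optimizer": mo}
        for arch, ds, af, mo in product(
            ARCHITECTURES, DATASETS, ACTIVATIONS, OPTIMIZERS
        )
    ]


grid = experiment_grid()
# Distinct activation-optimizer pairings: 3 activations x 4 optimizers.
pairings = {(cfg["activation"], cfg["optimizer"]) for cfg in grid}

print(len(pairings))  # 12
print(len(grid))      # 48 = 12 pairings x 2 architectures x 2 datasets
```

Under this full-factorial assumption, the three activations and four optimizers give 12 pairings, and crossing them with two architectures and two datasets gives 48 training runs.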