🤖 AI Summary
This paper addresses the long-standing performance stagnation in wearable sensor-based human activity recognition (HAR). Methodologically, it introduces a paradigm that brings world knowledge from foundational models into HAR, unifying time-series signal processing, self-supervised pretraining, multimodal fusion, and prompt-based fine-tuning within a single end-to-end framework intended for both novice users and domain experts. The contributions are threefold: (1) it identifies and analyzes the root causes of performance saturation on mainstream HAR benchmarks; (2) it empirically validates the proposed paradigm across multiple public datasets, demonstrating significant improvements in classification accuracy and cross-dataset generalization; and (3) it releases comprehensive open-source tutorials and a methodological survey, substantially lowering the barrier to practical HAR deployment.
📝 Abstract
In the many years since the inception of wearable sensor-based Human Activity Recognition (HAR), a wide variety of methods have been introduced and evaluated for their ability to recognize activities. Substantial gains have been made since the days of hand-crafting heuristics as features, yet progress has seemingly stalled on many popular benchmarks, with performance falling short of what may be considered 'sufficient', despite the increase in computational power and scale of sensor data, as well as the rising complexity of the techniques being employed. The HAR community is approaching a new paradigm shift, this time incorporating world knowledge from foundational models. In this paper, we take stock of sensor-based HAR, surveying it from its beginnings to the current state of the field, and charting its future. This is accompanied by a hands-on tutorial, through which we guide practitioners in developing HAR systems for real-world application scenarios. We provide a compendium, for novices and experts alike, of methods that aim at finally solving the activity recognition problem.