🤖 AI Summary
This work addresses the marked performance degradation of a lightweight SensorLLM in recognizing static human postures (standing, sitting, and lying) in human activity recognition. To overcome this limitation without large-scale pretraining, the authors propose a lightweight post-alignment adaptation: a gravity-aware hierarchical routing head that uses the per-channel mean and standard deviation from the Chronos tokenizer state to softly route inputs between a static expert and a full expert. A load-balancing loss further encourages differentiated modeling across the two branches. Evaluated on the MHealth dataset only, the method improves macro-F1 with minimal additional parameter overhead, with gains concentrated on static activities while performance on dynamic ones is preserved.
📝 Abstract
Recent studies on sensor-language alignment have shown that two-stage frameworks can improve the semantic modeling ability of wearable-sensor human activity recognition (HAR): SensorLLM-style methods first perform motion-to-language alignment and then fine-tune the model for downstream tasks. However, our experiments reveal a consistent failure mode when the Stage 2 backbone is compressed to a compact model such as TinyLlama: recognition of dynamic activities remains relatively strong, while discrimination of low-motion static classes such as standing, sitting, and lying degrades substantially. To address this issue, we propose a gravity-aware hierarchical routing head as a lightweight post-alignment adaptation built on top of an already aligned model, rather than a new large-scale pretraining framework. The method uses the per-channel mean and standard deviation from the Chronos tokenizer state as statistical cues related to posture and gravity direction, and adaptively combines a static expert and a full expert through soft routing, together with a load-balancing loss for stable training. On the MHealth dataset, this design significantly improves macro-F1 with minimal parameter overhead, and the gains are concentrated mainly on static classes while strong performance on dynamic activities is preserved. As an initial arXiv release, the current paper reports results on a single dataset only, with the goal of highlighting the core method and laying the groundwork for broader evaluation in future work.
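The routing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single sigmoid gate, the hand-picked gate weights, and the squared-deviation form of the load-balancing term are all assumptions made for illustration, and the statistics are computed directly from raw windows rather than read from the Chronos tokenizer state.

```python
import math
from statistics import fmean, pstdev


def channel_stats(window):
    """Per-channel mean and std for one window (list of channels).

    Returns [mean_0, std_0, mean_1, std_1, ...]. The mean of an
    accelerometer axis carries the gravity direction; the std
    indicates how much motion the window contains.
    """
    feats = []
    for channel in window:
        feats.append(fmean(channel))
        feats.append(pstdev(channel))
    return feats


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate(feats, gate_w, gate_b):
    """Soft routing weight in (0, 1) placed on the static expert."""
    return sigmoid(sum(w * f for w, f in zip(gate_w, feats)) + gate_b)


def mix_experts(g, static_logits, full_logits):
    """Convex combination of the two experts' class logits."""
    return [g * s + (1.0 - g) * f for s, f in zip(static_logits, full_logits)]


def load_balance_loss(gates):
    """Hypothetical balancing term: penalizes mean expert usage
    that drifts away from an even 50/50 split over a batch."""
    mean_g = fmean(gates)
    return (mean_g - 0.5) ** 2 + ((1.0 - mean_g) - 0.5) ** 2


# Toy 3-axis accelerometer windows (values in m/s^2, made up).
static_window = [[9.8, 9.8, 9.8, 9.8], [0.1, 0.1, 0.1, 0.1], [0.0, 0.0, 0.0, 0.0]]
dynamic_window = [[9.8, 2.0, 14.0, 5.0], [3.0, -4.0, 6.0, -1.0], [1.0, -2.0, 2.5, 0.0]]

# Hand-picked gate: low per-channel std pushes weight to the static expert.
gate_w = [0.0, -4.0, 0.0, -4.0, 0.0, -4.0]
gate_b = 2.0

g_static = gate(channel_stats(static_window), gate_w, gate_b)
g_dynamic = gate(channel_stats(dynamic_window), gate_w, gate_b)

# Placeholder expert outputs for three classes (standing, sitting, walking).
probs = softmax(mix_experts(g_static, [2.0, 1.0, -1.0], [0.5, 0.4, 0.3]))
balance = load_balance_loss([g_static, g_dynamic])
```

In a trained model the gate weights and both experts would be learned jointly, with the balancing term added to the classification loss; here they are fixed by hand only to show how low-variance windows receive a larger weight on the static expert.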