🤖 AI Summary
Existing ambient sensor-based human activity recognition (HAR) methods for smart homes are limited in continuous streaming inference, spatial layout modeling, and context-aware temporal modeling. To address these challenges, this paper proposes a layout-aware real-time HAR method. Our approach first maps raw sensor streams onto architectural floor plans to generate image-like sequences that encode spatial trajectories. Second, we introduce a learnable temporal embedding module coupled with an attention-based encoder to explicitly model inter-activity transitions and temporal ambiguity. Third, we design a hybrid architecture integrating CNNs, sequential models, and attention mechanisms to jointly model the sensor-stream projections and their spatiotemporal dynamics. Evaluated on multiple real-world smart home datasets, our method achieves significant improvements in recognition accuracy and inference efficiency (reducing latency by 32%–47%) while demonstrating strong robustness and practical deployability.
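The core idea of the floor-plan projection can be sketched as follows. This is a minimal illustration, not the paper's implementation: the sensor-to-cell mapping, grid size, window length, and decay weighting are all hypothetical choices made for the example.

```python
import numpy as np

# Hypothetical sensor layout: sensor id -> (row, col) cell on a coarse
# floorplan grid. These coordinates are illustrative, not from the paper.
SENSOR_CELLS = {
    "D001": (0, 0),  # door sensor at the entrance
    "M001": (1, 2),  # motion sensor in the kitchen
    "M002": (4, 5),  # motion sensor in the living room
}

def events_to_frames(events, grid_shape=(8, 8), window=5):
    """Render a sliding window of sensor activations as image-like frames.

    `events` is a list of sensor ids ordered by activation time. Each
    output frame marks the grid cells of the last `window` activations,
    weighting recent events higher so the frame encodes the direction of
    movement, not just the visited cells.
    """
    frames = []
    for t in range(len(events)):
        frame = np.zeros(grid_shape, dtype=np.float32)
        recent = events[max(0, t - window + 1): t + 1]
        for age, sid in enumerate(reversed(recent)):
            r, c = SENSOR_CELLS[sid]
            # Exponential decay: most recent event = 1.0, older events fade.
            frame[r, c] = max(frame[r, c], 0.5 ** age)
        frames.append(frame)
    return np.stack(frames)

frames = events_to_frames(["D001", "M001", "M002", "M001"])
print(frames.shape)      # (4, 8, 8): one image-like frame per event
print(frames[-1].max())  # 1.0 at the most recently fired sensor's cell
```

A sequence of such frames can then be fed to a CNN front-end, which is one plausible way the hybrid architecture described above could consume spatial trajectories.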
📝 Abstract
Ambient sensor-based human activity recognition (HAR) in smart homes remains challenging due to the need for real-time inference, spatially grounded reasoning, and context-aware temporal modeling. Existing approaches often rely on pre-segmented, within-activity data and overlook the physical layout of the environment, limiting their robustness in continuous, real-world deployments. In this paper, we propose MARAuder's Map, a novel framework for real-time activity recognition from raw, unsegmented sensor streams. Our method projects sensor activations onto the physical floorplan to generate trajectory-aware, image-like sequences that capture the spatial flow of human movement. These representations are processed by a hybrid deep learning model that jointly captures spatial structure and temporal dependencies. To enhance temporal awareness, we introduce a learnable time embedding module that encodes contextual cues such as hour-of-day and day-of-week. Additionally, an attention-based encoder selectively focuses on informative segments within each observation window, enabling accurate recognition even under cross-activity transitions and temporal ambiguity. Extensive experiments on multiple real-world smart home datasets demonstrate that our method outperforms strong baselines, offering a practical solution for real-time HAR in ambient sensor environments.
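The learnable time embedding mentioned above can be illustrated with a small numpy sketch. This is an assumption-laden toy: in practice these tables would be trainable `nn.Embedding` layers optimized end-to-end with the encoder, and the embedding dimension here is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

class TimeEmbedding:
    """Lookup tables for hour-of-day and day-of-week context cues.

    A frozen numpy stand-in for what would be learnable embedding layers
    in the actual model; dimensions are illustrative, not from the paper.
    """
    def __init__(self, dim=8):
        # One learnable vector per hour (24) and per weekday (7).
        self.hour_table = rng.normal(scale=0.01, size=(24, dim))
        self.dow_table = rng.normal(scale=0.01, size=(7, dim))

    def __call__(self, hour, day_of_week):
        # Concatenate both context embeddings into one feature vector,
        # ready to be fused with the sensor-frame features.
        return np.concatenate([self.hour_table[hour],
                               self.dow_table[day_of_week]])

emb = TimeEmbedding(dim=8)
vec = emb(hour=7, day_of_week=1)  # e.g. Tuesday, 7am
print(vec.shape)  # (16,)
```

Because the tables are indexed rather than computed, the model can learn, for instance, that the same kitchen trajectory at 7am and at 11pm should map to different activities.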