🤖 AI Summary
This work addresses mixed-activity windows and boundary contamination in smart-home activity recognition, both of which arise because activity boundaries are unknown in a continuous sensor stream. The authors propose LastAct, a trajectory-centric framework for streaming recognition that projects sensor events onto the home floorplan to generate layout-aligned trajectory image sequences. A lightweight gate detects contaminated windows, and a boundary localizer estimates the last transition so that boundary-guided masking can emphasize the most recent activity and suppress stale context. By combining spatial layout information with temporal dynamics, the method preserves environmental structure and improves robustness to mixed-activity windows. Experiments on four public smart-home datasets under near-realistic mixed-activity protocols show competitive or superior performance on pure windows and substantial Macro-F1 gains on cross/mixed windows.
📝 Abstract
Human Activity Recognition (HAR) from ambient sensors enables smart-home applications such as health monitoring and assisted living. In realistic deployments, however, sensor events arrive as a continuous stream and activity boundaries are unknown. Sliding-window inference therefore produces many windows that straddle transitions and contain mixed activities, creating boundary contamination that violates the pre-segmented instance assumption used by most benchmarks and models. Moreover, many pipelines under-use spatial context by treating sensor IDs as independent tokens. We present LastAct, a trajectory-centric framework for streaming smart-home HAR that targets the most recent activity under mixed windows while explicitly modeling spatial structure. LastAct projects sensor events onto the home floorplan to form a layout-aligned trajectory image sequence that preserves spatial continuity. A lightweight gate identifies contaminated windows, and a boundary localizer estimates the last transition to enable boundary-guided masking that emphasizes post-boundary evidence and suppresses stale context. For efficiency, we reuse a precomputed layout-aligned template cache to avoid repeated rendering. Empirically, across four public smart-home datasets under near-realistic mixed-activity protocols, LastAct achieves competitive or superior performance on pure windows and yields substantial Macro-F1 gains on cross/mixed windows, demonstrating improved robustness under near-realistic sliding-window regimes.
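The two core ideas in the abstract, layout-aligned rendering with a template cache and boundary-guided masking, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the sensor IDs, the 4x4 grid, the function names, and the hard 0/1 mask are all assumptions made for illustration, and the boundary index is supplied by hand here, whereas LastAct estimates it with a learned boundary localizer.

```python
from functools import lru_cache

# Hypothetical layout: sensor ID -> (row, col) cell on a coarse floorplan grid.
SENSOR_CELLS = {"M001": (0, 0), "M002": (0, 3), "M003": (2, 3), "D001": (3, 1)}
GRID_SHAPE = (4, 4)  # assumed grid resolution


@lru_cache(maxsize=None)
def sensor_template(sensor_id):
    """Render the one-hot grid image for a sensor once; later calls hit the cache."""
    rows, cols = GRID_SHAPE
    cell = SENSOR_CELLS[sensor_id]
    return tuple(
        tuple(1.0 if (i, j) == cell else 0.0 for j in range(cols))
        for i in range(rows)
    )


def trajectory_frames(events):
    """Map a window of sensor IDs to a sequence of layout-aligned frames."""
    return [sensor_template(sensor_id) for sensor_id in events]


def boundary_mask(window_len, boundary, stale_weight=0.0):
    """Per-event weights: 1.0 at or after the boundary index, stale_weight before it."""
    return [1.0 if t >= boundary else stale_weight for t in range(window_len)]


def masked_heatmap(events, boundary, stale_weight=0.0):
    """Weighted sum of frames that keeps post-boundary evidence and damps older events."""
    rows, cols = GRID_SHAPE
    heat = [[0.0] * cols for _ in range(rows)]
    weights = boundary_mask(len(events), boundary, stale_weight)
    for frame, weight in zip(trajectory_frames(events), weights):
        for i in range(rows):
            for j in range(cols):
                heat[i][j] += weight * frame[i][j]
    return heat


# A mixed window: the first three events belong to an earlier activity.
window = ["M001", "M001", "M002", "M003", "D001", "M003"]
heat = masked_heatmap(window, boundary=3)
```

With `boundary=3`, the events from `M001` and `M002` contribute nothing to `heat`, while the cells for `M003` and `D001` keep their full weight. A soft mask (for example `stale_weight=0.2`) would retain some older context rather than discarding it outright; which variant is closer to the paper's masking is not stated in the abstract.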