CARE: Contrastive Alignment for ADL Recognition from Event-Triggered Sensor Streams

📅 2025-10-19
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing approaches for recognizing Activities of Daily Living (ADLs) from event-triggered environmental sensor streams suffer from three key limitations: (1) sequence models are noise-sensitive and lack spatial modeling capability; (2) image-based models compress temporal dynamics and distort sensor topology; and (3) naive fusion strategies fail to exploit cross-modal complementarity. To address these, we propose CARE, an end-to-end framework featuring a novel sequenceโ€“image contrastive alignment mechanism that jointly optimizes representation learning and classification. CARE employs a time-aware sequential encoder and a frequency-sensitive spatial image encoder, coupled with a unified contrastive-classification loss to simultaneously achieve cross-modal alignment and discriminative representation learning. Evaluated on three CASAS benchmark datasets, CARE achieves state-of-the-art performance (Milan: 89.8%, Cairo: 88.9%, Kyoto7: 73.3%) and demonstrates strong robustness to sensor failures and layout variations.


๐Ÿ“ Abstract
The recognition of Activities of Daily Living (ADLs) from event-triggered ambient sensors is an essential task in Ambient Assisted Living, yet existing methods remain constrained by representation-level limitations. Sequence-based approaches preserve the temporal order of sensor activations but are sensitive to noise and lack spatial awareness, while image-based approaches capture global patterns and implicit spatial correlations but compress fine-grained temporal dynamics and distort sensor layouts. Naive fusion (e.g., feature concatenation) fails to enforce alignment between sequence- and image-based representation views, underutilizing their complementary strengths. We propose Contrastive Alignment for ADL Recognition from Event-Triggered Sensor Streams (CARE), an end-to-end framework that jointly optimizes representation learning via Sequence-Image Contrastive Alignment (SICA) and classification via cross-entropy, ensuring both cross-representation alignment and task-specific discriminability. CARE integrates (i) time-aware, noise-resilient sequence encoding with (ii) spatially informed and frequency-sensitive image representations, and employs (iii) a joint contrastive-classification objective for end-to-end learning of aligned and discriminative embeddings. Evaluated on three CASAS datasets, CARE achieves state-of-the-art performance (89.8% on Milan, 88.9% on Cairo, and 73.3% on Kyoto7) and demonstrates robustness to sensor malfunctions and layout variability, highlighting its potential for reliable ADL recognition in smart homes.
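The abstract describes SICA as pulling the sequence-view and image-view embeddings of the same activity window together while pushing mismatched pairs apart. The paper's exact formulation is not reproduced here; the following is a minimal sketch assuming a symmetric InfoNCE-style alignment loss over a batch, where the L2 normalization and the temperature hyperparameter are assumptions, not details from the paper.

```python
import numpy as np

def sica_alignment_loss(seq_emb, img_emb, temperature=0.1):
    """Symmetric InfoNCE-style contrastive loss between two views.

    seq_emb, img_emb: (batch, dim) arrays; row i of each is the
    sequence-view and image-view embedding of the same window.
    """
    # L2-normalize so similarity is cosine similarity
    s = seq_emb / np.linalg.norm(seq_emb, axis=1, keepdims=True)
    v = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    logits = s @ v.T / temperature          # (B, B) pairwise similarities
    idx = np.arange(len(s))                 # matching pairs lie on the diagonal

    def cross_entropy_diag(lg):
        # numerically stable log-softmax over each row
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # symmetric: sequence-to-image and image-to-sequence directions
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))
```

With perfectly aligned views (identical embeddings for each pair) the loss stays below the chance level of log(batch_size), and it grows as the two views decorrelate, which is the behavior a joint training loop would exploit.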
Problem

Research questions and friction points this paper is trying to address.

Recognizing Activities of Daily Living from event-triggered sensor streams
Addressing limitations in sequence-based and image-based ADL recognition methods
Aligning sequence and image representations to enhance ADL classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive alignment between sequence and image representations
Joint optimization of representation learning and classification
Integration of time-aware encoding and spatial image features
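The joint optimization listed above combines a classification objective with the contrastive alignment term. The paper's weighting scheme is not given here; a minimal sketch assuming a simple weighted sum, where `lam` is a hypothetical trade-off hyperparameter:

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # mean negative log-likelihood of the true class over the batch
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def joint_objective(class_logits, labels, alignment_loss, lam=0.5):
    """L_total = L_CE + lam * L_SICA (lam is an assumed trade-off weight)."""
    return softmax_cross_entropy(class_logits, labels) + lam * alignment_loss
```

Because both terms are differentiable, the sequence encoder, image encoder, and classifier can be trained end-to-end against this single scalar, which is what "joint optimization of representation learning and classification" amounts to in practice.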
Junhao Zhao
Electrical and Computer Engineering, University of Maryland, College Park, Maryland, USA
Zishuai Liu
School of Computing, University of Georgia, Athens, USA
Ruili Fang
School of Computing, University of Georgia, Athens, USA
Jin Lu
School of Computing, University of Georgia, Athens, USA
Linghan Zhang
Human Interaction Technology, Eindhoven University of Technology, Eindhoven, Netherlands
Fei Dou
Assistant Professor, University of Georgia
Machine Learning · Internet of Things · Ubiquitous Computing