🤖 AI Summary
For human activity recognition (HAR) from multidimensional sensor time series, this paper proposes a self-supervised Transformer architecture tailored to numerical signals. To model raw sensor streams effectively without abundant labeled data, the method introduces two key components: (1) an n-dimensional linear embedding combined with numerical binning that converts continuous sensor measurements into language-like token sequences; and (2) a lightweight linear output head integrated into a self-supervised pretraining framework, removing the reliance on large-scale annotated datasets. Evaluated on five mainstream HAR benchmarks, the approach achieves 10–15% higher accuracy than a standard Transformer and shows markedly better cross-device generalization. By enabling representation learning directly from unlabeled numerical time series, it offers a scalable, low-resource paradigm for time-series perception modeling.
📝 Abstract
We developed a deep learning algorithm for human activity recognition that takes sensor signals as input. In this study, we built a pretrained model based on the Transformer architecture widely used in natural language processing, and leveraged it to improve performance on the downstream task of human activity recognition. While this task can be addressed with a vanilla Transformer, we propose an enhanced n-dimensional numerical-processing Transformer that incorporates three key features: embedding n-dimensional numerical data through a linear layer, binning-based pre-processing, and a linear transformation in the output layer. We evaluated the effectiveness of the proposed model on five different datasets. Compared to the vanilla Transformer, our model demonstrated accuracy improvements of 10–15%.
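To make the first two components concrete, here is a minimal NumPy sketch of binning-based pre-processing and a linear embedding of n-dimensional sensor samples. The bin count, value range, embedding width, and the 6-axis IMU example are illustrative assumptions, not values from the paper, and the Transformer encoder and output head are omitted:

```python
import numpy as np

def bin_signal(x, n_bins=32, lo=-1.0, hi=1.0):
    """Discretize continuous sensor values into integer bin indices
    (token-like codes). n_bins and the [lo, hi] range are assumed."""
    edges = np.linspace(lo, hi, n_bins + 1)
    # np.digitize returns 1..n_bins for in-range values; shift to 0-based
    # and clip out-of-range readings into the first/last bin.
    return np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)

class LinearEmbedding:
    """Project each n-dimensional sensor sample to a d_model-dim vector,
    one token per time step (the paper's linear embedding layer)."""
    def __init__(self, n_dims, d_model, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.02, size=(n_dims, d_model))
        self.b = np.zeros(d_model)

    def __call__(self, x):           # x: (seq_len, n_dims)
        return x @ self.W + self.b   # -> (seq_len, d_model)

# Example: a 100-sample window from a hypothetical 6-axis IMU
# (3-axis accelerometer + 3-axis gyroscope), values scaled to [-1, 1].
window = np.random.default_rng(1).uniform(-1.0, 1.0, size=(100, 6))
tokens = bin_signal(window)                      # (100, 6) discrete codes
embedded = LinearEmbedding(n_dims=6, d_model=64)(window)  # (100, 64)
print(tokens.shape, embedded.shape)
```

In a full model, the embedded sequence would feed a Transformer encoder pretrained self-supervisedly, with a final linear layer mapping encoder outputs to activity classes.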