🤖 AI Summary
Addressing the challenge of fine-grained, sub-second (<1 s) action detection in stroke rehabilitation, this paper proposes the High-Resolution Temporal Transformer (HRTR)—a single-stage, end-to-end framework for temporal action localization and classification. HRTR models sub-second temporal dynamics via self-attention, incorporates high-density temporal step embeddings, and jointly optimizes frame-wise classification and boundary regression, eliminating conventional multi-stage pipelines and post-processing. On the StrokeRehab Video, StrokeRehab IMU, and 50Salads datasets, HRTR achieves Edit Scores of 70.1, 69.4, and 88.4, respectively, surpassing state-of-the-art methods. Its core contribution is the direct, single-stage modeling of sub-second action boundaries, improving both temporal precision and inference efficiency.
📝 Abstract
Stroke rehabilitation often demands precise tracking of patient movements to monitor progress, and the complexity of rehabilitation exercises presents two critical challenges: fine-grained and sub-second (under one second) action detection. In this work, we propose the High Resolution Temporal Transformer (HRTR) to time-localize and classify high-resolution (fine-grained), sub-second actions in a single-stage transformer, eliminating the need for multi-stage methods and post-processing. Without any refinements, HRTR outperforms state-of-the-art systems on both stroke-related and general datasets, achieving Edit Scores (ES) of 70.1 on StrokeRehab Video, 69.4 on StrokeRehab IMU, and 88.4 on 50Salads.
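To make the single-stage design concrete, here is a minimal, framework-free sketch of the general pattern the summary describes: per-frame features pass through self-attention, and two parallel heads emit frame-wise class logits and boundary (start/end offset) regressions in one pass. This is an illustrative assumption, not HRTR's actual implementation; all weight names and dimensions are invented for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, D, C = 16, 8, 5  # frames, feature dim, action classes (all hypothetical)

X = rng.normal(size=(T, D))            # per-frame input features
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))

# Single self-attention layer over the frame sequence
Q, K, V = X @ Wq, X @ Wk, X @ Wv
A = softmax(Q @ K.T / np.sqrt(D))      # (T, T) attention weights
H = A @ V                              # (T, D) contextualized frame features

# Two heads trained jointly in a single stage (no post-processing):
W_cls = rng.normal(size=(D, C))
W_reg = rng.normal(size=(D, 2))
cls_logits = H @ W_cls                 # (T, C) frame-wise class scores
boundaries = H @ W_reg                 # (T, 2) per-frame start/end offsets
```

At inference, each frame directly yields a class and a boundary estimate, which is what removes the separate proposal-generation and refinement stages of multi-stage pipelines.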