🤖 AI Summary
Lightweight, interpretable Transformer models for multi-lead ECG diagnosis remain scarce. Method: This paper proposes a hierarchical Transformer architecture with adaptive embedding scaling. It eliminates manual downsampling and complex attention designs by partitioning the embedding space along the temporal dimension for multi-stage processing; introduces cross-scale classification tokens to aggregate information between stages; and integrates a six-layer depth-wise convolutional encoder with an attention-gating mechanism to explicitly model inter-lead dependencies and improve feature interpretability. Contribution/Results: The model supports dynamic input lengths and configurable embedding networks, achieving notable improvements in diagnostic accuracy and clinical trustworthiness while remaining lightweight, establishing a novel paradigm for interpretable AI-assisted ECG analysis.
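The multi-stage processing described above can be illustrated with a toy sketch. This is an illustrative assumption, not the authors' code: the embedded sequence is split along the temporal dimension into contiguous stages, and a shared classification (CLS) token is updated after each stage so that later stages see a summary of earlier ones. Simple mean pooling stands in for the real model's self-attention; the names `hierarchical_stages` and `mean_vec` are hypothetical.

```python
# Toy sketch (assumption, not the paper's implementation) of partitioning an
# embedded ECG sequence along the temporal dimension into stages, with a
# shared CLS token carrying information across stages.

def mean_vec(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def hierarchical_stages(tokens, n_stages):
    """Split `tokens` (a list of T embedding vectors) into `n_stages`
    contiguous temporal chunks; the CLS token is refreshed after each
    stage so later stages receive a summary of earlier ones."""
    dim = len(tokens[0])
    cls = [0.0] * dim                        # cross-scale classification token
    stage_len = len(tokens) // n_stages
    for s in range(n_stages):
        chunk = tokens[s * stage_len:(s + 1) * stage_len]
        # Stand-in for one transformer stage: pool the chunk together with
        # the incoming CLS token (the real model uses self-attention here).
        cls = mean_vec(chunk + [cls])
    return cls                               # fed to the classifier head

# Example: 8 time steps, 4-dim embeddings, 2 stages.
seq = [[float(t)] * 4 for t in range(8)]
print(hierarchical_stages(seq, 2))
```

Because the stages operate on successively summarized chunks rather than the full sequence at once, no separate downsampling step is needed between scales.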
📝 Abstract
Transformers, originally prominent in NLP and computer vision, are now being adapted for ECG signal analysis. This paper introduces a novel hierarchical transformer architecture that partitions the model into multiple stages according to the spatial size of the embeddings, eliminating the need for additional downsampling strategies or complex attention designs. A classification token aggregates information across feature scales, enabling interaction between the transformer's stages. A six-layer convolutional encoder built on depth-wise convolutions preserves the relationships between ECG leads, and an attention-gate mechanism learns associations among the leads prior to classification. The model adapts flexibly to various embedding networks and input sizes while improving the interpretability of transformers in ECG signal analysis.
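The lead-wise attention gate mentioned in the abstract can be sketched as follows. This is a minimal illustration under assumed details (the function `attention_gate`, the scoring weights, and the lead features are all hypothetical, not the paper's implementation): each lead's pooled feature is scored, squashed through a sigmoid, and used to re-weight that lead before classification, so the gate values double as a per-lead importance readout.

```python
# Minimal sketch (assumption, not the paper's code) of an attention gate
# over ECG leads: each lead's pooled feature vector is scored, passed
# through a sigmoid, and used to re-weight that lead before classification.
import math

def attention_gate(lead_features, weights):
    """lead_features: dict mapping lead name -> pooled feature vector.
    weights: per-dimension scoring weights (learned in the real model)."""
    gated, gates = {}, {}
    for lead, feat in lead_features.items():
        score = sum(w * f for w, f in zip(weights, feat))
        g = 1.0 / (1.0 + math.exp(-score))    # sigmoid gate in (0, 1)
        gates[lead] = g
        gated[lead] = [g * f for f in feat]   # down-weight uninformative leads
    return gated, gates

feats = {"I": [0.5, 1.0], "V2": [2.0, -0.5]}
gated, gates = attention_gate(feats, [1.0, 0.5])
print(gates)  # inspecting the gates shows which leads the model relies on
```

Because the gates are scalars in (0, 1) attached to named leads, they give clinicians a direct, inspectable signal of which leads drove a prediction, which is the interpretability benefit the abstract refers to.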