anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding

📅 2025-06-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing ECG multimodal large language models (MLLMs) are largely limited to a single task (report generation) on fixed, short-duration 12-lead inputs, and they lack diverse, clinically representative evaluation benchmarks. Method: We propose anyECG-chat, the first general-purpose ECG MLLM, which supports multiple tasks (report generation, abnormal waveform localization, open-ended question answering) and flexible input formats (single or multiple ECGs; short or long duration; full or reduced lead sets). We introduce a dynamic-length ECG encoder and a multi-ECG joint understanding architecture, construct the anyECG dataset as the first dataset to cover these clinical scenarios, and employ a three-stage curriculum learning strategy with adaptive visual encoding and cross-modal alignment fine-tuning. Contribution/Results: Experiments demonstrate significant superiority over state-of-the-art methods in report generation, abnormality localization on long-duration home ECGs, and comparative analysis of multiple ECGs, achieving clinically oriented multi-task generalization for the first time.

📝 Abstract
The advent of multimodal large language models (MLLMs) has sparked interest in their application to electrocardiogram (ECG) analysis. However, existing ECG-focused MLLMs concentrate mainly on report generation and are often limited to a single 12-lead, short-duration (10 s) ECG input, which underutilizes the potential of MLLMs. We therefore aim to develop an MLLM for ECG analysis that supports a broader range of tasks and more flexible ECG inputs. Existing ECG-QA datasets, however, offer little variety in tasks and input types. To address this gap, we first constructed the anyECG dataset, which encompasses a wide variety of tasks, including report generation, abnormal waveform localization, and open-ended question answering. In addition to standard hospital ECGs, we introduced long-duration reduced-lead ECGs for home environments and multiple-ECG comparison scenarios commonly encountered in clinical practice. Furthermore, we propose the anyECG-chat model, which supports dynamic-length ECG inputs and multiple ECG inputs. We trained the model on the anyECG dataset using a three-stage curriculum training recipe. A comprehensive evaluation demonstrates that anyECG-chat supports a range of practical application scenarios: not only common report generation tasks, but also abnormal waveform localization for long-duration reduced-lead ECGs in home environments and comprehensive comparative analysis of multiple ECGs.
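To make the range of inputs described in the abstract concrete, here is a minimal Python sketch of how such requests might be represented. It is illustrative only: the class names `ECGRecord` and `ECGQuery`, the task strings, and the sampling rates are hypothetical and are not taken from the paper or its dataset.

```python
from dataclasses import dataclass
from typing import List

STANDARD_12_LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
                     "V1", "V2", "V3", "V4", "V5", "V6"]


@dataclass
class ECGRecord:
    """Hypothetical container for one ECG recording."""
    leads: List[str]            # names of the leads that are present
    sampling_rate_hz: int       # assumed value, not specified by the paper
    signal: List[List[float]]   # one list of samples per lead

    @property
    def duration_s(self) -> float:
        # Duration follows from sample count and sampling rate.
        return len(self.signal[0]) / self.sampling_rate_hz

    @property
    def is_reduced_lead(self) -> bool:
        return len(self.leads) < len(STANDARD_12_LEADS)


@dataclass
class ECGQuery:
    """Hypothetical request: one task, one or more ECGs, one prompt."""
    task: str                   # e.g. "report_generation", "localization", "open_qa"
    ecgs: List[ECGRecord]
    question: str


# A standard hospital ECG: 12 leads, 10 seconds.
hospital = ECGRecord(STANDARD_12_LEADS, 500, [[0.0] * 5000 for _ in range(12)])
# A home recording: a single lead, 60 seconds.
home = ECGRecord(["II"], 250, [[0.0] * (250 * 60)])
# A comparison query over two ECGs of different shape.
query = ECGQuery("open_qa", [hospital, home], "How do these two ECGs differ?")
```

The point of the sketch is that neither the number of ECGs per query nor the duration or lead count of each ECG is fixed.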
Problem

Research questions and friction points this paper is trying to address.

Develops an MLLM for flexible ECG inputs and multi-task analysis
Addresses lack of diverse ECG-QA datasets with anyECG dataset
Supports dynamic-length and multiple ECG inputs for clinical scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Supports dynamic-length ECG inputs
Handles multiple ECG inputs simultaneously
Three-stage curriculum training approach
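
One common way to handle dynamic-length and multiple inputs is to cut each signal into fixed-size patches and concatenate the patches, tagging each with the index of the ECG it came from. The Python sketch below illustrates that general idea only. It is not the paper's encoder, and the function names `patchify` and `build_sequence` and the patch length are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def patchify(samples: List[float], patch_len: int) -> List[List[float]]:
    """Split one lead into fixed-size patches, zero-padding the last patch."""
    n_patches = math.ceil(len(samples) / patch_len)
    padded = list(samples) + [0.0] * (n_patches * patch_len - len(samples))
    return [padded[i * patch_len:(i + 1) * patch_len] for i in range(n_patches)]


def build_sequence(
    ecgs: List[List[float]], patch_len: int
) -> Tuple[List[List[float]], List[int]]:
    """Concatenate patches from several ECGs and record each patch's ECG index."""
    patches: List[List[float]] = []
    ecg_ids: List[int] = []
    for idx, samples in enumerate(ecgs):
        ecg_patches = patchify(samples, patch_len)
        patches.extend(ecg_patches)
        ecg_ids.extend([idx] * len(ecg_patches))
    return patches, ecg_ids


# Two single-lead signals of different length (toy values).
short_ecg = [1.0] * 10
long_ecg = [2.0] * 25
patches, ecg_ids = build_sequence([short_ecg, long_ecg], patch_len=10)
```

With this layout a longer recording simply yields more patches, and a second ECG extends the same sequence, so a downstream model sees one variable-length sequence regardless of how many ECGs were supplied.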