AI Summary
To address the expert dependency in neonatal echocardiographic view identification and the limited robustness of single-frame classification, this paper proposes an automatic view classification method that leverages short video clips. Specifically, it takes four consecutive frames as input and introduces a Temporal Feature Weaving strategy that integrates spatial features extracted by CNNs with temporal dynamics modeled by GRUs, achieving enhanced discriminability with negligible computational overhead. The paper also presents NED, the first publicly available, expert-annotated neonatal echocardiography video dataset, comprising 16 clinically relevant views. Experiments demonstrate that the method improves video-level view classification accuracy by 4.33% over single-frame baselines. Both the source code and the NED dataset are openly released to foster reproducible research.
Abstract
Automated viewpoint classification in echocardiograms can help under-resourced clinics and hospitals provide faster diagnosis and screening when expert technicians are not available. We propose a novel approach to echocardiographic viewpoint classification. We show that treating viewpoint classification as video classification rather than image classification yields an advantage. We propose a CNN-GRU architecture with a novel temporal feature weaving method, which leverages both spatial and temporal information to yield a 4.33% increase in accuracy over baseline image classification while using only four consecutive frames. The proposed approach incurs minimal computational overhead. Additionally, we publish the Neonatal Echocardiogram Dataset (NED), a professionally-annotated dataset providing sixteen viewpoints and associated echocardiography videos to encourage future work and development in this field. Code available at: https://github.com/satchelfrench/NED
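The abstract describes the overall pipeline but not the internals of temporal feature weaving. As a rough illustration of the general data flow it implies, the sketch below runs four consecutive frames through a per-frame spatial feature extractor (a stand-in for the CNN backbone) and folds the resulting sequence into a single state with a hand-rolled GRU cell before classifying into one of sixteen views. Everything here is an assumption for illustration: the shapes, feature widths, and random weights are invented, and this shows only a generic CNN-to-GRU pipeline, not the paper's actual weaving strategy.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, W = 4, 64, 64        # four consecutive grayscale frames (assumed size)
FEAT, CLASSES = 32, 16     # assumed feature width; 16 echocardiographic views

# Random stand-in weights (illustrative only; a real model would be trained).
W_cnn = rng.standard_normal((H * W, FEAT)) * 0.01   # CNN backbone stand-in
Wz, Uz, Wr, Ur, Wh, Uh = (rng.standard_normal((FEAT, FEAT)) * 0.1
                          for _ in range(6))
W_out = rng.standard_normal((FEAT, CLASSES)) * 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x):
    """One GRU update: gates mix the new frame feature x into state h."""
    z = sigmoid(x @ Wz + h @ Uz)              # update gate
    r = sigmoid(x @ Wr + h @ Ur)              # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
    return (1 - z) * h + z * h_tilde

def classify_clip(clip):
    """clip: (T, H, W) array of consecutive frames -> predicted view index."""
    h = np.zeros(FEAT)
    for frame in clip:
        spatial = np.maximum(frame.reshape(-1) @ W_cnn, 0.0)  # per-frame features (ReLU)
        h = gru_step(h, spatial)                              # temporal update
    return int(np.argmax(h @ W_out))

clip = rng.standard_normal((T, H, W))  # a synthetic 4-frame clip
pred = classify_clip(clip)
print(pred)
```

The point of the design, as the abstract frames it, is that the GRU state accumulates information across all four frames, so the final prediction uses temporal context that a single-frame classifier cannot see.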