🤖 AI Summary
This study addresses the challenge of noninvasively assessing cardiovascular disease risk using carotid ultrasound videos, with hypertension status serving as a visual proxy for underlying arterial damage. Methodologically, we adapt the VideoMAE self-supervised video representation learning model—previously developed for natural videos—to the medical ultrasound domain for the first time, establishing an end-to-end spatiotemporal feature extraction framework. The model is trained and validated on the large-scale prospective Gutenberg Health Study cohort, comprising over 31,000 carotid ultrasound videos. Key contributions include: (i) the first medical adaptation of VideoMAE for ultrasound video analysis; (ii) implicit modeling of latent arterial pathology via hypertension labels, circumventing the need for explicit lesion annotations; and (iii) extraction of physiologically interpretable spatiotemporal features reflecting dynamic vascular biomechanics. Our model achieves 75.7% classification accuracy—significantly outperforming conventional image-level approaches—and establishes a novel paradigm for early cardiovascular risk screening.
📝 Abstract
In this study, hypertension is utilized as an indicator of individual vascular damage. This damage can be identified through machine learning techniques, providing an early risk marker for potential major cardiovascular events and offering valuable insights into the overall arterial condition of individual patients. To this end, the VideoMAE deep learning model, originally developed for video classification, was adapted by finetuning for application in the domain of ultrasound imaging. The model was trained and tested using a dataset comprising over 31,000 carotid sonography videos sourced from the Gutenberg Health Study (15,010 participants), one of the largest prospective population health studies. This adaptation facilitates the classification of individuals as hypertensive or non-hypertensive (75.7% validation accuracy), functioning as a proxy for detecting visual arterial damage. We demonstrate that our machine learning model effectively captures visual features that provide valuable insights into an individual's overall cardiovascular health.