TEMSET-24K: Densely Annotated Dataset for Indexing Multipart Endoscopic Videos using Surgical Timeline Segmentation

📅 2025-02-10
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Endoscopic surgical video indexing has long relied on inefficient manual annotation, while the scarcity of publicly available, densely annotated datasets severely hinders the development of automated methods. Method: We introduce TEMSET-24K, the first public 24K-scale dataset of fine-grained trans-anal endoscopic microsurgery (TEMS) video snippets, annotated using a hierarchical "phase–task–action" triplet schema co-designed with clinical experts to enable precise, interpretable, and openly shareable surgical workflow modeling. We propose STALNet, an end-to-end framework evaluated with ConvNeXt, ViT, and Swin V2 encoders to jointly perform surgical phase segmentation and multi-granularity temporal localization. Contribution/Results: STALNet achieves accuracy and F1 scores of up to 0.99 on well-represented phases (e.g., Setup, Suturing), substantially advancing automated surgical video indexing. TEMSET-24K fills a critical gap in high-quality, expert-validated annotations and establishes a new benchmark and methodological paradigm for surgical video understanding.

๐Ÿ“ Abstract
Indexing endoscopic surgical videos is vital in surgical data science, forming the basis for systematic retrospective analysis and clinical performance evaluation. Despite its significance, current video analytics rely on manual indexing, a time-consuming process. Advances in computer vision, particularly deep learning, offer automation potential, yet progress is limited by the lack of publicly available, densely annotated surgical datasets. To address this, we present TEMSET-24K, an open-source dataset comprising 24,306 trans-anal endoscopic microsurgery (TEMS) video micro-clips. Each clip is meticulously annotated by clinical experts using a novel hierarchical labeling taxonomy encompassing phase, task, and action triplets, capturing intricate surgical workflows. To validate this dataset, we benchmarked deep learning models, including transformer-based architectures. Our in silico evaluation demonstrates high accuracy (up to 0.99) and F1 scores (up to 0.99) for key phases like Setup and Suturing. The STALNet model, tested with ConvNeXt, ViT, and SWIN V2 encoders, consistently segmented well-represented phases. TEMSET-24K provides a critical benchmark, propelling state-of-the-art solutions in surgical data science.
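The headline results are frame-level accuracy and per-phase F1 for timeline segmentation. As a minimal sketch of how such metrics are computed (hypothetical labels and a `phase_metrics` helper written for illustration, not the paper's evaluation code):

```python
def phase_metrics(y_true, y_pred, phases):
    """Frame-level accuracy and per-phase F1 for timeline segmentation.

    y_true / y_pred: sequences of phase labels, one per frame or snippet.
    phases: the label set (e.g. "Setup", "Suturing", ...).
    """
    assert len(y_true) == len(y_pred)
    # Accuracy: fraction of frames whose predicted phase matches ground truth.
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    f1 = {}
    for ph in phases:
        # One-vs-rest counts for this phase.
        tp = sum(t == ph and p == ph for t, p in zip(y_true, y_pred))
        fp = sum(t != ph and p == ph for t, p in zip(y_true, y_pred))
        fn = sum(t == ph and p != ph for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1[ph] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, f1

# Hypothetical per-snippet phase predictions (illustration only):
truth = ["Setup", "Setup", "Suturing", "Suturing", "Suturing"]
pred = ["Setup", "Suturing", "Suturing", "Suturing", "Suturing"]
acc, f1 = phase_metrics(truth, pred, ["Setup", "Suturing"])
```

Class imbalance matters here: a phase that dominates the timeline (like Suturing) can score a high F1 even when short phases are missed, which is why per-phase scores are reported alongside overall accuracy.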
Problem

Research questions and friction points this paper is trying to address.

Automated indexing of endoscopic surgical videos
Lack of densely annotated surgical datasets
Validation of deep learning models for surgical workflow segmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning for video indexing
Transformer-based surgical segmentation
Hierarchical labeling taxonomy application
Muhammad Bilal
Birmingham City University, United Kingdom
Mahmood Alam
Student at Birmingham City University, Birmingham, UK
Surgical Data Science · Green AI · Video Analytics · Deep Learning · Semantic Web
Deepa Bapu
University Hospitals Birmingham, Birmingham, United Kingdom
Stephan Korsgen
University Hospitals Birmingham, Birmingham, United Kingdom
Neeraj Lal
University of Birmingham
Cancer Immunology · Colorectal Cancer
Simon Bach
University Hospitals Birmingham, Birmingham, United Kingdom; University of Birmingham, Birmingham, United Kingdom
Amir M Hajivanand
University of Birmingham, Birmingham, United Kingdom
Muhammed Ali
University Hospitals Birmingham, Birmingham, United Kingdom
Kamran Soomro
University of the West of England, Bristol, United Kingdom
Iqbal Qasim
University of Hertfordshire
Applied AI · Semantic Web · Information Retrieval · Data Mining
Pawel Capik
University of the West of England, Bristol, United Kingdom
Aslam Khan
University of Bradford, United Kingdom
Zaheer Khan
University of the West of England, Bristol, United Kingdom
Hunaid Vohra
University of Bristol, United Kingdom
Massimo Caputo
University of Bristol, United Kingdom
Andrew Beggs
Professor of Surgery & Cancer Genetics
Cancer Genetics · Colorectal Cancer · Pelvic Floor · Bioinformatics
Adnan Qayyum
Hamad Bin Khalifa University (HBKU), Doha, Qatar
Medical Image Analysis · Machine Learning · Healthcare · Robust Machine Learning
Junaid Qadir
Professor of Computer Engineering, Qatar University
Human-centered AI · AI Ethics · Engineering Education · AI in Education · Healthcare AI
Shazad Ashraf
Birmingham City University, United Kingdom; University Hospitals Birmingham, Birmingham, United Kingdom; University of Birmingham, Birmingham, United Kingdom