🤖 AI Summary
To address the labor-intensive and inefficient manual annotation of surgical objects in medical augmented reality (AR), this paper proposes the first fully automated, real-time mask generation method deployed natively on the HoloLens 2. Our approach innovatively integrates the zero-shot, parameter-free SAM-Track algorithm into a Unity-Python hybrid framework, enabling streaming segmentation and AR annotation with cross-scenario generalization. It requires no model training or hyperparameter tuning—only a single-frame initialization suffices for continuous tracking. Evaluated on open hepatic surgery and anatomical phantom datasets, our method achieves annotation speeds over 500× faster than manual labeling, with Dice scores ranging from 0.875 to 0.982—comparable in quality to expert annotations. This work constitutes the first empirical validation of real-time segmentation feasibility for SAM-family models on resource-constrained AR headsets, establishing a new paradigm for efficient, robust, and fully automated surgical object annotation in clinical AR navigation.
📝 Abstract
In the context of medical Augmented Reality (AR) applications, object tracking is a key challenge and requires a significant amount of annotation masks. As segmentation foundation models like the Segment Anything Model (SAM) begin to emerge, zero-shot segmentation requires only minimal human participation obtaining high-quality object masks. We introduce a HoloLens-Object-Labeling (HOLa) Unity and Python application based on the SAM-Track algorithm that offers fully automatic single object annotation for HoloLens 2 while requiring minimal human participation. HOLa does not have to be adjusted to a specific image appearance and could thus alleviate AR research in any application field.We evaluate HOLa for different degrees of image complexity in open liver surgery and in medical phantom experiments. Using HOLa for image annotation can increase the labeling speed by more than 500 times while providing Dice scores between 0.875 and 0.982, which are comparable to human annotators. Our code is publicly available at: https://github.com/mschwimmbeck/HOLa.