🤖 AI Summary
Interactive segmentation of novel medical imaging datasets is bottlenecked by high annotation cost and by reliance on pre-trained labels or historical annotations. This paper proposes a context-aware, incremental interactive segmentation paradigm: given an initial user query (e.g., a click, bounding box, or scribble), it dynamically integrates previously segmented images as contextual memory and employs a Transformer-based multimodal encoder, dynamic prompt fusion, and online cache retrieval to enable continual adaptive learning, without domain-specific priors or pre-trained labels. Its core innovation is a per-interaction annotation cost that diminishes as the annotated dataset grows. Experiments demonstrate that, to reach a 90% Dice score on unseen tasks, the method reduces scribble steps by 53% and clicks by 36%, while supporting zero-shot transfer across MRI, CT, and microscopy modalities.
📄 Abstract
Medical researchers and clinicians often need to perform novel segmentation tasks on a set of related images. Existing methods for segmenting a new dataset are either interactive, requiring substantial human effort for each image, or require an existing set of manually labeled images. We introduce a system, MultiverSeg, that enables practitioners to rapidly segment an entire new dataset without requiring access to any existing labeled data from that task or domain. Along with the image to segment, the model takes user interactions such as clicks, bounding boxes, or scribbles as input, and predicts a segmentation. As the user segments more images, those images and segmentations become additional inputs to the model, providing context. As the context set of labeled images grows, the number of interactions required to segment each new image decreases. We demonstrate that MultiverSeg enables users to interactively segment new datasets efficiently by amortizing the number of interactions needed per image to reach an accurate segmentation. Compared to a state-of-the-art interactive segmentation method, MultiverSeg reduced the total number of scribble steps by 53% and clicks by 36% to achieve 90% Dice on sets of images from unseen tasks. We release code and model weights at https://multiverseg.csail.mit.edu.
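The annotation loop described above can be sketched in code. This is a minimal, hypothetical illustration, not the released MultiverSeg API: `ToyModel`, `ToyOracle`, and `segment_dataset` are made-up names, and the toy model simply scores higher with more interactions and more context, standing in for a real network conditioned on a context set. The point it shows is the amortization effect: each finished (image, mask) pair joins the context, so later images need fewer interactions.

```python
class ToyModel:
    """Illustrative stand-in for a context-conditioned segmentation model.

    Prediction quality (returned as a Dice-like score in place of a real
    mask) improves with both the number of user interactions and the size
    of the context set. Purely a toy; not the actual MultiverSeg model.
    """

    def predict(self, image, interactions, context):
        return min(1.0, 0.5 + 0.1 * len(interactions) + 0.1 * len(context))


class ToyOracle:
    """Simulated user: supplies interactions and judges prediction quality."""

    def next_interaction(self, image, interactions):
        return "click"  # a real user would click, draw a box, or scribble

    def dice(self, image, mask):
        return mask  # toy shortcut: the "mask" is already a quality score


def segment_dataset(images, model, oracle, dice_target=0.9, max_steps=20):
    """Segment a set of related images, accumulating context as we go."""
    context = []  # (image, mask) pairs from previously segmented images
    steps_per_image = []
    for image in images:
        interactions = []
        mask = None
        for _ in range(max_steps):
            # User adds one interaction, model re-predicts with full context.
            interactions.append(oracle.next_interaction(image, interactions))
            mask = model.predict(image, interactions, context)
            if oracle.dice(image, mask) >= dice_target:
                break
        context.append((image, mask))  # grow the context set
        steps_per_image.append(len(interactions))
    return steps_per_image


steps = segment_dataset(range(5), ToyModel(), ToyOracle())
print(steps)  # interactions needed per image shrink as context grows
```

With this toy setup the per-image interaction count decreases monotonically, mirroring the paper's observation that a growing context set amortizes annotation effort across the dataset.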