POPSICLE: Benchmark Datasets for Segmentation and Localization in CryoET

📅 2026-06-08
🤖 AI Summary
This work addresses the lack of standardized, well-annotated benchmark datasets in cryo-electron tomography (cryoET), which has limited fair comparison of machine learning methods and reproducible research. The authors present POPSICLE, an open benchmark suite for cryoET segmentation and macromolecular localization. It draws on in situ and purified samples from the CryoET Data Portal, spans eukaryotic and prokaryotic systems, and covers both dense voxel-wise segmentation and sparse localization. Because it is built on a living data resource, the suite can grow as new datasets and annotations are added. Baseline experiments show that model rankings vary substantially across tasks, which supports the case for benchmarks tailored to cryoET rather than evaluation practices borrowed from adjacent biomedical imaging domains.
📝 Abstract
Cryo-electron tomography (cryoET) has emerged as a powerful tool in structural and cellular biology by enabling direct visualization of macromolecular structures within intact cells, thereby linking molecular architecture to cellular organization in a native context. Realizing the full potential of cryoET, however, increasingly depends on advances in computational analysis, particularly machine learning (ML), to interpret its complex and information-rich data. Despite rapid progress, ML development for cryoET remains bottlenecked by the lack of standardized, well-annotated benchmarks. Existing evaluations are typically small, task-specific, and assembled in isolation, which limits robust comparisons across methods. Here, we present POPSICLE, a benchmark suite for cryoET segmentation and macromolecular localization built from the CryoET Data Portal, an open, ML-ready repository of tomographic data, metadata, and annotations. POPSICLE spans eukaryotic and prokaryotic systems, both purified and fully in situ samples, and dense voxel-wise segmentation as well as sparse localization tasks. Built on a living data resource, it can expand as new datasets and annotations become available. Baseline experiments reveal substantial variation in model rankings across tasks, underscoring the need for benchmarks tailored to the unique characteristics of cryoET rather than evaluation practices adapted from adjacent biomedical imaging domains. POPSICLE thus provides an open and extensible foundation for reproducible ML evaluation in cryoET.
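The abstract distinguishes two task types, dense voxel-wise segmentation and sparse localization, but does not say which metrics POPSICLE uses to score them. As a purely illustrative sketch, the block below assumes two common choices: a Dice overlap for segmentation and a distance-thresholded F1 for point picks. The function names, the greedy matching rule, and the `max_dist` parameter are hypothetical and are not taken from the paper.

```python
import math

def dice_score(pred_voxels, true_voxels):
    """Dice overlap between two collections of foreground voxel coordinates."""
    pred, true = set(pred_voxels), set(true_voxels)
    if not pred and not true:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2 * len(pred & true) / (len(pred) + len(true))

def localization_f1(pred_points, true_points, max_dist):
    """F1 for point picks (assumed scoring rule, not the paper's).

    A prediction counts as a hit if its nearest still-unmatched
    ground-truth point lies within max_dist; matching is greedy.
    """
    unmatched = list(true_points)
    hits = 0
    for p in pred_points:
        # nearest ground-truth point that has not been matched yet
        best = min(unmatched, key=lambda t: math.dist(p, t), default=None)
        if best is not None and math.dist(p, best) <= max_dist:
            unmatched.remove(best)
            hits += 1
    if hits == 0:
        return 0.0
    precision = hits / len(pred_points)
    recall = hits / len(true_points)
    return 2 * precision * recall / (precision + recall)

# Toy inputs: voxel coordinates for a mask, 3D points for particle picks.
seg = dice_score({(0, 0, 0), (0, 0, 1)}, {(0, 0, 1), (0, 1, 1)})
loc = localization_f1([(0, 0, 0), (10, 10, 10)], [(0, 0, 1), (5, 5, 5)], max_dist=2.0)
```

The two scores respond to different kinds of error (boundary overlap versus missed or spurious picks), which is one plausible reason a model could rank differently on a segmentation task than on a localization task.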
Problem

Research questions and friction points this paper is trying to address.

cryoET
benchmark
segmentation
localization
machine learning
🔎 Similar Papers
Jonathan Schwartz
Chan Zuckerberg Imaging Institute
Electron Microscopy, Tomography, Image Processing

Utz Heinrich Ermel
Biohub, Redwood City, CA 94063, USA

C. Braxton Owens
Brigham Young University, Provo, UT, 84602, USA

Zhuowen Zhao
Biohub, Redwood City, CA 94063, USA

Ariana Peck
Biohub, Redwood City, CA 94063, USA

Gus L. W. Hart
Brigham Young University, Physics and Astronomy
Biophysics, algorithm development, applied linear algebra, machine learning, computational science

Grant J. Jensen
Brigham Young University, Provo, UT, 84602, USA

Bridget Carragher
Biohub, Redwood City, CA 94063, USA

Dari Kimanius
Chan Zuckerberg Institute for Advanced Biological Imaging
Machine Learning, Deep Learning, Computer Vision, Structural Biology, Cryo-EM