cp_measure: API-first feature extraction for image-based profiling workflows

📅 2025-07-01
🤖 AI Summary
Current bioimage analysis tools such as CellProfiler make automated, reproducible feature extraction difficult, which hinders machine learning workflows at scale. To address this, we introduce cp_measure, a modular, API-first Python library that extracts CellProfiler's core measurement code into a programmable interface that integrates with the scientific Python ecosystem. The library supports phenotypic analysis of 2D and 3D cellular images and spatial transcriptomics data, and its features agree closely with CellProfiler's (Pearson correlation >0.999). Applications to 3D astrocyte imaging and spatial transcriptomics show that cp_measure enables reproducible, automated profiling pipelines that scale to machine learning applications in computational biology.
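
The fidelity figure quoted above (Pearson correlation >0.999) can be checked feature by feature once the same objects have been measured by both tools. The sketch below is a minimal illustration of that comparison, not the authors' evaluation code: the function name `per_feature_pearson` is hypothetical, and the two matrices are synthetic stand-ins for a reference table and a re-implementation's table with rows as objects and columns as features.

```python
import numpy as np


def per_feature_pearson(reference: np.ndarray, candidate: np.ndarray) -> np.ndarray:
    """Return Pearson r for each column of two (n_objects, n_features) matrices."""
    if reference.shape != candidate.shape:
        raise ValueError("feature matrices must have the same shape")
    # Center each feature column before computing the covariance term.
    ref_centered = reference - reference.mean(axis=0)
    cand_centered = candidate - candidate.mean(axis=0)
    numerator = (ref_centered * cand_centered).sum(axis=0)
    denominator = np.sqrt(
        (ref_centered**2).sum(axis=0) * (cand_centered**2).sum(axis=0)
    )
    # Constant columns have zero variance; they yield NaN rather than an error.
    with np.errstate(divide="ignore", invalid="ignore"):
        return numerator / denominator


# Synthetic data (assumption): 500 objects, 4 features, tiny numerical noise.
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 4))
candidate = reference + rng.normal(scale=1e-3, size=reference.shape)

r = per_feature_pearson(reference, candidate)
all_above_threshold = bool(np.all(r > 0.999))
```

In practice the rows of the two tables would first need to be aligned on a shared object identifier; that step is omitted here.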

📝 Abstract
Biological image analysis has traditionally focused on measuring specific visual properties of interest for cells or other entities. A complementary paradigm gaining traction is image-based profiling: quantifying many distinct visual features to form comprehensive profiles that may reveal hidden patterns in cellular states, drug responses, and disease mechanisms. While current tools like CellProfiler can generate these feature sets, they pose significant barriers to automated and reproducible analyses, hindering machine learning workflows. Here we introduce cp_measure, a Python library that extracts CellProfiler's core measurement capabilities into a modular, API-first tool designed for programmatic feature extraction. We demonstrate that cp_measure features retain high fidelity with CellProfiler features while enabling seamless integration with the scientific Python ecosystem. Through applications to 3D astrocyte imaging and spatial transcriptomics, we showcase how cp_measure enables reproducible, automated image-based profiling pipelines that scale effectively for machine learning applications in computational biology.
Problem

Research questions and friction points this paper is trying to address.

Automating image-based profiling for cellular state analysis
Overcoming barriers in reproducible feature extraction workflows
Integrating CellProfiler measurements with machine learning pipelines
Innovation

Methods, ideas, or system contributions that make the work stand out.

API-first Python library for feature extraction
Modular design for programmatic integration
High-fidelity compatibility with CellProfiler features
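
To make the "API-first" and "modular" points concrete, the sketch below shows what a function-per-measurement design could look like: a plain function takes a label mask and an intensity image as arrays and returns named per-object features. This is an illustrative assumption, not cp_measure's actual API; the function name `measure_intensity` and the feature names are hypothetical, and only NumPy is used.

```python
import numpy as np


def measure_intensity(labels: np.ndarray, pixels: np.ndarray) -> dict[str, np.ndarray]:
    """Compute per-object features from an integer label mask and an image.

    Label 0 is treated as background; objects are labeled 1..N.
    Works for 2D or 3D arrays because both inputs are flattened.
    """
    if labels.shape != pixels.shape:
        raise ValueError("labels and pixels must have the same shape")
    n_objects = int(labels.max())
    flat_labels = labels.ravel()
    flat_pixels = pixels.ravel().astype(float)
    # bincount groups pixels by label; index 0 (background) is dropped.
    area = np.bincount(flat_labels, minlength=n_objects + 1)[1:]
    total = np.bincount(flat_labels, weights=flat_pixels, minlength=n_objects + 1)[1:]
    # Labels that are absent from the mask have zero area and yield NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        mean = total / area
    return {"Area": area, "IntegratedIntensity": total, "MeanIntensity": mean}


# Toy input (assumption): two objects in a 2x3 image.
labels = np.array([[0, 1, 1], [2, 2, 0]])
pixels = np.array([[0.1, 0.2, 0.4], [1.0, 3.0, 0.5]])
features = measure_intensity(labels, pixels)
```

Because the function consumes and returns ordinary arrays, its output can be passed directly to other scientific Python tools, which is the kind of integration the list above describes.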
Alán F. Muñoz
Broad Institute of MIT and Harvard, United States
Tim Treis
Institute of Computational Biology, Helmholtz Zentrum München, Germany
Alexandr A. Kalinin
Senior ML Scientist, CZ Biohub SF
Biomedical Image Analysis, Machine Learning
Shatavisha Dasgupta
Broad Institute of MIT and Harvard, United States
Fabian Theis
Institute of Computational Biology, Helmholtz Zentrum München, Germany
Anne E. Carpenter
Institute Scientist and Imaging Platform Director, Broad Institute of Harvard and MIT
drug discovery, machine learning, Cell Painting, image-based profiling, high content screening
Shantanu Singh
Broad Institute of MIT and Harvard, United States