Orochi: Versatile Biomedical Image Processor

📅 2025-09-26
🤖 AI Summary
Existing biomedical image processing plugins are typically tied to a specific task or dataset, so they generalize poorly and fail to meet biologists’ diverse analytical needs. Method: The authors introduce Orochi, a versatile image processor pre-trained on patches and volumes drawn from the raw data of over 100 publicly available studies with a Random Multi-scale Sampling strategy. Its Task-related Joint-embedding Pre-Training (TJP) replaces Masked Image Modelling (MIM) with biomedical task-related degradation as the self-supervision signal, and its Multi-head Hierarchy Mamba backbone exploits Mamba’s linear computational complexity. A three-tier fine-tuning framework (Full, Normal, and Light) trades off efficiency against performance. Results: On downstream tasks including image registration, fusion, denoising, and super-resolution, Orochi achieves comparable or superior performance to state-of-the-art specialist models, even with the lightweight parameter-efficient options, which reduces the burden of choosing among many models. The authors describe Orochi as the first application-oriented, efficient, and versatile processor for low-level biomedical images.

📝 Abstract
Deep learning has emerged as a pivotal tool for accelerating research in the life sciences, with the low-level processing of biomedical images (e.g., registration, fusion, restoration, super-resolution) being one of its most critical applications. Platforms such as ImageJ (Fiji) and napari have enabled the development of customized plugins for various models. However, these plugins are typically based on models that are limited to specific tasks and datasets, making them less practical for biologists. To address this challenge, we introduce Orochi, the first application-oriented, efficient, and versatile image processor designed to overcome these limitations. Orochi is pre-trained on patches/volumes extracted from the raw data of over 100 publicly available studies using our Random Multi-scale Sampling strategy. We further propose Task-related Joint-embedding Pre-Training (TJP), which employs biomedical task-related degradation for self-supervision rather than relying on Masked Image Modelling (MIM), which performs poorly in downstream tasks such as registration. To ensure computational efficiency, we leverage Mamba's linear computational complexity and construct Multi-head Hierarchy Mamba. Additionally, we provide a three-tier fine-tuning framework (Full, Normal, and Light) and demonstrate that Orochi achieves comparable or superior performance to current state-of-the-art specialist models, even with lightweight parameter-efficient options. We hope that our study contributes to the development of an all-in-one workflow, thereby relieving biologists from the overwhelming task of selecting among numerous models.
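
The pre-training data pipeline described in the abstract (sample patches at random scales, then apply a task-related degradation so the model learns from degraded/clean pairs) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `sample_multiscale_patch` and `degrade`, the scale set, the 2D-only input, and the downsample-plus-noise degradation are all assumptions chosen for the example, since the abstract does not specify them.

```python
import numpy as np

def sample_multiscale_patch(image, out_size=32, scales=(1, 2, 4), rng=None):
    """Crop a random square region at a random scale and pool it to out_size.

    Hypothetical stand-in for Random Multi-scale Sampling; the real strategy
    may differ in scale choice, dimensionality, and resampling method.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Only scales whose crop fits inside the image are eligible.
    valid = [s for s in scales if out_size * s <= min(image.shape)]
    if not valid:
        raise ValueError("image is smaller than the smallest crop")
    scale = int(rng.choice(valid))
    crop = out_size * scale
    top = int(rng.integers(0, image.shape[0] - crop + 1))
    left = int(rng.integers(0, image.shape[1] - crop + 1))
    region = image[top:top + crop, left:left + crop]
    # Average-pool by the scale factor so every patch has the same shape.
    patch = region.reshape(out_size, scale, out_size, scale).mean(axis=(1, 3))
    return patch, scale

def degrade(patch, noise_std=0.1, factor=2, rng=None):
    """Assumed restoration-style degradation: lose resolution, then add noise."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = patch.shape
    low = patch.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    # Nearest-neighbour upsampling back to the original patch size.
    coarse = np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)
    return coarse + rng.normal(0.0, noise_std, size=patch.shape)

rng = np.random.default_rng(0)
image = rng.random((256, 256))           # synthetic stand-in for a raw image
clean, scale = sample_multiscale_patch(image, rng=rng)
degraded = degrade(clean, rng=rng)       # (degraded, clean) forms one training pair
print(clean.shape, degraded.shape, scale)
```

In a joint-embedding setup, the degraded and clean patches would each be encoded and their embeddings compared; that training step is omitted here because the abstract gives no detail on the loss or the encoders.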
Problem

Research questions and friction points this paper is trying to address.

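Develops a versatile biomedical image processor that handles multiple low-level tasks (registration, fusion, restoration, super-resolution)
Addresses the task- and dataset-specific limits of the models behind existing ImageJ (Fiji) and napari plugins
Provides an efficient pre-training and fine-tuning framework for biomedical images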
Innovation

Methods, ideas, or system contributions that make the work stand out.

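Pre-trained on patches/volumes from the raw data of over 100 publicly available studies using Random Multi-scale Sampling
Uses Task-related Joint-embedding Pre-Training (TJP), with task-related degradation for self-supervision, instead of Masked Image Modelling (MIM)
Builds Multi-head Hierarchy Mamba on Mamba's linear computational complexity for efficiency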
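
To make the linear-complexity point concrete, the sketch below contrasts a single-pass recurrent scan with the number of token pairs a dense self-attention layer would score. It is a simplification under stated assumptions: the fixed `decay` and `gain` coefficients and the helper names are hypothetical, whereas Mamba's selective state-space layers make these quantities input-dependent, and nothing here reproduces Multi-head Hierarchy Mamba itself.

```python
def linear_scan(inputs, decay=0.9, gain=0.1):
    """Run h_t = decay * h_{t-1} + gain * x_t over the sequence in one pass."""
    state = 0.0
    outputs = []
    for x in inputs:
        state = decay * state + gain * x
        outputs.append(state)
    return outputs

def scan_steps(n_tokens):
    """State updates needed by the scan: one per token."""
    return n_tokens

def pairwise_interactions(n_tokens):
    """Token pairs scored by a dense self-attention layer."""
    return n_tokens * n_tokens

# Treat each pixel of a side x side image as one token.
for side in (64, 128, 256):
    n = side * side
    print(side, scan_steps(n), pairwise_interactions(n))
```

Doubling the image side quadruples the scan's work but multiplies the pairwise count by sixteen, which is why a linear-time backbone matters for large images and volumes.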
Gaole Dai
PhD Candidate, Peking University
AI X LifeScience
Chenghao Zhou
Academy for Advanced Interdisciplinary Studies, Peking University
Yu Zhou
Leibniz-Institut für Analytische Wissenschaften – ISAS – e.V.
Rongyu Zhang
School of Computer Science, Peking University
Yuan Zhang
School of Computer Science, Peking University
Chengkai Hou
Peking University
Robot
Tiejun Huang
Professor, School of Computer Science, Peking University
Visual Information Processing
Jianxu Chen
Group Leader, Leibniz-Institut für Analytische Wissenschaften – ISAS
Deep learning in biomedical image analysis and computer vision
Shanghang Zhang
Peking University
Embodied AI, Foundation Models