AutoBio: A Simulation and Benchmark for Robotic Automation in Digital Biology Laboratory

📅 2025-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges of evaluating and generalizing high-precision, multimodal robotic manipulation in digital biology laboratories. Methodologically, we introduce the first vision-language-action (VLA) benchmark tailored to professional scientific settings, built upon a simulation environment that supports language-guided fine manipulation. The environment integrates instrument digital twins, extended physics engines, physically based rendering (PBR), and dynamic GUI rendering, and features a diverse task suite spanning transparent-object manipulation, precision mechatronic control, and real-world experimental protocols. Key contributions include: (1) the first scientific-domain-oriented VLA evaluation framework; (2) open-source release of a reproducible simulation platform and benchmark suite; and (3) empirical analysis revealing critical limitations of current state-of-the-art VLA models in fine-grained action execution, visual reasoning, and precise instruction following—establishing a standardized assessment foundation for biologically grounded robotic automation research.

📝 Abstract
Vision-language-action (VLA) models have shown promise as generalist robotic policies by jointly leveraging visual, linguistic, and proprioceptive modalities to generate action trajectories. While recent benchmarks have advanced VLA research in domestic tasks, professional science-oriented domains remain underexplored. We introduce AutoBio, a simulation framework and benchmark designed to evaluate robotic automation in biology laboratory environments--an application domain that combines structured protocols with demanding precision and multimodal interaction. AutoBio extends existing simulation capabilities through a pipeline for digitizing real-world laboratory instruments, specialized physics plugins for mechanisms ubiquitous in laboratory workflows, and a rendering stack that supports dynamic instrument interfaces and transparent materials through physically based rendering. Our benchmark comprises biologically grounded tasks spanning three difficulty levels, enabling standardized evaluation of language-guided robotic manipulation in experimental protocols. We provide infrastructure for demonstration generation and seamless integration with VLA models. Baseline evaluations with two SOTA VLA models reveal significant gaps in precision manipulation, visual reasoning, and instruction following in scientific workflows. By releasing AutoBio, we aim to catalyze research on generalist robotic systems for complex, high-precision, and multimodal professional environments. The simulator and benchmark are publicly available to facilitate reproducible research.
Problem

Research questions and friction points this paper is trying to address.

Evaluating robotic automation in biology labs
Advancing vision-language-action models for science tasks
Bridging gaps in precision manipulation and visual reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pipeline for digitizing real-world lab instruments
Specialized physics plugins for lab mechanisms
Physically based rendering for dynamic interfaces
👥 Authors
Zhiqian Lan (The University of Hong Kong; TeleAI) — embodied AI, autonomous driving, reinforcement learning
Yuxuan Jiang (HKU; TeleAI)
Ruiqi Wang (THU)
Xuanbing Xie (HKU)
Rongkui Zhang (THU)
Yicheng Zhu (SJTU)
Peihang Li (HKU)
Tianshuo Yang (HKU)
Tianxing Chen (HKU)
Haoyu Gao (THU)
Xiaokang Yang (SJTU)
Xuelong Li (TeleAI)
Hongyuan Zhang (TeleAI)
Yao Mu (SJTU)
Ping Luo (National University of Defense Technology) — distributed computing