🤖 AI Summary
This work addresses the challenge of evaluating and generalizing high-precision, multimodal robotic manipulation in simulated biology laboratories. Methodologically, we introduce the first vision-language-action (VLA) benchmark tailored to professional scientific settings, built on a simulation environment that supports language-guided fine-grained manipulation. The environment integrates instrument digital twins, extended physics engines, physically based rendering (PBR), and dynamic GUI rendering, and features a diverse task suite spanning transparent-object manipulation, precision mechatronic control, and real-world experimental protocols. Key contributions include: (1) the first scientific-domain-oriented VLA evaluation framework; (2) an open-source, reproducible simulation platform and benchmark suite; and (3) an empirical analysis revealing critical limitations of current state-of-the-art VLA models in fine-grained action execution, visual reasoning, and precise instruction following, establishing a standardized assessment foundation for biologically grounded robotic automation research.
📝 Abstract
Vision-language-action (VLA) models have shown promise as generalist robotic policies by jointly leveraging visual, linguistic, and proprioceptive modalities to generate action trajectories. While recent benchmarks have advanced VLA research in domestic tasks, professional science-oriented domains remain underexplored. We introduce AutoBio, a simulation framework and benchmark designed to evaluate robotic automation in biology laboratory environments, an application domain that combines structured protocols with demanding precision and multimodal interaction. AutoBio extends existing simulation capabilities through a pipeline for digitizing real-world laboratory instruments, specialized physics plugins for mechanisms ubiquitous in laboratory workflows, and a rendering stack that supports dynamic instrument interfaces and transparent materials through physically based rendering. Our benchmark comprises biologically grounded tasks spanning three difficulty levels, enabling standardized evaluation of language-guided robotic manipulation in experimental protocols. We provide infrastructure for demonstration generation and seamless integration with VLA models. Baseline evaluations of two state-of-the-art VLA models reveal significant gaps in precision manipulation, visual reasoning, and instruction following in scientific workflows. By releasing AutoBio, we aim to catalyze research on generalist robotic systems for complex, high-precision, and multimodal professional environments. The simulator and benchmark are publicly available to facilitate reproducible research.
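To make the evaluation setting concrete, the sketch below shows the general shape of the language-conditioned rollout loop that benchmarks of this kind typically score: at each step the policy receives a camera image, proprioceptive state, and a natural-language instruction, and emits an action until the episode terminates. All names here (`Observation`, `Policy`, `evaluate`, the `env` interface) are illustrative assumptions, not the released AutoBio API; consult the published code for the actual interfaces.

```python
# Illustrative sketch of a language-conditioned VLA evaluation loop.
# NOTE: every class, method, and field name below is hypothetical --
# the released AutoBio code defines its own environment/policy interfaces.
from dataclasses import dataclass
from typing import Protocol

import numpy as np


@dataclass
class Observation:
    rgb: np.ndarray       # camera image, e.g. (H, W, 3) uint8
    proprio: np.ndarray   # joint positions / gripper state
    instruction: str      # natural-language task description


class Policy(Protocol):
    def act(self, obs: Observation) -> np.ndarray:
        """Map vision + language + proprioception to a robot action."""
        ...


def evaluate(env, policy: Policy, num_episodes: int = 50,
             max_steps: int = 500) -> float:
    """Return the success rate of `policy` over `num_episodes` rollouts."""
    successes = 0
    for _ in range(num_episodes):
        obs = env.reset()  # new task instance with a fresh instruction
        for _ in range(max_steps):
            action = policy.act(obs)
            obs, done, success = env.step(action)
            if done:
                successes += int(success)
                break
    return successes / num_episodes
```

Under this (assumed) interface, reporting per-task success rate over a fixed number of episodes is what allows standardized comparison of different VLA models across the benchmark's difficulty levels.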