mRNAutilus: Multi-Objective-Guided Discrete Generation of mRNA with Optimized Therapeutic Properties

📅 2026-05-29

🤖 AI Summary
This work proposes mRNAutilus, a framework for end-to-end design of therapeutic mRNA transcripts that jointly optimizes coding sequences and untranslated regions (UTRs) to improve mRNA stability, translation efficiency, and protein expression. Combining a masked discrete diffusion model with Monte Carlo Tree Guidance, mRNAutilus performs codon optimization and de novo UTR design in a single generative process. Lightweight regressors over the model's embeddings predict half-life, translation efficiency, and protein abundance, guiding generation toward Pareto-efficient sequences. In zero-shot experiments, mRNAutilus-designed P. pyralis luciferase mRNAs achieved over 400-fold higher expression than wild-type, and SARS-CoV-2 Spike mRNAs exceeded clinically used and commercial constructs. The framework also improved expression in therapeutic settings, including prime editing (PEMax) and peptide-guided E3 ligases (uAbs) for beta-catenin degradation.
📝 Abstract
Therapeutic mRNA design requires coordinating multiple interacting sequence features across the full transcript, where codon usage, untranslated regions (UTRs), and their coupling jointly determine stability, translation efficiency, and protein expression. Here, we present mRNA generation via unrolled trajectories and informed latent updates (mRNAutilus), a framework for simultaneous codon optimization and de novo UTR design directly from sequence. mRNAutilus combines a masked discrete diffusion model trained on millions of full-length mRNAs with Monte Carlo Tree Guidance to generate Pareto-efficient sequences under multiple functional objectives, using lightweight regressors over model embeddings to predict half-life, translation efficiency, and protein abundance. Unlike recent methods that design coding sequences and UTRs separately or rely on post hoc assembly and screening, mRNAutilus generates complete transcripts in a single process optimized across properties. Across diverse targets, zero-shot mRNAs encoding P. pyralis luciferase achieve over 400-fold higher expression than wild-type and outperform commercial and machine learning-designed baselines, including zero-shot generative approaches. Zero-shot SARS-CoV-2 Spike mRNAs exceed clinically used and commercial constructs and match or surpass lab-optimized designs with improved durability. We further demonstrate generality in therapeutic settings, including prime editing (PEMax) and programmable proteome modulation, where mRNAutilus-designed constructs enhance expression of peptide-guided E3 ligases (uAbs) for beta-catenin degradation. These results establish a sequence-based, multi-objective framework for generating functional mRNAs tailored to diverse biological applications.
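The abstract describes selecting "Pareto-efficient sequences under multiple functional objectives". As a minimal sketch of what Pareto efficiency means here, the snippet below filters a handful of candidates down to the non-dominated set. It is not the authors' implementation: the candidate names, the score values, and the helper functions `dominates` and `pareto_front` are all hypothetical, and every objective is assumed to be maximized.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    ]


# Hypothetical predicted scores per candidate, in the order
# (half-life, translation efficiency, protein abundance).
candidates = {
    "seq_a": (0.9, 0.2, 0.5),
    "seq_b": (0.4, 0.8, 0.6),
    "seq_c": (0.3, 0.7, 0.5),  # worse than seq_b on every objective
    "seq_d": (0.9, 0.2, 0.4),  # ties seq_a on two objectives, loses on the third
}

front = pareto_front(candidates)
print(front)
```

Neither `seq_a` nor `seq_b` beats the other on all three properties, so both remain on the front; this trade-off set is what a multi-objective search would hand back instead of a single "best" sequence.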
Problem

Research questions and friction points this paper is trying to address.

mRNA design
multi-objective optimization
codon usage
UTRs
therapeutic mRNA
Innovation

Methods, ideas, or system contributions that make the work stand out.

discrete diffusion model
multi-objective optimization
de novo UTR design
Monte Carlo Tree Guidance
zero-shot mRNA generation
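To make the idea of masked discrete generation over a coding sequence concrete, here is a toy sketch that starts from a fully masked codon sequence and reveals positions over a few steps while preserving the encoded protein. It rests on heavy assumptions: `toy_denoiser` is a uniform random stand-in for a trained diffusion model, the codon table is a partial excerpt of the standard genetic code, and UTR design and Monte Carlo Tree Guidance are not modeled at all. None of the function names come from the paper.

```python
import random
from typing import Dict, List, Optional

# Partial synonymous-codon table (standard genetic code), enough for the toy peptide.
SYNONYMS: Dict[str, List[str]] = {
    "M": ["AUG"],
    "K": ["AAA", "AAG"],
    "F": ["UUU", "UUC"],
    "L": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "*": ["UAA", "UAG", "UGA"],
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}


def toy_denoiser(amino_acid: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a trained model: pick a synonymous codon uniformly."""
    return rng.choice(SYNONYMS[amino_acid])


def generate_cds(protein: str, steps: int = 3, seed: int = 0) -> str:
    """Unmask codon positions in random order over `steps` rounds (steps >= 1)."""
    rng = random.Random(seed)
    codons: List[Optional[str]] = [None] * len(protein)  # None marks a masked position
    masked = list(range(len(protein)))
    rng.shuffle(masked)
    per_step = -(-len(masked) // steps)  # ceiling division
    while masked:
        reveal, masked = masked[:per_step], masked[per_step:]
        for i in reveal:
            codons[i] = toy_denoiser(protein[i], rng)
    return "".join(c for c in codons if c is not None)


def translate(cds: str) -> str:
    """Translate an RNA coding sequence back to amino acids using the partial table."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


cds = generate_cds("MKFL*")
print(cds, translate(cds))
```

Restricting each position to synonymous codons guarantees the protein is unchanged whatever the sampler picks; in a guided setting, the choice among synonyms is where predicted properties would steer the outcome.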
Authors

Sawan Patel
Machine Learning Engineer, Atom Bioworks
machine learning, aptamers, diffusion, transformers

Sophia Tang
University of Pennsylvania
deep learning, drug delivery, generative models, drug design

Yesol Kim
Department of Bioengineering, University of Pennsylvania

Yinuo Zhang
PhD student, Duke-NUS Medical School
Protein, Peptides, Biology, Deep Learning

Divya Srijay
Department of Bioengineering, University of Pennsylvania

Ping-Jung Lin
GenScript USA, Inc.

Shambhavi Shubham
GenScript USA, Inc.

Fengmei Pi
GenScript USA, Inc.

Cedric Wu
GenScript USA, Inc.

Sherwood Yao
Atom Bioworks, Inc.

Pranam Chatterjee
University of Pennsylvania
Protein Design, Language Modeling, Genome Editing, Machine Learning, Gametogenesis