🤖 AI Summary
This work addresses a longstanding challenge in rational RNA design: jointly generating sequences and all-atom 3D structures. We propose RiboGen, the first deep learning model to generate both simultaneously, built on multimodal flow matching. The method combines Discrete Flow Matching (for the sequence), standard continuous Flow Matching (for atomic coordinates), and Euclidean-equivariant neural networks in a single co-generation framework, so that sequence and structure are generated together in a geometry-aware way. Unlike conventional stepwise pipelines, the model generates RNA sequences and all-atom structures jointly, end to end. Experiments show that RiboGen efficiently generates chemically plausible and self-consistent RNA samples, suggesting that co-generation of sequence and structure is a competitive approach for modeling RNA, with potential applications in RNA therapeutics and biotechnology.
📝 Abstract
Ribonucleic acid (RNA) plays fundamental roles in biological systems, from carrying genetic information to performing enzymatic functions. Understanding and designing RNA can enable novel therapeutic applications and biotechnological innovation. To enhance RNA design, in this paper we introduce RiboGen, the first deep learning model to simultaneously generate RNA sequence and all-atom 3D structure. RiboGen combines standard Flow Matching with Discrete Flow Matching over a multimodal data representation. It is built on Euclidean-equivariant neural networks, which process and learn three-dimensional geometry efficiently. Our experiments show that RiboGen can efficiently generate chemically plausible and self-consistent RNA samples. Our results suggest that co-generation of sequence and structure is a competitive approach for modeling RNA.
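To make the pairing of continuous and discrete flow matching concrete, the following is a minimal, generic sketch of how one noisy training example could be built for both modalities at a shared time t. It is not RiboGen's implementation. The linear interpolation path for coordinates and the independent masking path for tokens are common textbook choices assumed here for illustration, since the abstract does not state which paths the model uses. The names `sample_training_pair` and `MASK`, the four-letter token encoding, and the use of one 3D point per nucleotide (rather than all atoms) are likewise hypothetical simplifications.

```python
import numpy as np

MASK = 4  # hypothetical mask-token index; 0..3 stand for A, C, G, U


def sample_training_pair(seq, coords, t, rng):
    """Build noisy inputs and the velocity target at time t in [0, 1].

    Assumed paths (illustrative, not taken from the paper):
      - coordinates: linear interpolation between Gaussian noise and data
      - sequence: each token independently masked with probability 1 - t
    """
    # Continuous modality: x_t = (1 - t) * x0 + t * x1, with x0 ~ N(0, I).
    x0 = rng.standard_normal(coords.shape)
    x_t = (1.0 - t) * x0 + t * coords
    # Time derivative of the linear path; a network would regress onto this.
    velocity_target = coords - x0

    # Discrete modality: keep the true token with probability t, else mask it.
    keep = rng.random(seq.shape) < t
    s_t = np.where(keep, seq, MASK)
    return x_t, velocity_target, s_t


rng = np.random.default_rng(0)
seq = np.array([0, 1, 2, 3, 2, 1])    # toy sequence "ACGUGC"
coords = rng.standard_normal((6, 3))  # toy structure: one 3D point per nucleotide
x_t, v, s_t = sample_training_pair(seq, coords, t=0.5, rng=rng)
```

In a co-generation setup of this kind, a single network would receive both `x_t` and `s_t` and be trained to predict the coordinate velocity and the clean tokens at the masked positions. At t = 0 the sample is pure noise with a fully masked sequence, and at t = 1 it coincides with the data.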