🤖 AI Summary
RNA design is hampered by the high structural flexibility of RNA and the scarcity of experimental 3D structures. As a result, existing methods struggle to satisfy diverse constraints, such as phylogenetic family membership, secondary and tertiary structural specifications, and functional site requirements, within a single generative framework. To address this, the authors propose RNACG, which they describe as the first flow matching–based universal conditional generation framework for RNA sequence design. The approach features a modular conditional encoder, supports discrete sequence representation, and unifies multiple design paradigms, including family-specific generation, structure-constrained design, and binding-site–guided inverse folding. The framework is reported to improve conditional controllability and cross-task generalization. It is reported to achieve state-of-the-art performance on Rfam family generation, secondary structure design, and PDB binding-site inverse folding, producing sequences with both high structural validity and functional compatibility.
📝 Abstract
RNA plays a pivotal role in diverse biological processes, ranging from gene regulation to catalysis. Recent RNA design methods, such as RfamGen, RiboDiffusion, and RDesign, have shown promising results, including successful designs of functional sequences. However, RNA design remains challenging because RNA molecules are inherently flexible and because experimental data on their secondary and tertiary structures are scarce relative to proteins. These limitations highlight the need for a more universal and comprehensive approach to RNA design that integrates diverse annotation information at the sequence level. To address these challenges, we propose RNACG (RNA Conditional Generator), a universal framework for RNA sequence design based on flow matching. RNACG supports diverse conditional inputs, including structural, functional, and family-specific annotations, and offers a modular design that allows users to customize the encoding network for specific tasks. By unifying sequence generation under a single framework, RNACG enables the integration of multiple RNA design paradigms, from family-specific generation to tertiary structure inverse folding.
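To make the idea of conditional flow matching over sequences concrete, the following is a minimal sketch and not the RNACG implementation. It assumes a straight-line probability path between Gaussian noise and a one-hot encoded sequence, which is a common flow matching choice but is not stated in the abstract. The function `toy_velocity` is a hypothetical stand-in for the learned, condition-aware network: it maps a dot-bracket string to a fixed target sequence so that the sampler can be run without training. All names here are illustrative.

```python
import numpy as np

ALPHABET = "ACGU"


def one_hot(seq):
    """Encode an RNA string as an (L, 4) one-hot array."""
    idx = [ALPHABET.index(ch) for ch in seq]
    return np.eye(len(ALPHABET))[idx]


def decode(x):
    """Map an (L, 4) array back to a string via per-position argmax."""
    return "".join(ALPHABET[i] for i in x.argmax(axis=1))


def fm_training_pair(x1, rng):
    """Build one flow matching training example (assumed linear path).

    x_t = (1 - t) * x0 + t * x1, with regression target u_t = x1 - x0.
    A network v(x_t, t, cond) would be trained to predict u_t.
    """
    x0 = rng.standard_normal(x1.shape)  # noise sample
    t = rng.uniform()                   # time in [0, 1)
    x_t = (1.0 - t) * x0 + t * x1
    u_t = x1 - x0
    return t, x_t, u_t


def toy_velocity(x, t, cond):
    """Hypothetical stand-in for a trained conditional velocity network.

    The condition is a dot-bracket string; paired positions are pushed
    toward G/C and unpaired positions toward A. A real model would learn
    this mapping from data instead of hard-coding it.
    """
    mapping = {"(": "G", ")": "C", ".": "A"}
    target = one_hot("".join(mapping[c] for c in cond))
    return (target - x) / (1.0 - t)


def euler_sample(velocity_fn, cond, steps, rng):
    """Integrate dx/dt = v(x, t, cond) from t = 0 to t = 1 with Euler steps."""
    x = rng.standard_normal((len(cond), len(ALPHABET)))
    for k in range(steps):
        t = k / steps
        x = x + velocity_fn(x, t, cond) / steps
    return decode(x)


rng = np.random.default_rng(0)
t, x_t, u_t = fm_training_pair(one_hot("GGAACC"), rng)
sample = euler_sample(toy_velocity, "((..))", steps=20, rng=rng)
print(sample)
```

In this sketch the condition enters only through the velocity function, which loosely mirrors the modular design described above: swapping the conditioning (family label, secondary structure, binding site) would mean swapping the encoder that feeds the velocity network, while the training target and sampler stay the same.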