DNA Tails for Molecular Flash Memory

📅 2025-05-06
🤖 AI Summary
DNA-based data storage is limited by high synthesis costs and low coding density (for example, DNA Punchcards store only 1 bit per nick site). The paper proposes DNA Tails, a molecular encoding paradigm that enzymatically synthesizes single-stranded DNA tails of varied lengths at backbone nick sites, enabling nonbinary, multi-bit-per-site storage and higher storage density. The key contributions are: (1) tail-length modulation as a coding mechanism; (2) rank modulation to mitigate calibration errors, together with a new family of rank modulation codes that correct "stuck-at" errors caused by stumped tail growth; and (3) constructions of permutation codes with order-optimal redundancy, with accompanying encoding and decoding algorithms. Experiments demonstrate the feasibility of the encoding scheme and identify its common sources of error.

📝 Abstract
DNA-based data storage systems face practical challenges due to the high cost of DNA synthesis. A strategy to address the problem entails encoding data via topological modifications of the DNA sugar-phosphate backbone. The DNA Punchcards system, which introduces nicks (cuts) in the DNA backbone, encodes only one bit per nicking site, limiting density. We propose DNA Tails, a storage paradigm that encodes nonbinary symbols at nicking sites by growing enzymatically synthesized single-stranded DNA of varied lengths. The average tail lengths encode multiple information bits and are controlled via a staggered nicking-tail extension process. We demonstrate the feasibility of this encoding approach experimentally and identify common sources of errors, such as calibration errors and stumped tail growth errors. To mitigate calibration errors, we use rank modulation proposed for flash memory. To correct stumped tail growth errors, we introduce a new family of rank modulation codes that can correct "stuck-at" errors. Our analytical results include constructions for order-optimal-redundancy permutation codes and accompanying encoding and decoding algorithms.
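The appeal of rank modulation for calibration errors can be illustrated with a minimal sketch: the stored symbol is the relative order of the tail lengths, so any distortion that preserves that order leaves the readout unchanged. The function name, the measured lengths, and the linear drift model below are all hypothetical choices made for illustration; they are not taken from the paper.

```python
def ranks_from_lengths(lengths):
    """Return the permutation induced by tail lengths.

    The result lists site indices from the longest tail to the shortest.
    Only the relative order of the lengths matters, not their values.
    """
    return tuple(sorted(range(len(lengths)), key=lambda i: -lengths[i]))


# Hypothetical average tail lengths (in nucleotides) at four nicking sites.
measured = [42.0, 17.5, 63.2, 30.1]

# Hypothetical calibration error: every reading is scaled and offset.
# Any strictly increasing distortion preserves the ranking.
drifted = [1.3 * x + 5.0 for x in measured]

print(ranks_from_lengths(measured))  # (2, 0, 3, 1)
print(ranks_from_lengths(drifted))   # (2, 0, 3, 1)
```

Both readouts decode to the same permutation, which is why an absolute length calibration is not needed under this kind of distortion. The sketch does not model stumped tail growth, which can change the order and therefore requires the error-correcting codes described in the abstract.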
Problem

Research questions and friction points this paper is trying to address.

High cost of DNA synthesis limits data storage.
Low data density in DNA backbone nicking.
Errors in DNA tail length encoding.
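The density limitation can be made concrete with a small calculation. A nick that is either present or absent carries 1 bit, while a site with q distinguishable tail lengths carries log2(q) bits, and a group of n sites read out by relative order alone carries log2(n!) bits. The values of q and n below are hypothetical and chosen only for illustration; the paper's actual parameters are not stated in this summary.

```python
import math


def bits_per_site_levels(q):
    """Bits stored at one site that can take q distinguishable states."""
    return math.log2(q)


def bits_per_group_rank(n):
    """Bits stored by the relative order (a permutation) of n sites."""
    return math.log2(math.factorial(n))


# A binary nick (present or absent) stores 1 bit per site.
print(bits_per_site_levels(2))             # 1.0

# Hypothetical: 4 distinguishable tail lengths per site.
print(bits_per_site_levels(4))             # 2.0

# Hypothetical: 4 sites read out by ranking alone, log2(24) bits in total.
print(round(bits_per_group_rank(4), 3))    # 4.585
```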
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enzymatically synthesized DNA tails encode data
Staggered nicking-tail extension controls tail lengths
Rank modulation mitigates calibration errors; new stuck-at-correcting rank modulation codes handle stumped tail growth
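To show what encoding data into a permutation can look like in practice, here is a minimal sketch of the textbook factorial-number-system (Lehmer code) mapping between integers and permutations. It is an uncoded baseline with no redundancy and no stuck-at error correction, so it is not the paper's construction; the function names are hypothetical.

```python
import math


def int_to_perm(value, n):
    """Map an integer in [0, n!) to a permutation of range(n)."""
    if not 0 <= value < math.factorial(n):
        raise ValueError("value out of range for permutations of this size")
    items = list(range(n))
    perm = []
    for k in range(n, 0, -1):
        # Each factorial-base digit selects one of the remaining items.
        idx, value = divmod(value, math.factorial(k - 1))
        perm.append(items.pop(idx))
    return tuple(perm)


def perm_to_int(perm):
    """Inverse of int_to_perm: recover the integer from a permutation."""
    items = sorted(perm)
    value = 0
    for k, p in enumerate(perm):
        idx = items.index(p)
        value += idx * math.factorial(len(perm) - 1 - k)
        items.pop(idx)
    return value


print(int_to_perm(5, 3))       # (2, 1, 0)
print(perm_to_int((2, 1, 0)))  # 5
```

In a rank-modulation setting, the resulting permutation would prescribe the order of tail lengths across a group of sites, and the reader would recover the integer from the observed order.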
Authors

Jin Sima
University of Illinois Urbana-Champaign
Information theory, Machine Learning, Theory of Computing

Chao Pan
Google

S. K. Tabatabaei
New England BioLabs

Alvaro G. Hernandez
Roy J. Carver Biotechnology Center, University of Illinois Urbana-Champaign

Charles M. Schroeder
Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign; Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign; Department of Materials Science and Engineering, University of Illinois Urbana-Champaign

Olgica Milenkovic
University of Illinois
Algorithms, Bioinformatics, Coding Theory, Machine Learning