The Synthesis-Sequencing Channel for DNA-based Data Storage

📅 2026-06-05
🤖 AI Summary
Existing DNA storage models struggle to capture the coverage bias and non-independent errors that arise jointly from synthesis and sequencing. This work proposes a unified synthesis-sequencing channel model that distinguishes physical coverage from sequencing coverage and explicitly accounts for how errors from the two stages combine across reads. Modeling synthesis and sequencing errors as binary symmetric channels with possibly different error probabilities, the study derives matching achievability and converse bounds and, under mild assumptions on the channel parameters, establishes the channel's information-theoretic capacity. The analysis reveals trade-offs among physical coverage, synthesis errors, sequencing coverage, and sequencing errors that determine the maximum achievable rate for reliable DNA-based data storage.
📝 Abstract
We introduce and study the synthesis-sequencing channel, a two-stage model for DNA-based data storage that jointly captures synthesis and sequencing effects. The synthesis-sequencing channel provides a more nuanced and realistic model of the DNA storage process compared to prior work, as it distinguishes between physical coverage after synthesis and sequencing coverage after readout, relaxes the assumption of independent errors across reads, and naturally induces coverage bias through the composition of synthesis and sequencing stages. We establish the information-theoretic capacity of this channel by deriving matching converse and achievability bounds for the case where synthesis and sequencing errors are modeled by binary symmetric channels with possibly different error probabilities, under mild assumptions on the channel parameters. Our results reveal multiple trade-offs between physical coverage, synthesis errors, sequencing coverage, and sequencing errors that influence the maximum achievable rate for reliable data storage.
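The two-stage structure described in the abstract can be illustrated with a small simulation. This is only a sketch under assumptions that the abstract does not state: the number of physical copies per strand is drawn from a Poisson distribution, reads are sampled uniformly with replacement from the pool of physical copies, and the names `bsc`, `poisson`, `synthesis_sequencing_channel`, `physical_cov`, `num_reads`, `p_syn`, and `p_seq` are hypothetical. The paper's exact model may differ.

```python
import math
import random


def bsc(bits, p, rng):
    """Pass a bit list through a binary symmetric channel with flip probability p."""
    return [b ^ (rng.random() < p) for b in bits]


def poisson(lam, rng):
    """Draw a Poisson(lam) sample with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def synthesis_sequencing_channel(strands, physical_cov, num_reads, p_syn, p_seq, seed=0):
    """Return a list of (source_index, read) pairs.

    The source index is kept only for inspection; a real decoder would not see it.
    """
    rng = random.Random(seed)

    # Stage 1 (synthesis, assumed Poisson copy counts): each strand yields a
    # random number of physical copies, each corrupted independently by BSC(p_syn).
    pool = []
    for idx, strand in enumerate(strands):
        for _ in range(poisson(physical_cov, rng)):
            pool.append((idx, bsc(strand, p_syn, rng)))

    # Stage 2 (sequencing, assumed uniform sampling with replacement): each read
    # picks one physical copy and adds fresh BSC(p_seq) noise on top of the
    # synthesis errors already present in that copy.
    reads = []
    if pool:
        for _ in range(num_reads):
            idx, copy = rng.choice(pool)
            reads.append((idx, bsc(copy, p_seq, rng)))
    return reads


if __name__ == "__main__":
    strands = [[0, 1, 1, 0, 1, 0, 0, 1], [1, 1, 0, 0, 1, 0, 1, 0]]
    for idx, read in synthesis_sequencing_channel(strands, 3.0, 6, 0.05, 0.02):
        print(idx, read)
```

In this sketch, two reads drawn from the same physical copy share that copy's synthesis errors, so read errors are not independent, and the number of reads per strand depends on both the random copy count and the random read sampling, which is one way coverage bias can emerge from composing the two stages.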
Problem

Research questions and friction points this paper is trying to address.

DNA-based data storage
synthesis-sequencing channel
coverage bias
error correlation
information capacity
Innovation

Methods, ideas, or system contributions that make the work stand out.

synthesis-sequencing channel
DNA data storage
information-theoretic capacity
coverage bias
binary symmetric channel