🤖 AI Summary
Random access in DNA storage is inefficient, so the goal is to minimize the expected number of strands that must be sampled to retrieve a target information strand.
Method: We take a geometric approach to code design and introduce “balanced quasi-arcs”, geometric structures that yield codes with a low random access expectation.
Contribution/Results: We give an explicit construction for $k=3$ that outperforms previous constructions aimed at reducing the random access expectation. Building on a result from [1], we also prove the conjecture posed in [2] for rate-$1/2$ codes in any dimension. Together, these results give a constructive, geometric foundation for efficient random access in DNA-based storage systems.
📝 Abstract
Effective and reliable data retrieval is critical for the feasibility of DNA storage, and improving random access efficiency plays a key role in its practicality and reliability. In this paper, we study the Random Access Problem, which asks for the expected number of samples needed to recover an information strand. Unlike previous work, we take a geometric approach to the problem, aiming to understand which geometric structures (Balanced Quasi-Arcs) lead to codes with a low random access expectation. We obtain two main results. The first is a construction for $k=3$ that outperforms previous constructions aiming to reduce the random access expectation. The second, exploiting a result from [1], is a proof of a conjecture from [2] for rate $1/2$ codes in any dimension.
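
To make the quantity in the Random Access Problem concrete, the following is a minimal sketch that computes the expected number of samples for a small code exactly. It rests on assumptions not stated in the abstract: encoded strands are the columns of a generator matrix over GF(2), strands are sampled uniformly at random with replacement, and information strand $i$ is recovered once the unit vector $e_i$ lies in the span of the sampled columns. The function names `in_span_gf2` and `expected_samples` are hypothetical and are not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def in_span_gf2(columns, target):
    """Return True if `target` is a GF(2) combination of `columns`.

    Vectors are encoded as Python ints, one bit per coordinate.
    """
    pivots = {}  # leading-bit position -> basis vector with that leading bit

    def reduce(v):
        # Cancel the leading bit of v while a pivot for that bit exists.
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                return v
            v ^= pivots[top]
        return 0

    for c in columns:
        r = reduce(c)
        if r:
            pivots[r.bit_length() - 1] = r
    return reduce(target) == 0


def expected_samples(columns, i):
    """Exact expected number of uniform draws (with replacement, an assumed
    sampling model) until e_i is in the span of the drawn columns."""
    n = len(columns)
    target = 1 << i
    total = Fraction(0)
    for j in range(n):
        subsets = list(combinations(columns, j))
        # Fraction of j-subsets of strands that still fail to recover e_i.
        failing = sum(1 for s in subsets if not in_span_gf2(s, target))
        # Moving from j to j+1 distinct strands takes n/(n-j) draws on average.
        total += Fraction(n, n - j) * Fraction(failing, len(subsets))
    return total


# No coding: k = 3 information strands stored as they are.
identity = [0b001, 0b010, 0b100]
# k = 2 information strands plus one parity strand e_1 + e_2.
parity = [0b01, 0b10, 0b11]

print(expected_samples(identity, 0))
print(expected_samples(parity, 0))
```

The sketch enumerates all subsets of strands, so it is only practical for very small codes. It is meant to illustrate the expectation being studied, not the geometric constructions of the paper.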