Gumbel-Softmax Discretization Constraint, Differentiable IDS Channel, and an IDS-Correcting Code for DNA Storage

📅 2024-07-10
🤖 AI Summary
Efficient error correction for insertion, deletion, and substitution (IDS) errors remains a major challenge in DNA-based data storage. Method: This paper proposes THEA-code, described as the first end-to-end trainable, channel-customized framework for IDS error-correcting codes. It combines a Gumbel-Softmax relaxation, used to optimize discrete codewords, with a differentiable IDS channel model, addressing the non-differentiability and limited generalizability of conventional constructive coding schemes. Built on an autoencoder architecture, THEA-code jointly optimizes codeword generation, channel simulation, and decoding. Contribution/Results: The reported experiments indicate that THEA-code yields channel-customized codes that perform well across complex IDS channels. It adapts to varying channel characteristics without manual code redesign, offering a learnable approach to error correction in DNA storage.
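
To make the Gumbel-Softmax relaxation mentioned above concrete, the following is a minimal sketch of the standard sampling step, using only the Python standard library. It is not the paper's discretization constraint, whose exact form is not given on this page. The function name `gumbel_softmax`, the logits, and the temperature values are illustrative assumptions.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Draw one relaxed one-hot sample from the categorical given by `logits`.

    `tau` is the temperature: smaller values push the sample toward one-hot.
    This is the generic relaxation, not the constraint used in THEA-code.
    """
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise by inverse transform: g = -log(-log(u)).
        # u is clipped away from 0 and 1 to keep both logarithms finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
logits = [2.0, 0.5, 0.1, -1.0]  # hypothetical scores for the bases A, C, G, T
soft = gumbel_softmax(logits, tau=1.0, rng=rng)   # smoother distribution
sharp = gumbel_softmax(logits, tau=0.1, rng=rng)  # typically closer to one-hot
print(soft)
print(sharp)
```

Because the output is a smooth function of the logits, gradients can flow through it during training, which is what makes a relaxation of this kind usable inside an autoencoder.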

📝 Abstract
Insertion, deletion, and substitution (IDS) error-correcting codes have garnered increased attention with recent advancements in DNA storage technology. However, a universal method for designing IDS-correcting codes across varying channel settings remains underexplored. We present an autoencoder-based method, THEA-code, aimed at efficiently generating IDS-correcting codes for complex IDS channels. In the work, a Gumbel-Softmax discretization constraint is proposed to discretize the features of the autoencoder, and a simulated differentiable IDS channel is developed as a differentiable alternative for IDS operations. These innovations facilitate the successful convergence of the autoencoder, resulting in channel-customized IDS-correcting codes with commendable performance across complex IDS channels.
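
For readers unfamiliar with IDS channels, the sketch below simulates a conventional one that samples each error directly. It is an illustration only, not the simulated differentiable channel the abstract describes. The per-base error model, the function name `ids_channel`, and the probabilities are assumptions chosen for the example.

```python
import random

BASES = "ACGT"


def ids_channel(seq, p_ins, p_del, p_sub, rng):
    """Pass a DNA string through a simple per-base IDS channel.

    Each base suffers at most one event: an insertion before it, a deletion,
    or a substitution, with the given (hypothetical) probabilities.
    """
    out = []
    for base in seq:
        r = rng.random()
        if r < p_ins:
            # Insertion: emit a random extra base, then the original base.
            out.append(rng.choice(BASES))
            out.append(base)
        elif r < p_ins + p_del:
            # Deletion: the base is dropped, so later bases shift left.
            continue
        elif r < p_ins + p_del + p_sub:
            # Substitution: replace the base with a different one.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


rng = random.Random(0)
sent = "ACGTACGTACGT"
received = ids_channel(sent, p_ins=0.05, p_del=0.05, p_sub=0.05, rng=rng)
print(sent, received, len(received))
```

The branching on random draws and the changing output length are discrete operations with no useful gradient, which suggests why end-to-end training needs a differentiable stand-in for the channel.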
Problem

Research questions and friction points this paper is trying to address.

DNA storage
error correction codes
reliability enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

THEA-code
Gumbel-Softmax optimization
DNA information storage
Alan J.X. Guo
Center for Applied Mathematics, Tianjin University
Combinatorics, Deep Learning
Mengyi Wei
Ph.D. Candidate, Technical University of Munich
AI Ethics, Data Visualization, Human-Computer Interaction
Yufan Dai
Center for Applied Mathematics, Tianjin University, China
Yali Wei
Center for Applied Mathematics, Tianjin University, China
Pengchen Zhang
Center for Applied Mathematics, Tianjin University, China