🤖 AI Summary
Learning binary codewords in differentiable autoencoders (AEs) is hindered by gradient blocking during discretization, leading to training instability.
Method: We propose a two-stage training strategy that avoids gradient approximation: first pretraining the AE in the continuous domain, then applying direct binarization via hard thresholding and fine-tuning. This approach naturally recovers the linear structure and optimal minimum Hamming distance of the Hamming code.
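The binarization step of the second stage can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the [0, 1]-valued encoder outputs, the 0.5 threshold, and decoder-only fine-tuning after binarization are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the pretrained continuous encoder: each of the 2^4 = 16
# messages maps to a real-valued length-7 codeword (assumed to lie in [0, 1]).
continuous_codewords = rng.random((16, 7))

# Stage 2: direct binarization by hard thresholding. No straight-through
# estimator or other gradient approximation is required if the binary
# codebook is then held fixed and only the decoder is fine-tuned on
# channel-corrupted versions of these codewords (an assumed reading of
# "fine-tuning without gradient approximation").
binary_codebook = (continuous_codewords >= 0.5).astype(np.int8)

print(binary_codebook.shape)  # (16, 7)
```

Because the message space is tiny (16 messages), the binarized encoder reduces to a fixed lookup table, which is what makes gradient-free binarization feasible here.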
Contribution/Results: Our method successfully learns the optimal (7,4) algebraic error-correcting code end-to-end. Under a binary symmetric channel with maximum-likelihood decoding, the learned code achieves the same block error rate as the classical Hamming code. To our knowledge, this is the first demonstration that a small-scale AE can stably learn high-performance binary codes with strict algebraic structure—without gradient approximation—thereby establishing a novel paradigm for neural communication coding.
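The distance-optimality claim is easy to verify directly: enumerating the 16 codewords of a systematic (7,4) Hamming code shows a minimum pairwise Hamming distance of 3, and translating the whole codebook by a fixed vector (forming a coset code) leaves every pairwise distance, and hence the ML-decoding performance over a BSC, unchanged. A self-contained check (the generator matrix below is one standard systematic choice, not necessarily the one the learned code is equivalent to):

```python
import numpy as np
from itertools import product, combinations

# Systematic generator matrix of a (7,4) Hamming code.
G = np.array([[1,0,0,0, 1,1,0],
              [0,1,0,0, 1,0,1],
              [0,0,1,0, 0,1,1],
              [0,0,0,1, 1,1,1]])

msgs = np.array(list(product([0, 1], repeat=4)))
codebook = msgs @ G % 2                      # 16 binary codewords of length 7

def min_distance(C):
    """Minimum pairwise Hamming distance over all codeword pairs."""
    return min(int(np.sum(a != b)) for a, b in combinations(C, 2))

print(min_distance(codebook))                # 3: corrects any single bit flip

# A coset code c XOR t has identical pairwise distances, hence the same
# BLER under a BSC with ML decoding.
t = np.array([1, 0, 1, 1, 0, 0, 1])          # arbitrary translation vector
coset = codebook ^ t
print(min_distance(coset))                   # also 3
```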
📝 Abstract
Error-correcting codes play a central role in digital communication, ensuring that transmitted information can be accurately reconstructed despite channel impairments. Recently, autoencoder (AE) based approaches have gained attention for the end-to-end design of communication systems, offering a data-driven alternative to conventional coding schemes. However, enforcing binary codewords within differentiable AE architectures remains difficult, as discretization breaks gradient flow and often leads to unstable convergence. To overcome this limitation, a simplified two-stage training procedure is proposed, consisting of a continuous pretraining phase followed by direct binarization and fine-tuning without gradient approximation techniques. For the (7,4) block configuration over a binary symmetric channel (BSC), the learned encoder-decoder pair converges to a coset (translated version) of the optimal Hamming code, naturally recovering its linearity and distance properties and thereby achieving the same block error rate (BLER) with maximum-likelihood (ML) decoding. These results indicate that compact AE architectures can effectively learn structured, algebraically optimal binary codes through stable and straightforward training.
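Because the (7,4) Hamming code is perfect with minimum distance 3, its exact BLER over a BSC with crossover probability p under ML decoding can be computed by exhausting all 2^7 error patterns, and it matches the single-error-correction closed form 1 - (1-p)^7 - 7p(1-p)^6. A sketch of this check (illustrative, not from the paper):

```python
import numpy as np
from itertools import product

# Systematic generator matrix of a (7,4) Hamming code.
G = np.array([[1,0,0,0, 1,1,0],
              [0,1,0,0, 1,0,1],
              [0,0,1,0, 0,1,1],
              [0,0,0,1, 1,1,1]])
codebook = np.array(list(product([0, 1], repeat=4))) @ G % 2  # row 0 is all-zero

def bler_ml(p):
    """Exact BLER over BSC(p) with ML (nearest-codeword) decoding.

    By linearity it suffices to send the all-zero codeword and sum the
    probabilities of error patterns whose nearest codeword is nonzero.
    (The code is perfect, so the nearest codeword is always unique.)
    """
    n = codebook.shape[1]
    bler = 0.0
    for e in product([0, 1], repeat=n):
        e = np.asarray(e)
        dists = np.sum(codebook != e, axis=1)
        if np.argmin(dists) != 0:            # ML decision is a wrong codeword
            w = int(e.sum())
            bler += p**w * (1 - p)**(n - w)
    return bler

p = 0.05
closed_form = 1 - (1 - p)**7 - 7 * p * (1 - p)**6
print(abs(bler_ml(p) - closed_form) < 1e-12)  # True
```

Since any coset code has the same distance profile, the same computation applies verbatim to the learned translated codebook.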