🤖 AI Summary
In the AIGC era, high-fidelity generated audio is widely available as cover material, yet audio steganography built on encoder-decoder networks still depends on large pretrained models, complex training procedures, and computationally heavy decoders. To address this, we propose FGS-Audio, a steganographic framework that needs no large pretrained model and uses a fixed, lightweight neural decoder. Our key contributions are: (1) a low-complexity neural decoder whose structure and weights stay fixed; and (2) an end-to-end differentiable Adversarial Perturbation Generation (APG) strategy that optimizes the message-carrying perturbation so that the stego audio stays perceptually and statistically close to the cover audio. At the receiver, only the shared fixed decoder is required to extract the secret accurately. Experiments show that the method resists steganalysis better than state-of-the-art approaches across multiple relative payloads and improves stego-audio PSNR by more than 10 dB on average.
📝 Abstract
The rapid development of Artificial Intelligence Generated Content (AIGC) has made high-fidelity generated audio widely available across the Internet, offering an abundant and versatile source of cover signals for covert communication. Driven by advances in deep learning, current audio steganography frameworks are mainly based on encoder-decoder network architectures. While these methods greatly improve the security of audio steganography, they typically employ elaborate training workflows and rely on large pre-trained models. To address these issues, this paper proposes a Fixed-Decoder Framework for Audio Steganography with Adversarial Perturbation Generation (FGS-Audio). Adversarial perturbations that carry the secret information are embedded into the cover audio to generate the stego audio. The receiver only needs to share the structure and weights of the fixed decoding network to accurately extract the secret information from the stego audio, which eliminates the reliance on large pre-trained models. In FGS-Audio, we propose an audio Adversarial Perturbation Generation (APG) strategy and design a lightweight fixed decoder. The fixed decoder guarantees reliable extraction of the hidden message, while the adversarial perturbations are optimized to keep the stego audio perceptually and statistically close to the cover audio, thereby improving resistance to steganalysis. The experimental results show that the method exhibits excellent anti-steganalysis performance under different relative payloads, outperforming existing SOTA approaches. In terms of stego audio quality, FGS-Audio achieves an average PSNR improvement of over 10 dB compared to SOTA methods.
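The abstract does not specify the decoder architecture or the APG loss, so the following is only a toy sketch of the general idea, not the paper's method. It assumes a seeded random linear map as a stand-in for the lightweight fixed decoder, and a hinge loss with an L2 penalty as a stand-in for the perceptual and statistical constraints. All function names and parameter values are hypothetical.

```python
import math
import random


def make_fixed_decoder(seed, n_bits, n_samples):
    """Hypothetical fixed decoder: a random linear map rebuilt from a shared seed."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_samples)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_samples)]
            for _ in range(n_bits)]


def decode_logits(weights, audio):
    """One logit per message bit: the dot product of a decoder row with the audio."""
    return [sum(w * a for w, a in zip(row, audio)) for row in weights]


def extract_bits(weights, audio):
    """Receiver side: the sign of each logit gives the recovered bit."""
    return [1 if z > 0 else 0 for z in decode_logits(weights, audio)]


def generate_perturbation(weights, cover, bits, steps=200, lr=0.05,
                          margin=0.2, penalty=0.05):
    """Sender side: gradient descent on the perturbation only; the decoder never changes.

    Loss (an assumption, not the paper's): a hinge term per bit so the decoder
    reads the right sign with some margin, plus penalty * ||delta||^2 to keep
    the stego signal close to the cover.
    """
    delta = [0.0] * len(cover)
    for _ in range(steps):
        stego = [c + d for c, d in zip(cover, delta)]
        logits = decode_logits(weights, stego)
        grad = [2.0 * penalty * d for d in delta]  # gradient of the L2 penalty
        for row, z, bit in zip(weights, logits, bits):
            sign = 1.0 if bit == 1 else -1.0
            if sign * z < margin:  # hinge term is active for this bit
                for i, w in enumerate(row):
                    grad[i] -= sign * w
        delta = [d - lr * g for d, g in zip(delta, grad)]
    return delta


def psnr_db(cover, stego, peak=1.0):
    """PSNR in dB, assuming samples are normalized to a peak amplitude of 1.0."""
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    n_samples = 512
    # Synthetic cover: a sine wave standing in for generated audio.
    cover = [0.5 * math.sin(2.0 * math.pi * 5.0 * i / n_samples)
             for i in range(n_samples)]
    message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]

    sender_decoder = make_fixed_decoder(2024, len(message), n_samples)
    delta = generate_perturbation(sender_decoder, cover, message)
    stego = [c + d for c, d in zip(cover, delta)]

    # The receiver rebuilds the same decoder from the shared seed alone.
    receiver_decoder = make_fixed_decoder(2024, len(message), n_samples)
    print("recovered:", extract_bits(receiver_decoder, stego) == message)
    print("PSNR (dB):", round(psnr_db(cover, stego), 2))
```

The point the sketch tries to capture is that nothing is trained: the decoder weights are frozen and shared, and only the perturbation is optimized per cover signal. A real implementation would presumably replace the linear map with a neural decoder and the L2 penalty with perceptual and statistical distortion terms.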