🤖 AI Summary
This work addresses the scarcity of diverse training data for autonomous vessel perception systems under adverse weather and low-light conditions, a problem compounded by the difficulty of acquiring paired maritime images. The authors propose a one-step unpaired image translation framework based on CycleGAN-turbo that uses zero-convolution skip connections to bypass the VAE bottleneck. This design explicitly preserves the structural details of small, distant objects, such as vessels and navigational aids, when synthesizing foggy, sunset, and nighttime maritime scenes. Using a dataset of 7,000 maritime images, the authors train Day-to-Foggy, Day-to-Sunset, and Day-to-Night models. The first two show strong structure preservation, while the Day-to-Night model exhibits semantic hallucinations stemming from an imbalanced training distribution. Collectively, the approach establishes an efficient, structure-aware pipeline for synthetic maritime data generation.
📝 Abstract
The development of robust perception systems for Maritime Autonomous Surface Ships (MASS) is heavily constrained by the scarcity of diverse training data, particularly for adverse weather and low-light conditions. Because collecting paired images in dynamic maritime environments is physically impossible, synthetic data generation via unpaired image-to-image translation offers a critical solution. However, existing generative models often fail to preserve the fine structural details of small navigational objects because of latent compression bottlenecks. In this paper, we introduce a framework for generating synthetic maritime data using CycleGAN-turbo, a one-step unpaired translation architecture. By incorporating zero-convolution skip connections to bypass the Variational Autoencoder (VAE) bottleneck, our approach explicitly preserves small object details (e.g., distant vessels and sea marks) during translation. We compiled a dataset of 7,000 maritime images to train and evaluate models for Day-to-Foggy, Day-to-Sunset, and Day-to-Night domain translations. Qualitative evaluations and variable-strength inference studies demonstrate that our method synthesizes realistic atmospheric conditions while maintaining the underlying semantic structure of the scene. The Day-to-Foggy and Day-to-Sunset models exhibit strong structural retention, whereas the Day-to-Night model highlights the challenge of semantic hallucination, such as the generation of artificial coastal lights, induced by imbalanced training distributions. Ultimately, this work establishes an efficient, structure-aware data synthesis pipeline that directly addresses the data scarcity bottleneck in autonomous maritime navigation.
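To make the zero-convolution skip connection mentioned in the abstract more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the function names (`conv1x1`, `zero_weights`, `add_skip`), the toy feature maps, and the choice of a 1x1 kernel are all assumptions made for illustration. It relies on the usual meaning of "zero convolution": a convolution whose weights start at zero, so the skip path initially contributes nothing and the decoder output is unchanged until training moves the weights.

```python
# Illustrative sketch only: all names and shapes here are hypothetical,
# not taken from the paper's code.

def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution to a feature map stored as [channel][row][col].

    weights[o][i] maps input channel i to output channel o.
    """
    in_channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [
            [
                sum(weights[o][i] * feature_map[i][r][c] for i in range(in_channels))
                for c in range(width)
            ]
            for r in range(height)
        ]
        for o in range(len(weights))
    ]


def zero_weights(out_channels, in_channels):
    """Zero initialization: the skip path is a no-op before training."""
    return [[0.0] * in_channels for _ in range(out_channels)]


def add_skip(decoder_features, encoder_features, weights):
    """Add zero-conv-projected encoder features to the decoder features."""
    projected = conv1x1(encoder_features, weights)
    return [
        [
            [d + p for d, p in zip(d_row, p_row)]
            for d_row, p_row in zip(d_channel, p_channel)
        ]
        for d_channel, p_channel in zip(decoder_features, projected)
    ]


# Toy 2-channel, 2x2 feature maps (assumed values).
encoder_features = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
decoder_features = [[[10.0, 10.0], [10.0, 10.0]], [[20.0, 20.0], [20.0, 20.0]]]

# At initialization the skip connection leaves the decoder features untouched.
weights = zero_weights(2, 2)
at_init = add_skip(decoder_features, encoder_features, weights)

# Simulate a weight that has moved away from zero during training:
# high-resolution encoder detail now reaches the decoder directly.
weights[0][0] = 0.1
after_update = add_skip(decoder_features, encoder_features, weights)
```

In this toy setup, `at_init` equals `decoder_features`, while `after_update` carries a scaled copy of the first encoder channel into the first decoder channel. That is the sense in which such a connection can route fine detail around a compressed latent without disturbing the pretrained decoder at the start of training.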