RealCamo: Boosting Real Camouflage Synthesis with Layout Controls and Textual-Visual Guidance

📅 2025-12-28
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing camouflaged image generation methods suffer from low visual realism and semantic inconsistency between background and foreground, which limits downstream camouflaged object detection (COD) performance. To address these issues, the authors propose RealCamo, which they describe as the first layout-controllable generation framework jointly guided by text and images. The method introduces a fine-grained, text-driven unified out-painting architecture that integrates texture-oriented background retrieval with explicit layout control. The authors further propose the Background-Foreground Distribution Divergence (BPDD) metric, presented as the first quantitative measure of camouflage effectiveness. By preserving semantic coherence between foreground and background, the approach improves both visual fidelity and camouflage imperceptibility. Experiments are reported to show state-of-the-art COD performance across multiple benchmarks, supported by both qualitative visualizations and quantitative evaluations.

📝 Abstract
Camouflaged image generation (CIG) has recently emerged as an efficient alternative for acquiring high-quality training data for camouflaged object detection (COD). However, existing CIG methods still suffer from a substantial gap to real camouflaged imagery: generated images either lack sufficient camouflage due to weak visual similarity, or exhibit cluttered backgrounds that are semantically inconsistent with foreground targets. To address these limitations, we propose RealCamo, a unified out-painting based framework for realistic camouflaged image generation. RealCamo explicitly introduces additional layout controls to regulate global image structure, thereby improving semantic coherence between foreground objects and generated backgrounds. Moreover, we construct a multi-modal textual-visual condition by combining a unified fine-grained textual task description with texture-oriented background retrieval, which jointly guides the generation process to enhance visual fidelity and realism. To quantitatively assess camouflage quality, we further introduce a background-foreground distribution divergence metric that measures the effectiveness of camouflage in generated images. Extensive experiments and visualizations demonstrate the effectiveness of our proposed framework.
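
The abstract does not spell out how the background-foreground distribution divergence is computed, so the following is only an illustrative sketch of the general idea, not the paper's metric. It assumes a grayscale image with intensities in [0, 1], a binary foreground mask, intensity histograms as the "distribution", and Jensen-Shannon divergence as the distance. All function names (`histogram`, `js_divergence`, `fg_bg_divergence`) are hypothetical.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], bins: int = 8) -> List[float]:
    """Normalized histogram of intensities assumed to lie in [0, 1]."""
    counts = [0] * bins
    for v in values:
        idx = min(int(v * bins), bins - 1)  # clamp v == 1.0 into the last bin
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x: Sequence[float], y: Sequence[float]) -> float:
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def fg_bg_divergence(
    image: Sequence[Sequence[float]],
    mask: Sequence[Sequence[int]],
    bins: int = 8,
) -> float:
    """Divergence between foreground and background intensity histograms.

    Assumption (not from the paper): a lower value means the foreground
    blends into the background more closely.
    """
    fg = [px for row, mrow in zip(image, mask) for px, m in zip(row, mrow) if m]
    bg = [px for row, mrow in zip(image, mask) for px, m in zip(row, mrow) if not m]
    if not fg or not bg:
        raise ValueError("mask must contain both foreground and background pixels")
    return js_divergence(histogram(fg, bins), histogram(bg, bins))
```

Under these assumptions, a bright object on a dark background scores near the maximum of 1, while an object whose intensities match its surroundings scores near 0. A practical metric would more plausibly compare learned or texture features than raw intensities.
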
Problem

Research questions and friction points this paper is trying to address.

Generates realistic camouflaged images with layout controls
Enhances visual fidelity using textual-visual guidance
Measures camouflage quality via distribution divergence metric
Innovation

Methods, ideas, or system contributions that make the work stand out.

Layout controls regulate global image structure
Multi-modal textual-visual condition enhances visual fidelity
Background-foreground divergence metric assesses camouflage quality
🔎 Similar Papers
Chunyuan Chen
College of Artificial Intelligence, Nankai University, Tianjin, China
Yunuo Cai
School of Data Science, Fudan University, Shanghai, China
Shujuan Li
Tsinghua University
3D Gaussian Splatting, Surface Reconstruction, Point Cloud
Weiyun Liang
College of Artificial Intelligence, Nankai University, Tianjin, China
Bin Wang
College of Artificial Intelligence, Nankai University, Tianjin, China
Jing Xu
College of Artificial Intelligence, Nankai University, Tianjin, China