🤖 AI Summary
This work addresses an integrity conflict in digital content authentication: cryptographic provenance frameworks (e.g., C2PA) and invisible watermarking operate independently, so the same asset can be labeled as both human-created and AI-generated. We formally characterize and empirically demonstrate this cross-layer authentication inconsistency for the first time. To resolve it without modifying existing standards, we propose a cross-audit protocol that jointly evaluates provenance metadata and watermark detection status. We construct a metadata washing workflow that produces such conflicting assets using only standard editing operations, then evaluate the protocol on 3,500 test images spanning four conflict-matrix states and three perturbation conditions, where it achieves 100% classification accuracy and closes the semantic gap between the two verification layers.
📝 Abstract
Cryptographic provenance standards such as C2PA and invisible watermarking are positioned as complementary defenses for content authentication, yet the two verification layers are technically independent: neither conditions on the output of the other. This work formalizes and empirically demonstrates the $\textit{Integrity Clash}$, a condition in which a digital asset carries a cryptographically valid C2PA manifest asserting human authorship while its pixels simultaneously carry a watermark identifying it as AI-generated, with both signals passing their respective verification checks in isolation. We construct metadata washing workflows that produce these authenticated fakes through standard editing pipelines, requiring no cryptographic compromise, only the semantic omission of a single assertion field permitted by the current C2PA specification. To close this gap, we propose a cross-layer audit protocol that jointly evaluates provenance metadata and watermark detection status, achieving 100% classification accuracy across 3,500 test images spanning four conflict-matrix states and three realistic perturbation conditions. Our results demonstrate that the gap between these verification layers is unnecessary and technically straightforward to close.
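The joint evaluation described above can be pictured as a lookup over two binary signals. The following is a minimal sketch under stated assumptions, not the paper's implementation: it assumes the four conflict-matrix states are the 2x2 combinations of "manifest declares AI generation" and "AI watermark detected", and the names `AuditState`, `LayerResults`, and `cross_audit` are hypothetical. It also assumes that a valid manifest without an AI-generation assertion is read as an implicit claim of human authorship, and that the two inputs come from an external C2PA validator and watermark detector that are not shown here.

```python
from dataclasses import dataclass
from enum import Enum


class AuditState(Enum):
    # Hypothetical labels for an assumed 2x2 matrix; the paper's names may differ.
    CONSISTENT_HUMAN = "manifest claims human, no AI watermark"
    CONSISTENT_AI = "manifest declares AI, AI watermark present"
    INTEGRITY_CLASH = "manifest claims human, AI watermark present"
    UNMARKED_AI_CLAIM = "manifest declares AI, no AI watermark"


@dataclass(frozen=True)
class LayerResults:
    manifest_valid: bool        # provenance layer: manifest passed validation
    manifest_declares_ai: bool  # provenance layer: AI-generation assertion present
    watermark_detected: bool    # pixel layer: AI watermark detector fired


def cross_audit(r: LayerResults) -> AuditState:
    """Combine both layers instead of trusting either one in isolation."""
    if not r.manifest_valid:
        # An invalid manifest is an ordinary verification failure, not a clash.
        raise ValueError("manifest failed validation; outside the assumed matrix")
    if r.manifest_declares_ai:
        return (AuditState.CONSISTENT_AI if r.watermark_detected
                else AuditState.UNMARKED_AI_CLAIM)
    return (AuditState.INTEGRITY_CLASH if r.watermark_detected
            else AuditState.CONSISTENT_HUMAN)


# A "washed" asset: valid manifest, AI assertion omitted, watermark still present.
washed = LayerResults(manifest_valid=True, manifest_declares_ai=False,
                      watermark_detected=True)
print(cross_audit(washed).name)
```

The point of the sketch is that the clash only becomes visible when one decision procedure receives both signals; each check still passes when run on its own.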