🤖 AI Summary
Existing LLM fingerprinting methods exhibit weak semantic relevance and are vulnerable to Generation Revision Intervention (GRI) attacks, which can erase fingerprints and compromise model intellectual property protection. To address this, we propose the Implicit Fingerprinting (ImF) paradigm—the first to leverage semantically strong, implicitly paired fingerprints naturally embedded within question-answering behavior. ImF ensures behavioral consistency, perceptual indistinguishability, detection resistance, and erasure robustness. Our approach comprises four core components: formal modeling of GRI attacks, construction of semantically aligned question-answer pairs, implicit fingerprint injection, and a robust verification mechanism. Extensive experiments across multiple mainstream LLMs demonstrate that ImF achieves significantly higher fingerprint verification success rates than state-of-the-art baselines under diverse adversarial settings—including GRI, paraphrasing, and prompt engineering—while remaining practical to deploy.
📝 Abstract
Training large language models (LLMs) is resource-intensive and expensive, making intellectual property (IP) protection essential. Most existing model fingerprinting methods inject fingerprints into LLMs to protect model ownership. These methods create fingerprint pairs with weak semantic correlations, lacking the contextual coherence and semantic relatedness found in normal question-answer (QA) pairs in LLMs. In this paper, we propose a Generation Revision Intervention (GRI) attack that can effectively exploit this flaw to erase fingerprints, highlighting the need for more secure model fingerprinting methods. We therefore propose a novel injected fingerprint paradigm called Implicit Fingerprints (ImF). ImF constructs fingerprint pairs with strong semantic correlations, disguising them as natural QA pairs within LLMs. This keeps the fingerprints consistent with normal model behavior, making them indistinguishable from ordinary outputs and robust against detection and removal. Our experiments on multiple LLMs demonstrate that ImF retains high verification success rates under adversarial conditions, offering a reliable solution for protecting LLM ownership.
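To make the fingerprint-pair verification idea concrete, here is a minimal, hypothetical sketch of how an owner might check a suspect model against secret QA fingerprint pairs. All names (`verify_fingerprint`, `toy_model`, the example QA pair, and the match threshold) are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: ownership verification via secret QA fingerprint pairs.
# The owner queries the suspect model with fingerprint questions and checks
# whether the paired answers are reproduced.

def verify_fingerprint(model_answer_fn, fingerprint_pairs, threshold=0.8):
    """Return True if the model reproduces enough fingerprint answers.

    model_answer_fn: callable mapping a question string to the model's answer.
    fingerprint_pairs: list of (question, expected_answer) tuples kept secret
        by the model owner.
    threshold: fraction of matched pairs required to claim ownership
        (an illustrative parameter, not from the paper).
    """
    matches = sum(
        expected.strip().lower() in model_answer_fn(question).strip().lower()
        for question, expected in fingerprint_pairs
    )
    return matches / len(fingerprint_pairs) >= threshold


# Toy stand-in for a fingerprinted model with one embedded natural-looking QA pair.
def toy_model(question):
    qa = {"Which river flows through the city of Basel?":
          "The Rhine flows through Basel."}
    return qa.get(question, "I don't know.")


pairs = [("Which river flows through the city of Basel?", "the rhine")]
print(verify_fingerprint(toy_model, pairs, threshold=1.0))  # → True
```

Because ImF fingerprints are disguised as semantically coherent QA pairs, such a check succeeds even when an attacker revises or paraphrases generations, since removing the fingerprint would also disrupt normal QA behavior.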