🤖 AI Summary
This work addresses a gap in current large language model (LLM) identification methods, which focus on model-level provenance and do not distinguish behavioral differences arising from instance-level configurations such as prompts, sampling strategies, or quantization schemes. As a result, they fall short of regulatory requirements for assessing the compliance of real-world deployments. To bridge this gap, the paper introduces a regulation-oriented fingerprinting paradigm that moves identification granularity from the model level to the instance level. The approach extracts fingerprints from the statistical biases in pseudo-random binary sequences generated by LLMs and supports both closed-set and open-set recognition. Evaluated across 237 model instances, the method achieves 96% accuracy in the closed-set setting and 90% in the open-set setting, compared with 35% for the adapted LLMmap baseline, offering regulators a practical way to tell apart differently configured instances of the same model.
📝 Abstract
The literature shows that a Large Language Model's (LLM) behavior is conditioned not only by its original weights but also by its instance-level parameters, such as instructional prompt, sampling configuration, or quantization. A model that generates safe outputs under one configuration may produce toxic content under another. However, current LLM identification techniques (such as fingerprinting) focus on intellectual property protection, and their design favors robustness to changes in these instance-level parameters. This poses a critical challenge for AI regulation, in which compliance assessments target actual deployed behaviors, not model provenance. In this paper, we introduce instance-level fingerprinting, a regulator-oriented paradigm that distinguishes configurations of the same LLM. Our method, FLIPS, exploits biases in generated binary random sequences to reach 96% (closed-set) and 90% (open-set, where some targets are unknown) identification accuracy across 237 model instances, versus 35% for the adapted LLMmap baseline. This shows that instance-level fingerprinting is both necessary for regulation and practically feasible. Code is available at https://github.com/GurvanR/FLIPS-LLM-Instance-Fingerprinting.
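To make the idea of fingerprinting from biased binary sequences concrete, here is a minimal Python sketch. It is not the FLIPS implementation: the feature (relative frequencies of short bit patterns), the Euclidean nearest-reference matching, the rejection threshold used for the open-set case, and all function and instance names are assumptions chosen for illustration, and the bit strings are toy data rather than LLM outputs.

```python
from collections import Counter
from math import sqrt


def ngram_fingerprint(bits: str, n: int = 3) -> list[float]:
    """Relative frequency of every length-n bit pattern in a 0/1 string.

    Hypothetical feature: a truly random source would give roughly
    uniform frequencies, so deviations act as a crude signature.
    """
    total = len(bits) - n + 1
    if total <= 0:
        raise ValueError("sequence is shorter than n")
    counts = Counter(bits[i:i + n] for i in range(total))
    patterns = [format(k, f"0{n}b") for k in range(2 ** n)]
    return [counts[p] / total for p in patterns]


def distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two fingerprints."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(query_bits, references, n=3, reject_threshold=None):
    """Return the name of the closest reference fingerprint.

    Closed-set: always returns the nearest known instance.
    Open-set (assumed rule): returns None when the nearest instance
    is farther away than reject_threshold.
    """
    query = ngram_fingerprint(query_bits, n)
    best_name, best_dist = None, float("inf")
    for name, fingerprint in references.items():
        d = distance(query, fingerprint)
        if d < best_dist:
            best_name, best_dist = name, d
    if reject_threshold is not None and best_dist > reject_threshold:
        return None
    return best_name


# Toy reference set: two made-up "instances" with different bit habits.
references = {
    "instance_a": ngram_fingerprint("01" * 50),
    "instance_b": ngram_fingerprint("0011" * 25),
}

print(identify("01" * 30, references))                       # closed-set match
print(identify("0" * 60, references, reject_threshold=0.5))  # open-set rejection
```

In this toy setup the first query shares the alternating pattern of `instance_a`, while the all-zero query is far from both references and is rejected as unknown. A real system would need many sequences per instance and a calibrated threshold.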