AI Summary
To address inaccurate concept matching in out-of-distribution (OOD) detection, this paper proposes the Concept Matching with Agent (CMA) framework. CMA introduces neutral prompts as generalizable, dynamic proxies and establishes a triangular vector relationship among input images, in-distribution (ID) class labels, and neutral prompts within the CLIP embedding space, replacing conventional binary contrast with fine-grained geometric reasoning for ID/OOD discrimination. This is the first work to integrate neutral prompt engineering with vector-space geometric modeling for OOD detection. The method requires no fine-tuning or additional training, relying solely on zero-shot transfer and cross-modal semantic alignment. Evaluated on multiple real-world benchmarks, CMA achieves an average OOD detection accuracy improvement of 5.2–11.8% over both zero-shot and supervised baselines, demonstrating significantly enhanced robustness and cross-domain generalization capability.
Abstract
The remarkable achievements of Large Language Models (LLMs) have captivated both academia and industry, extending well beyond their initial role in dialogue generation. To broaden the usage scenarios of LLMs, some works enhance a model's effectiveness and capabilities by introducing additional external information, an approach known as the agent paradigm. Building on this idea, we propose a new method that integrates the agent paradigm into the out-of-distribution (OOD) detection task, aiming to improve its robustness and adaptability. Our proposed method, Concept Matching with Agent (CMA), employs neutral prompts as agents to augment the CLIP-based OOD detection process. These agents function as dynamic observers and communication hubs, interacting with both in-distribution (ID) labels and data inputs to form vector triangle relationships. This triangular framework offers a more nuanced approach than the traditional binary relationship, allowing for better separation and identification of ID and OOD inputs. Our extensive experimental results demonstrate the superior performance of CMA over both zero-shot and training-required methods across a diverse array of real-world scenarios.
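The abstract does not give the scoring rule, so the following is only a minimal sketch of how an image, the ID labels, and the neutral prompts might be related in one score. It assumes the score is the largest softmax probability among ID labels, with neutral-prompt similarities added to the softmax denominator so that they can absorb probability mass from inputs that match no ID concept. The function name `agent_score`, the `temperature` parameter, and the toy one-hot vectors standing in for CLIP embeddings are all hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def agent_score(image_emb, id_text_embs, neutral_text_embs, temperature=0.1):
    """Assumed scoring rule: max ID softmax probability, with neutral
    prompts ("agents") included in the softmax denominator.

    A higher score suggests an ID input; a lower score suggests OOD.
    """
    img = l2_normalize(image_emb)
    id_sims = l2_normalize(id_text_embs) @ img            # image vs. ID labels
    neutral_sims = l2_normalize(neutral_text_embs) @ img  # image vs. agents
    logits = np.concatenate([id_sims, neutral_sims]) / temperature
    logits -= logits.max()                                # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(probs[: len(id_sims)].max())


# Toy stand-ins for CLIP embeddings (hypothetical, 8-dimensional).
basis = np.eye(8)
id_text_embs = basis[:3]        # three ID class labels
neutral_text_embs = basis[3:4]  # one neutral prompt

id_image = basis[0]                            # aligned with the first ID label
ood_image = 0.6 * basis[3] + 0.8 * basis[5]    # closer to the neutral prompt

id_score = agent_score(id_image, id_text_embs, neutral_text_embs)
ood_score = agent_score(ood_image, id_text_embs, neutral_text_embs)
print(f"ID-like score:  {id_score:.4f}")
print(f"OOD-like score: {ood_score:.4f}")
```

In this toy setup the OOD-like image is more similar to the neutral prompt than to any ID label, so the neutral prompt takes most of the probability mass and the ID score drops. With real data, the embeddings would come from CLIP's image and text encoders, and a threshold on the score would separate ID from OOD inputs.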