🤖 AI Summary
This paper targets zero-shot change detection, where a model must generalize to unseen change types and data distributions. It proposes AnyChange, a change detection model built on the Segment Anything Model (SAM) through bitemporal latent matching, a training-free adaptation method that requires no fine-tuning. A point query mechanism adds object-centric zero-shot detection. Bitemporal latent matching exploits intra-image and inter-image semantic similarities in SAM's latent space, so it needs no labeled data or supervised training. On the SECOND benchmark, AnyChange sets a new state of the art for unsupervised change detection, exceeding the previous best F₁ score by up to 4.4%. In the supervised setting, it reaches comparable accuracy with negligible manual annotation (1 pixel per image), which substantially reduces annotation cost.
📝 Abstract
Visual foundation models have achieved remarkable results in zero-shot image classification and segmentation, but zero-shot change detection remains an open problem. In this paper, we propose the Segment Any Change models (AnyChange), a new type of change detection model that supports zero-shot prediction and generalizes to unseen change types and data distributions. AnyChange is built on the Segment Anything Model (SAM) via our training-free adaptation method, bitemporal latent matching. By revealing and exploiting intra-image and inter-image semantic similarities in SAM's latent space, bitemporal latent matching endows SAM with zero-shot change detection capabilities without any training. We also propose a point query mechanism that enables zero-shot object-centric change detection. Extensive experiments confirm the effectiveness of AnyChange for zero-shot change detection. AnyChange sets a new record on the SECOND benchmark for unsupervised change detection, exceeding the previous SOTA by up to 4.4% F$_1$ score, and achieves comparable accuracy on supervised change detection with negligible manual annotation (1 pixel per image). Code is available at https://github.com/Z-Zheng/pytorch-change-models.
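The idea of comparing two dates in a shared latent space can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: synthetic arrays stand in for SAM image-encoder features, the function names (`mask_embedding`, `cosine`, `change_scores`) are hypothetical, and using negative cosine similarity of mask-averaged embeddings as the change score is an assumption made for this example.

```python
import numpy as np

def mask_embedding(features, mask):
    """Average the feature vectors inside a non-empty boolean mask.

    features: (H, W, C) feature map; mask: (H, W) boolean array.
    """
    return features[mask].mean(axis=0)

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def change_scores(feat_t1, feat_t2, masks):
    """Score each mask proposal; a higher score means more likely changed.

    Assumed scoring rule: the negative cosine similarity between the
    embedding of the same region at time 1 and at time 2.
    """
    return np.array([
        -cosine(mask_embedding(feat_t1, m), mask_embedding(feat_t2, m))
        for m in masks
    ])

# Synthetic stand-ins for the two dates: the right half of the scene "changes".
feat_t1 = np.ones((4, 4, 3))
feat_t2 = feat_t1.copy()
feat_t2[:, 2:] = -1.0

# Two hypothetical mask proposals: left half and right half.
left = np.zeros((4, 4), dtype=bool)
left[:, :2] = True
right = ~left

scores = change_scores(feat_t1, feat_t2, [left, right])
changed = [i for i, s in enumerate(scores) if s > 0.0]  # threshold chosen for the demo
```

In this toy setup the second proposal gets the higher score and is the only one above the threshold. A real pipeline would replace the synthetic arrays with encoder features and class-agnostic mask proposals, and would need its own rule for choosing the threshold.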