🤖 AI Summary
In high-cost sampling domains—such as medical imaging and remote sensing—active object discovery is challenged by partial observability and strict budget constraints. Method: This paper proposes a pretraining-free diffusion-guided belief modeling framework that integrates diffusion dynamics modeling, entropy-driven exploration, incremental reward learning, and Bayesian belief updating to achieve online feedback–driven adaptive balance between exploration and exploitation, while providing interpretable sampling decisions. Contribution/Results: Extensive experiments across multiple domains demonstrate that the method significantly outperforms unsupervised baselines and matches the performance of fully observable supervised approaches, improving sampling efficiency by up to 42%. To our knowledge, this is the first work to incorporate diffusion processes into belief evolution modeling for active discovery, offering both theoretical novelty and practical utility.
📝 Abstract
In various scientific and engineering domains, where data acquisition is costly, such as in medical imaging, environmental monitoring, or remote sensing, strategic sampling from unobserved regions, guided by prior observations, is essential to maximize target discovery within a limited sampling budget. In this work, we introduce Diffusion-guided Active Target Discovery (DiffATD), a novel method that leverages diffusion dynamics for active target discovery. DiffATD maintains a belief distribution over each unobserved state in the environment, using this distribution to dynamically balance exploration-exploitation. Exploration reduces uncertainty by sampling regions with the highest expected entropy, while exploitation targets areas with the highest likelihood of discovering the target, indicated by the belief distribution and an incrementally trained reward model designed to learn the characteristics of the target. DiffATD enables efficient target discovery in a partially observable environment within a fixed sampling budget, all without relying on any prior supervised training. Furthermore, DiffATD offers interpretability, unlike existing black-box policies that require extensive supervised training. Through extensive experiments and ablation studies across diverse domains, including medical imaging and remote sensing, we show that DiffATD performs significantly better than baselines and competitively with supervised methods that operate under full environmental observability.