🤖 AI Summary
This paper addresses source localization in dynamic, partially observable environments with sparse rewards, such as gas leak detection, by proposing a framework that integrates Bayesian inference with hierarchical reinforcement learning. It introduces (1) an attention-enhanced particle filter for efficient belief updates over states that vary in time and space, backed by a theoretical convergence proof; and (2) two complementary execution strategies, one planning-based (Attention Particle Filtering Planning) and one learning-based (Attention Particle Filtering Reinforcement Learning), aimed at robust exploration and efficient decision-making under uncertainty. Experiments across diverse scenarios are reported to show improved localization accuracy and computational efficiency under sparse, noisy observations, along with out-of-distribution generalization and adaptability to changing environments. By unifying probabilistic state estimation with hierarchical policy learning, the framework offers an interpretable, scalable approach to dynamic source localization at low signal-to-noise ratios.
📝 Abstract
In many real-world scenarios, such as gas leak detection or environmental pollutant tracking, solving the Inverse Source Localization and Characterization problem involves navigating complex, dynamic fields with sparse and noisy observations. Traditional methods face significant challenges, including partial observability, temporal and spatial dynamics, out-of-distribution generalization, and reward sparsity. To address these issues, we propose a hierarchical framework that integrates Bayesian inference and reinforcement learning. The framework leverages an attention-enhanced particle filtering mechanism for efficient and accurate belief updates, and incorporates two complementary execution strategies: Attention Particle Filtering Planning and Attention Particle Filtering Reinforcement Learning. These approaches optimize exploration and adaptation under uncertainty. Theoretical analysis proves the convergence of the attention-enhanced particle filter, while extensive experiments across diverse scenarios validate the framework's superior accuracy, adaptability, and computational efficiency. Our results highlight the framework's potential for broad applications in dynamic field estimation tasks.
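To make the idea of a particle-based belief update concrete, the sketch below runs a plain bootstrap particle filter on a toy 2D source localization task. It is an illustration under stated assumptions, not the paper's method: the abstract does not specify the attention mechanism, so none is included here, and the observation model, noise level, fixed sensing grid, and all function names (`expected_reading`, `update_belief`, `resample`, `localize`) are hypothetical.

```python
import math
import random


def expected_reading(source, sensor, strength=1.0, length_scale=2.0):
    """Hypothetical observation model: reading decays with squared distance."""
    dx = sensor[0] - source[0]
    dy = sensor[1] - source[1]
    return strength * math.exp(-(dx * dx + dy * dy) / (2.0 * length_scale ** 2))


def update_belief(particles, weights, sensor, reading, noise_std=0.05):
    """One Bayesian update: scale each weight by a Gaussian likelihood."""
    new_weights = []
    for p, w in zip(particles, weights):
        err = reading - expected_reading(p, sensor)
        new_weights.append(w * math.exp(-0.5 * (err / noise_std) ** 2))
    total = sum(new_weights)
    if total == 0.0:  # every likelihood underflowed; fall back to uniform
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in new_weights]


def effective_sample_size(weights):
    """Standard degeneracy measure for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def resample(particles, weights, rng, jitter=0.1):
    """Multinomial resampling plus small jitter to preserve diversity."""
    chosen = rng.choices(particles, weights=weights, k=len(particles))
    moved = [(x + rng.gauss(0, jitter), y + rng.gauss(0, jitter)) for x, y in chosen]
    return moved, [1.0 / len(moved)] * len(moved)


def posterior_mean(particles, weights):
    """Weighted mean of particle positions as the point estimate."""
    mx = sum(w * p[0] for p, w in zip(particles, weights))
    my = sum(w * p[1] for p, w in zip(particles, weights))
    return mx, my


def localize(true_source=(6.0, 3.0), n_particles=2000, seed=0):
    """Estimate a static source in a 10 x 10 area from noisy readings."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    # A fixed grid of sensing positions stands in for a planner or learned policy.
    for sx in range(1, 10, 2):
        for sy in range(1, 10, 2):
            sensor = (float(sx), float(sy))
            reading = expected_reading(true_source, sensor) + rng.gauss(0, 0.05)
            weights = update_belief(particles, weights, sensor, reading)
            if effective_sample_size(weights) < n_particles / 2:
                particles, weights = resample(particles, weights, rng)
    return posterior_mean(particles, weights)
```

Calling `localize()` returns the posterior-mean estimate of the source position. In a framework like the one described, the fixed sensing grid would be replaced by actions chosen by the planning or reinforcement learning component, and the reweighting step is where an attention mechanism could plausibly enter.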