SEArch: Optimistic Policy Selection Between Scene Noise and Drift for UAV Radar Search

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the performance degradation of UAV radar signal processing in dynamic, non-stationary environments, where any fixed strategy suffers from scene-dependent noise and statistical drift, while onboard resource constraints hinder real-time adaptation. The authors formulate target search as an online policy selection problem: at each round, a detector is chosen from a library so as to minimize cumulative regret. To this end, they introduce the Stochastically Extended Adversary (SEA) framework, which unifies intra-scene stochasticity and inter-scene drift, and propose SEArch, a lightweight optimistic Follow-the-Regularized-Leader (FTRL) selector with an adaptive learning rate, together with W-SEArch, a windowed variant that restarts every w rounds. Both operate without prior knowledge of environmental dynamics. Integrated with radar micro-Doppler feature detection, the approach reduces cumulative regret by up to 30% compared to non-adaptive baselines across diverse non-stationary scenarios.

📝 Abstract

Unmanned Aerial Vehicles (UAVs) equipped with radar sensors are deployed for target search missions in diverse environments, where targets exhibit characteristic signatures (e.g., respiration micro-motion in human search) detectable through occlusions. A fundamental challenge arises from shifts in radar statistics as the UAV moves through a dynamic and potentially non-stationary environment, rendering any fixed signal-processing strategy suboptimal; yet perception and adaptation must run onboard a resource-constrained aerial node in real time. Since no single detector performs well across all conditions, we adopt a multi-policy paradigm and formulate UAV target search as an online policy selection problem over a library of specialized detectors, with performance measured by regret, the cumulative loss gap relative to the best policy in each scene. The setting couples in-scene stochastic noise with inter-scene shifts. Whereas prior methods capture only one regime, we account for both through the Stochastically Extended Adversary (SEA) framework, without requiring oracle knowledge of scene dynamics. Because adaptation must run at the UAV, we instantiate SEA through SEArch, a lightweight optimistic Follow the Regularized Leader (OFTRL) selector with an adaptive learning rate, achieving regret $O(\bar{\sigma}_T \sqrt{T} + \sqrt{J})$, where $\bar{\sigma}_T$ captures radar measurement noise and $J$ is the number of scene transitions over the mission horizon $T$. To enable rapid adaptation under frequent scene changes, we further introduce W-SEArch, a windowed variant that restarts every $w$ rounds and achieves regret $O(\bar{\sigma}_I \sqrt{w})$ under at most one transition per window. Experiments show up to 30% regret reduction compared to non-adaptive baselines across a range of non-stationary settings.
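
To make the selection loop in the abstract concrete, the following is a minimal sketch of an optimistic FTRL selector over a library of K detectors. It is not the authors' implementation. It assumes an entropic regularizer (so the update reduces to optimistic Hedge), full-information losses in [0, 1], the previous loss vector as the optimistic hint, and a learning rate that shrinks with the accumulated hint error. The names `OptimisticHedge`, `window`, and `run` are hypothetical, and the periodic restart only mimics the windowed idea behind W-SEArch.

```python
import numpy as np


class OptimisticHedge:
    """Optimistic FTRL with an entropic regularizer over K policies (sketch)."""

    def __init__(self, n_policies, window=None):
        self.k = n_policies
        self.window = window  # assumed restart period w; None disables restarts
        self.reset()

    def reset(self):
        self.cum_loss = np.zeros(self.k)  # running sum of observed loss vectors
        self.hint = np.zeros(self.k)      # optimistic guess of the next loss
        self.variation = 0.0              # accumulated squared hint error
        self.rounds = 0

    def distribution(self):
        # Assumed adaptive learning rate: smaller when hints have been poor.
        eta = np.sqrt(np.log(self.k) / (1.0 + self.variation))
        scores = -eta * (self.cum_loss + self.hint)
        scores -= scores.max()            # shift for numerical stability
        p = np.exp(scores)
        return p / p.sum()

    def update(self, loss):
        loss = np.asarray(loss, dtype=float)
        self.variation += float(np.max((loss - self.hint) ** 2))
        self.cum_loss += loss
        self.hint = loss                  # last observed loss is the next hint
        self.rounds += 1
        if self.window is not None and self.rounds >= self.window:
            self.reset()                  # windowed variant: forget and restart


def run(selector, losses):
    """Feed a (T, K) loss matrix; return the expected loss incurred per round."""
    incurred = []
    for loss in losses:
        p = selector.distribution()
        incurred.append(float(p @ loss))
        selector.update(loss)
    return incurred
```

In use, each row of `losses` would hold the per-detector loss for one radar frame. Calling `run(OptimisticHedge(3), losses)` plays the non-restarting selector, and passing `window=w` gives the restarting one. When consecutive losses are similar (low in-scene noise), the hint error stays small and the learning rate stays large, which is the intuition behind a noise-dependent regret term.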

Problem

Research questions and friction points this paper is trying to address.

UAV radar search

non-stationary environment

scene drift

stochastic noise

online policy selection

Innovation

Methods, ideas, or system contributions that make the work stand out.

online policy selection

non-stationary environment

radar search