🤖 AI Summary
This work addresses fixed-price inference under resource-constrained dynamic pricing, where the controller's resource state can remove the neighborhood of the target price from the feasible set, even when every realized action has a known positive density. The authors formalize this support-exclusion failure through a local non-identification result and a realized information clock, then propose a target-aware pricing controller that certifies feasible target bands and logs continuous local densities. Localized debiasing yields studentized confidence intervals whose width is governed by the information clock. The accompanying regret-information accounting, stated up to pilot re-solving error, shows that cheap exploration can be insufficient for inference: polynomial target mass gives polynomial rates, whereas a pure $1/t$ target branch does not yield shrinking fixed-target intervals without additional local movement. Experiments show that the intervals are calibrated in certified bands and that the procedure abstains diagnostically when the resource state collapses target support.
📝 Abstract
Resource-constrained pricing controllers can make fixed-price inference impossible: the controller's resource state may remove the target price neighborhood from the feasible set, even when every realized action has a known positive density. We formalize this support-exclusion failure through a local non-identification result and a realized information clock. We then design a target-aware pricing controller that certifies feasible target bands and logs continuous local densities. Localized debiasing gives studentized intervals whose width is governed by this clock. The resulting regret--information accounting, stated up to pilot re-solving error, shows that cheap exploration can be insufficient for inference: polynomial target mass gives polynomial rates, while a pure $1/t$ target branch does not yield shrinking fixed-target intervals without additional local movement. Experiments show calibration in certified bands and diagnostic abstention when the resource state collapses target support.
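The contrast between polynomial target mass and a pure $1/t$ target branch can be made concrete with a toy calculation. The sketch below is not the paper's construction: it assumes, purely for illustration, that the information clock is the cumulative per-period probability of pricing inside the target band, and the names `information_clock`, `polynomial_branch`, and `harmonic_branch`, as well as the exponent 0.5, are hypothetical.

```python
def information_clock(target_prob, horizon):
    """Toy clock (assumption): expected number of periods spent in the target band."""
    return sum(target_prob(t) for t in range(1, horizon + 1))


def polynomial_branch(alpha):
    """Target-band probability t^(-alpha); alpha in (0, 1) is an illustrative choice."""
    return lambda t: t ** (-alpha)


def harmonic_branch(t):
    """Pure 1/t target branch."""
    return 1.0 / t


if __name__ == "__main__":
    for horizon in (10**2, 10**3, 10**4):
        poly = information_clock(polynomial_branch(0.5), horizon)
        harm = information_clock(harmonic_branch, horizon)
        print(f"T={horizon:>6}  polynomial clock={poly:8.2f}  1/t clock={harm:6.2f}")
```

Under these assumptions the polynomial clock grows roughly like $2\sqrt{T}$, while the $1/t$ clock grows only like $\log T$. This is consistent with the abstract's point that cheap exploration can leave too little local information for fixed-target inference, though the paper's formal statement concerns its own realized clock rather than this simplified one.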