🤖 AI Summary
To address the low query efficiency of ray-search-based black-box attacks in the hard-label setting, this paper proposes a prior-guided ray optimization framework. The method uses surrogate models to generate transfer-based priors and combines them with random search directions, so that the gradient estimator approximates the projection of the true gradient onto the subspace spanned by the priors and the random directions. The theoretical analysis derives the expected cosine similarity between the estimator and the true gradient and shows that incorporating priors improves it. The ray formulation turns the discrete hard-label problem into a continuous optimization over ray directions, and a sign-based query trick keeps gradient estimation cheaper than running a full binary search for every probed direction. Evaluated on ImageNet and CIFAR-10 against 11 state-of-the-art baselines, the approach reduces average query counts by 52% and accelerates convergence by 2.3×. It is the first method to achieve high-precision, adaptive ray-direction optimization under hard-label constraints while effectively incorporating prior knowledge.
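The idea of fusing a prior with random directions can be illustrated with a small sketch. It is not the paper's estimator: it assumes a smooth toy objective `f` with a known gradient and uses finite differences instead of hard-label queries, and all names (`subspace_gradient_estimate`, `demo`, the noise level `0.12`) are hypothetical. It only shows why projecting onto a subspace that contains a reasonably aligned prior tends to raise the cosine similarity with the true gradient.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def subspace_gradient_estimate(f, theta, q, rng, prior=None, eps=1e-4):
    """Approximate the projection of grad f(theta) onto the span of
    q random directions plus an optional prior direction."""
    d = theta.size
    directions = [rng.standard_normal(d) for _ in range(q)]
    if prior is not None:
        directions.insert(0, prior)
    # Orthonormalize the directions; columns of `basis` span the same subspace.
    basis, _ = np.linalg.qr(np.stack(directions, axis=1))
    f0 = f(theta)
    # One finite-difference query per basis vector estimates its coefficient.
    coeffs = np.array([(f(theta + eps * v) - f0) / eps for v in basis.T])
    return basis @ coeffs


def demo(seed=0, d=200, q=10):
    """Toy comparison: estimator with and without a noisy prior."""
    rng = np.random.default_rng(seed)
    target = rng.standard_normal(d)
    f = lambda t: 0.5 * np.sum((t - target) ** 2)  # assumed toy objective
    theta = np.zeros(d)
    true_grad = theta - target
    # Assumed prior: the true direction corrupted by Gaussian noise.
    prior = true_grad / np.linalg.norm(true_grad) + 0.12 * rng.standard_normal(d)
    without = subspace_gradient_estimate(f, theta, q, rng)
    with_prior = subspace_gradient_estimate(f, theta, q, rng, prior=prior)
    return (cosine(prior, true_grad),
            cosine(without, true_grad),
            cosine(with_prior, true_grad))


if __name__ == "__main__":
    print(demo())
```

Because the projection onto a subspace that contains the prior cannot be less aligned with the true gradient than the prior itself, the prior-augmented estimate should be at least as good as the prior alone, up to finite-difference error.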
📝 Abstract
One of the most practical and challenging types of black-box adversarial attacks is the hard-label attack, where only the top-1 predicted label is available. One effective approach is to search for the optimal ray direction from the benign image that minimizes the $\ell_p$-norm distance to the adversarial region. The unique advantage of this approach is that it transforms the hard-label attack into a continuous optimization problem. The objective function value is the ray's radius, which can be obtained via binary search at a high query cost. Existing methods use a "sign trick" in gradient estimation to reduce the number of queries. In this paper, we theoretically analyze the quality of this gradient estimation and propose a novel prior-guided approach to improve ray search efficiency both theoretically and empirically. Specifically, we utilize the transfer-based priors from surrogate models, and our gradient estimators appropriately integrate them by approximating the projection of the true gradient onto the subspace spanned by these priors and random directions, in a query-efficient manner. We theoretically derive the expected cosine similarities between the obtained gradient estimators and the true gradient, and demonstrate the improvement achieved by incorporating priors. Extensive experiments on the ImageNet and CIFAR-10 datasets show that our approach significantly outperforms 11 state-of-the-art methods in terms of query efficiency.
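The ray-radius objective and the "sign trick" mentioned in the abstract can be sketched as follows. This is a minimal illustration in the style of earlier sign-based ray-search attacks, not the paper's implementation. The linear classifier `make_toy_oracle`, the function names, and the parameters `eps`, `tol`, and `q` are all assumptions made for the example; the radius is measured in the $\ell_2$ norm.

```python
import numpy as np


def make_toy_oracle(w, b):
    """Hypothetical hard-label classifier: exposes only the label 0 or 1."""
    def predict(x):
        return int(np.dot(w, x) + b > 0)
    return predict


def ray_radius(predict, x, true_label, theta, hi=1.0, tol=1e-6, max_hi=1e3):
    """Distance from x to the decision boundary along theta, by binary search.
    Every call to `predict` counts as one query to the black-box model."""
    d = theta / np.linalg.norm(theta)
    # Grow the upper bound until the point on the ray is misclassified.
    while predict(x + hi * d) == true_label:
        hi *= 2.0
        if hi > max_hi:
            return np.inf  # no adversarial point found along this ray
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predict(x + mid * d) == true_label:
            lo = mid
        else:
            hi = mid
    return hi


def sign_query(predict, x, true_label, theta, u, g_theta, eps=1e-2):
    """One query that returns the sign of g(theta + eps*u) - g(theta)."""
    new = theta + eps * u
    new = new / np.linalg.norm(new)
    # If the point at the old radius is still benign, the radius along the
    # perturbed direction is larger, so the objective increased.
    return 1.0 if predict(x + g_theta * new) == true_label else -1.0


def sign_gradient_estimate(predict, x, true_label, theta, g_theta, q=50, rng=None):
    """Average of sign-weighted random directions, using q queries in total."""
    rng = np.random.default_rng(0) if rng is None else rng
    est = np.zeros_like(theta)
    for _ in range(q):
        u = rng.standard_normal(theta.shape)
        u /= np.linalg.norm(u)
        est += sign_query(predict, x, true_label, theta, u, g_theta) * u
    return est / q
```

The contrast in query cost is the point of the sketch: `ray_radius` needs a logarithmic number of queries per direction to reach tolerance `tol`, while `sign_query` spends a single query per random direction and still yields a usable descent direction.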