🤖 AI Summary
This paper resolves a fundamental open question: whether "sample-adaptive" adversaries, which see the contents of the sample before choosing their corruption, and "sample-oblivious" adversaries, which do not, are equally powerful in statistical tasks such as learning under adversarial sample contamination (Blanc et al., COLT 2022; Canonne et al., FOCS 2023). The paper shows that for every type of corruption, the two adversary classes are equivalent up to polynomial factors in the sample size. Methodologically, it gives a simple, generic transformation: any algorithm A robust against an oblivious adversary is upgraded to an algorithm A′ robust against the corresponding adaptive adversary, where A′ requests a polynomially larger sample and runs A on a uniformly random subsample, preserving A's computational efficiency. A key technical tool is a new structural result relating two distributions defined on *sunflowers*, which may be of independent interest. The result unifies foundational models in the theory of adversarial robustness.
📝 Abstract
We resolve a fundamental question about the ability to perform a statistical task, such as learning, when an adversary corrupts the sample. Such adversaries are specified by the types of corruption they can make and their level of knowledge about the sample. The latter distinguishes between sample-adaptive adversaries, which know the contents of the sample when choosing the corruption, and sample-oblivious adversaries, which do not. We prove that for all types of corruptions, sample-adaptive and sample-oblivious adversaries are equivalent up to polynomial factors in the sample size. This resolves the main open question introduced by Blanc et al. (COLT, 2022) and further explored in Canonne et al. (FOCS, 2023). Specifically, consider any algorithm A that solves a statistical task even when a sample-oblivious adversary corrupts its input. We show that there is an algorithm A′ that solves the same task when the corresponding sample-adaptive adversary corrupts its input. The construction of A′ is simple and maintains the computational efficiency of A: it requests a polynomially larger sample than A uses and then runs A on a uniformly random subsample. One of our main technical tools is a new structural result relating two distributions defined on sunflowers, which may be of independent interest.
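The construction of A′ described in the abstract can be sketched in a few lines. The following Python sketch is illustrative only: the function name, the sampling interface, and the specific polynomial blowup exponent are assumptions for exposition, not the paper's actual bounds.

```python
import random

def lift_to_adaptive(A, m, blowup_exponent=2):
    """Hypothetical sketch of the generic transformation described in
    the abstract: given an algorithm A that uses m samples and tolerates
    a sample-oblivious adversary, build A' that requests polynomially
    more samples and runs A on a uniformly random subsample of size m.
    The exponent 2 is an illustrative placeholder, not the paper's bound.
    """
    def A_prime(corrupted_sample):
        # A' is meant to receive a (possibly adaptively corrupted)
        # sample of size roughly m ** blowup_exponent.
        assert len(corrupted_sample) >= m
        # Uniformly random subsample without replacement.
        subsample = random.sample(corrupted_sample, m)
        return A(subsample)
    return A_prime
```

As a toy usage, lifting a mean estimator that expects 10 points yields an estimator that asks for a larger sample and averages a random size-10 subset; the point of the theorem is that this randomized subsampling defeats the adaptive adversary's knowledge of the sample, at only a polynomial cost in sample size.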