🤖 AI Summary
This work addresses a fundamental limitation of the traditional Smart PAC learning framework: it fails to provide verifiable guarantees for semi-supervised learning when marginal distributions are statistically indistinguishable, rendering it impractical in real-world settings. To circumvent this impossibility result, the paper introduces the "relatively smart learning" framework, which relaxes the learning objective to compete only against the best certifiable semi-supervised guarantee. Theoretical analysis demonstrates that, in the distribution-free setting, relative smartness is achievable at the cost of at most squaring the sample complexity, and that no supervised learning algorithm can do better. Moreover, the study shows that for certain distribution families the task can remain impossible or require idiosyncratic learning approaches, and it uncovers a non-monotone relationship between the difficulty of relatively smart learning and the inclusion order on distribution families.
📝 Abstract
We revisit the framework of Smart PAC learning, which seeks supervised learners that compete with semi-supervised learners provided full knowledge of the marginal distribution on unlabeled data. Prior work has shown that such marginal-by-marginal guarantees are possible for "most" marginals, with respect to an arbitrary fixed and known measure, but not more generally. We discover that this failure can be attributed to an "indistinguishability" phenomenon: there are marginals that cannot be statistically distinguished from other marginals requiring different learning approaches. In such settings, semi-supervised learning cannot certify its guarantees from unlabeled data, rendering them arguably non-actionable.
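The benchmark being revisited can be sketched roughly as follows; the notation below is illustrative and not taken verbatim from the paper:

```latex
% Illustrative formalization (notation assumed, not quoted from the paper).
% Let m_A(\epsilon,\delta,D) denote the labeled-sample complexity of a
% supervised learner A when the marginal on unlabeled data is D, and let
%   m_{\mathrm{semi}}(\epsilon,\delta,D) = \inf_B\, m_B(\epsilon,\delta,D),
% where B ranges over semi-supervised learners given D as side information.
% A supervised learner A is ``smart'' if, for every marginal D,
\[
  m_A(\epsilon,\delta,D) \;\le\; C \cdot m_{\mathrm{semi}}(\epsilon,\delta,D)
  \qquad \text{for some universal constant } C .
\]
```

The indistinguishability phenomenon says this marginal-by-marginal benchmark can be unattainable: two marginals may induce nearly identical unlabeled samples while the corresponding semi-supervised strategies differ.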
We propose relatively smart learning, a new framework that demands a supervised learner compete only with the best "certifiable" semi-supervised guarantee. We show that this modest relaxation suffices to bypass the impossibility results of prior work. In the distribution-free setting, we show that relatively smart learning is achievable up to squaring the sample complexity, and that no supervised learning algorithm can do better. For distribution-family settings, we show that relatively smart learning can be impossible or can require idiosyncratic learning approaches, and that its difficulty can be non-monotone in the inclusion order on distribution families.
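The distribution-free guarantee can be summarized schematically, again with illustrative notation rather than the paper's own symbols:

```latex
% Schematic statement (notation assumed). Write m_{\mathrm{cert}}(\epsilon,\delta,D)
% for the best sample complexity that a semi-supervised learner can certify
% from unlabeled data under marginal D. Relatively smart learning asks for a
% supervised learner achieving, for every D,
\[
  m(\epsilon,\delta,D) \;\le\; O\!\big( m_{\mathrm{cert}}(\epsilon,\delta,D)^{2} \big),
\]
% i.e. a quadratic blow-up over the certifiable benchmark, which the abstract
% states is both achievable and, in general, unimprovable by supervised learners.
```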