🤖 AI Summary
This work addresses the lack of theoretical generalization guarantees in meta-learning by proposing a hypernetwork-based meta-learning framework. It establishes the first non-vacuous, computable, and tight generalization bound via a novel unification of PAC-Bayesian and sample compression theories. Key contributions include: (1) three new encoder designs—PAC-Bayesian latent distribution encoders, discrete sample compression selectors, and hybrid sample compression encoders supporting continuous messages; (2) the first sample compression theorem applicable to continuous message spaces; and (3) instance-wise generalization guarantees for each meta-generated predictor, achieved by integrating variational inference with information bottleneck analysis. Empirical evaluation on standard meta-learning benchmarks confirms the bound’s non-vacuity and practical effectiveness.
📝 Abstract
PAC-Bayesian and Sample Compress learning frameworks are instrumental for deriving tight (non-vacuous) generalization bounds for neural networks. We leverage these results in a meta-learning scheme, relying on a hypernetwork that takes a dataset as input and outputs the parameters of a downstream predictor. The originality of our approach lies in the investigated hypernetwork architectures, which encode the dataset before decoding the parameters: (1) a PAC-Bayesian encoder that expresses a posterior distribution over a latent space; (2) a Sample Compress encoder that selects a small subset of the input dataset along with a message from a discrete set; and (3) a hybrid of both approaches, motivated by a new Sample Compress theorem that handles continuous messages. The latter theorem exploits the pivotal information transiting at the encoder-decoder junction to compute generalization guarantees for each downstream predictor obtained by our meta-learning scheme.
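The encoder-decoder hypernetwork described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: all layer sizes, the mean-pooling dataset summary, and the random stand-in weights are assumptions. It shows the PAC-Bayesian variant, where the encoder maps a dataset to a latent posterior and the decoder maps a latent sample to the parameters of a downstream linear predictor.

```python
# Hypothetical sketch of the hypernetwork scheme: encoder(dataset) -> latent
# posterior (mu, sigma); a reparameterized sample z is decoded into the
# weights of a downstream predictor. Sizes and weights are illustrative.
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_LATENT = 5, 3  # input and latent dimensions (assumed)

# Fixed random matrices standing in for trained encoder/decoder networks.
W_enc = rng.normal(size=(D_IN + 1, 2 * D_LATENT))  # summary -> (mu, log_sigma)
W_dec = rng.normal(size=(D_LATENT, D_IN + 1))      # latent -> weights + bias

def encode(X, y):
    """PAC-Bayesian encoder: permutation-invariant dataset summary -> posterior."""
    summary = np.concatenate([X, y[:, None]], axis=1).mean(axis=0)  # mean-pool rows
    out = summary @ W_enc
    mu, log_sigma = out[:D_LATENT], out[D_LATENT:]
    return mu, np.exp(log_sigma)

def decode(z):
    """Decoder: latent sample -> parameters (w, b) of a linear predictor."""
    params = z @ W_dec
    return params[:-1], params[-1]

# A toy task dataset; the hypernetwork turns it into a predictor.
X = rng.normal(size=(20, D_IN))
y = np.sign(X[:, 0])

mu, sigma = encode(X, y)
z = mu + sigma * rng.normal(size=D_LATENT)  # reparameterized posterior sample
w, b = decode(z)
preds = np.sign(X @ w + b)                  # downstream predictor's outputs
print(w.shape, preds.shape)
```

The Sample Compress variant would instead have the encoder emit indices of a small subset of the dataset plus a (discrete or, per the new theorem, continuous) message; the decoder would reconstruct the predictor from that compressed representation alone.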