🤖 AI Summary
This work addresses the lack of theoretical generalization guarantees in meta-learning by proposing a hypernetwork-based meta-learning framework. It establishes the first non-vacuous, computable, and tight generalization bound via a novel unification of PAC-Bayesian and sample compression theories. Key contributions include: (1) three new encoder designs—PAC-Bayesian latent distribution encoders, discrete sample compression selectors, and hybrid sample compression encoders supporting continuous messages; (2) the first sample compression theorem applicable to continuous message spaces; and (3) instance-wise generalization guarantees for each meta-generated predictor, achieved by integrating variational inference with information bottleneck analysis. Empirical evaluation on standard meta-learning benchmarks confirms the bound’s non-vacuity and practical effectiveness.
📝 Abstract
PAC-Bayesian and Sample Compress learning frameworks are instrumental for deriving tight (non-vacuous) generalization bounds for neural networks. We leverage these results in a meta-learning scheme, relying on a hypernetwork that takes a dataset as input and outputs the parameters of a downstream predictor. The originality of our approach lies in the investigated hypernetwork architectures, which encode the dataset before decoding the parameters: (1) a PAC-Bayesian encoder that expresses a posterior distribution over a latent space; (2) a Sample Compress encoder that selects a small subset of the input dataset along with a message from a discrete set; and (3) a hybrid of both approaches, motivated by a new Sample Compress theorem that handles continuous messages. The latter theorem exploits the pivotal information transiting at the encoder-decoder junction to compute generalization guarantees for each downstream predictor obtained by our meta-learning scheme.
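The encoder-decoder hypernetwork described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: all layer sizes, the mean-pooling dataset summary, and the random stand-in weights are assumptions. It shows the PAC-Bayesian variant, where the encoder maps a dataset to a latent posterior and the decoder maps a latent sample to the parameters of a downstream linear predictor.

```python
# Hypothetical sketch of the hypernetwork scheme: encoder(dataset) -> latent
# posterior (mu, sigma); a reparameterized sample z is decoded into the
# weights of a downstream predictor. Sizes and weights are illustrative.
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_LATENT = 5, 3  # input and latent dimensions (assumed)

# Fixed random matrices standing in for trained encoder/decoder networks.
W_enc = rng.normal(size=(D_IN + 1, 2 * D_LATENT))  # summary -> (mu, log_sigma)
W_dec = rng.normal(size=(D_LATENT, D_IN + 1))      # latent -> weights + bias

def encode(X, y):
    """PAC-Bayesian encoder: permutation-invariant dataset summary -> posterior."""
    summary = np.concatenate([X, y[:, None]], axis=1).mean(axis=0)  # mean-pool rows
    out = summary @ W_enc
    mu, log_sigma = out[:D_LATENT], out[D_LATENT:]
    return mu, np.exp(log_sigma)

def decode(z):
    """Decoder: latent sample -> parameters (w, b) of a linear predictor."""
    params = z @ W_dec
    return params[:-1], params[-1]

# A toy task dataset; the hypernetwork turns it into a predictor.
X = rng.normal(size=(20, D_IN))
y = np.sign(X[:, 0])

mu, sigma = encode(X, y)
z = mu + sigma * rng.normal(size=D_LATENT)  # reparameterized posterior sample
w, b = decode(z)
preds = np.sign(X @ w + b)                  # downstream predictor's outputs
print(w.shape, preds.shape)
```

The Sample Compress variant would instead have the encoder emit indices of a small subset of the dataset plus a (discrete or, per the new theorem, continuous) message; the decoder would reconstruct the predictor from that compressed representation alone.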