Adaptive Learn-then-Test: Statistically Valid and Efficient Hyperparameter Selection

📅 2024-09-24
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
In high-cost or high-risk settings where hyperparameter selection is constrained by limited test budgets and demanding statistical reliability requirements, this paper proposes the adaptive Learn-then-Test (aLTT) framework. Methodologically, aLTT applies e-process theory to sequential, data-dependent multiple hypothesis testing, enabling rigorous false discovery rate (FDR) control with provably valid early stopping. Evaluated on offline reinforcement learning policy selection and large language model prompt optimization, aLTT achieves statistical guarantees and final performance comparable to classical LTT while reducing the number of required test rounds by an order of magnitude. By combining finite-sample statistical rigor with practical engineering efficiency, aLTT establishes a provably reliable, sample-efficient paradigm for AI model evaluation under resource constraints.

📝 Abstract
We introduce adaptive learn-then-test (aLTT), an efficient hyperparameter selection procedure that provides finite-sample statistical guarantees on the population risk of AI models. Unlike the existing learn-then-test (LTT) technique, which relies on conventional p-value-based multiple hypothesis testing (MHT), aLTT implements sequential data-dependent MHT with early termination by leveraging e-processes. As a result, aLTT can reduce the number of testing rounds, making it particularly well-suited for scenarios in which testing is costly or presents safety risks. Apart from maintaining statistical validity, in applications such as online policy selection for offline reinforcement learning and prompt engineering, aLTT is shown to achieve the same performance as LTT while requiring only a fraction of the testing rounds.
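The abstract's key mechanism is the e-process: a nonnegative wealth process that stays bounded in expectation under the null hypothesis, so Ville's inequality lets the test stop and reject the moment the wealth crosses 1/δ. The sketch below illustrates this idea for a single hypothesis of the form "population risk ≥ α" with bounded losses; it is a minimal illustration, not the paper's algorithm, and the betting fraction `lam`, the thresholds, and the function name are illustrative assumptions.

```python
import random

def e_process_test(losses, alpha=0.2, delta=0.05, lam=0.5):
    """Sequentially test H0: population risk >= alpha via a betting e-process.

    Each observed loss is assumed to lie in [0, 1]. Under H0, every factor
    1 + lam * (alpha - loss) has expectation <= 1, so the running product
    ("wealth") is an e-process. By Ville's inequality, stopping as soon as
    wealth >= 1/delta keeps the false-rejection probability at most delta,
    at any data-dependent stopping time.
    """
    assert 0 < lam <= 1.0 / (1.0 - alpha)  # keeps every factor nonnegative
    wealth = 1.0
    for t, loss in enumerate(losses, start=1):
        wealth *= 1.0 + lam * (alpha - loss)
        if wealth >= 1.0 / delta:
            return True, t  # reject H0: risk certified below alpha early
    return False, len(losses)  # budget exhausted without rejecting
```

When the true risk is well below α, the wealth grows geometrically and the test terminates after far fewer rounds than a fixed-sample p-value test, which is the efficiency mechanism the abstract describes; aLTT additionally coordinates many such tests across hyperparameters under FDR control.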
Problem

Research questions and friction points this paper is trying to address.

- Hyperparameter Optimization
- Limited Data Samples
- Statistical Accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

- aLTT
- Hyperparameter selection
- Reduced testing