LOTTERY: Learning from Reference-Only Samples in Two-Sample Testing under Size Asymmetry

📅 2026-06-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the failure of conventional data-adaptive two-sample tests under extreme sample-size imbalance, where reference samples vastly outnumber query samples and data splitting becomes ill-suited. The authors propose learning reference-dependent representations from the abundant reference data, combining a collection of representation families that capture both global and local structure. An uncertainty-guided weighting scheme, computed from reference samples only, aggregates these representations. The resulting permutation-based test controls the type I error and is consistent: its power converges to one as the sample sizes grow, provided the representation set contains at least one consistent representation. Empirically, the aggregated test achieves strong performance across a range of benchmarks while retaining type I error control.

📝 Abstract

Data-adaptive two-sample testing assesses if two samples come from the same distribution, using a discrepancy learned from the data (e.g., via kernel-based feature representations). Such methods typically rely on data splitting to decouple learning from testing and control type I error. However, this paradigm is ill-suited to few-shot settings with severe sample-size imbalance: abundant reference samples are available, while only a handful of query samples arrive. In this paper, we show how this imbalance can be leveraged constructively. Using abundant reference data, we learn reference-dependent representations that summarize salient structure of the reference distribution and provide informative signals for detecting departures. We incorporate a collection of representation families that capture both global and local structure, and adaptively weight them using only reference samples via an uncertainty-guided principle. Theoretically, we establish permutation-based type I error control and show consistency of the aggregated test: as the sample sizes grow, the test power converges to one whenever the representation set contains at least one consistent representation. Empirically, our aggregation achieves strong performance across a range of benchmarks while retaining type I error control.
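To make the permutation step concrete, here is a minimal sketch of a permutation two-sample test under sample-size imbalance. It is not the paper's method: it assumes a single Gaussian kernel with a fixed bandwidth and the biased MMD statistic, whereas the paper learns reference-dependent representations and weights several of them. The function names (`gaussian_gram`, `mmd2_biased`, `permutation_mmd_test`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_gram(Z, bandwidth):
    """Gaussian-kernel Gram matrix of the pooled sample Z (rows are points)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def mmd2_biased(K, idx_x, idx_y):
    """Biased (V-statistic) estimate of squared MMD from a pooled Gram matrix."""
    k_xx = K[np.ix_(idx_x, idx_x)].mean()
    k_yy = K[np.ix_(idx_y, idx_y)].mean()
    k_xy = K[np.ix_(idx_x, idx_y)].mean()
    return k_xx + k_yy - 2.0 * k_xy


def permutation_mmd_test(X_ref, Y_query, bandwidth=1.0, n_perm=500, seed=0):
    """Return (observed statistic, permutation p-value).

    Labels are permuted over the pooled sample, so the group sizes
    (many reference points, few query points) are preserved in every draw.
    """
    rng = np.random.default_rng(seed)
    n, m = len(X_ref), len(Y_query)
    K = gaussian_gram(np.vstack([X_ref, Y_query]), bandwidth)
    idx = np.arange(n + m)
    observed = mmd2_biased(K, idx[:n], idx[n:])
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n + m)
        if mmd2_biased(K, perm[:n], perm[n:]) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic among the permutations.
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value


# Illustrative data: 200 reference points, 8 query points with a shifted mean.
rng = np.random.default_rng(1)
X_ref = rng.normal(0.0, 1.0, size=(200, 2))
Y_query = rng.normal(3.0, 1.0, size=(8, 2))
stat, p = permutation_mmd_test(X_ref, Y_query)
print(f"MMD^2 = {stat:.3f}, p-value = {p:.4f}")
```

In this sketch the kernel is fixed in advance, so no data splitting is needed for the permutation argument. How the paper retains type I error control while learning representations and weights from the reference samples is established in its theoretical results and is not reproduced here.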

Problem

Research questions and friction points this paper is trying to address.

two-sample testing

sample-size asymmetry

few-shot setting

reference-only samples

distribution comparison

Innovation

Methods, ideas, or system contributions that make the work stand out.

two-sample testing

sample-size asymmetry

reference-dependent representation