A Martingale Kernel Two-Sample Test

📅 2025-10-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Kernel two-sample tests based on the Maximum Mean Discrepancy (MMD) have a null distribution with no tractable closed form, so they are usually calibrated with computationally expensive permutation or bootstrap procedures. The paper proposes martingale MMD (mMMD), a statistic built on a martingale interpretation of the estimated squared MMD. Under the null hypothesis, mMMD is asymptotically standard normal, so p-values can be computed analytically without resampling. The statistic is computed in quadratic time, O(n²), and the test is consistent against any fixed alternative. A permutation-calibrated MMD test costs O(Bn²), where B is the number of permutations; mMMD reduces this to O(n²) and, at large sample sizes, loses only a small amount of power in exchange.

📝 Abstract
The Maximum Mean Discrepancy (MMD) is a widely used multivariate distance metric for two-sample testing. The standard MMD test statistic has an intractable null distribution, typically requiring costly resampling or permutation approaches for calibration. In this work, we leverage a martingale interpretation of the estimated squared MMD to propose martingale MMD (mMMD), a quadratic-time statistic that has a limiting standard Gaussian distribution under the null. Moreover, we show that the test is consistent against any fixed alternative and that, for large sample sizes, mMMD offers substantial computational savings over the standard MMD test with only a minor loss in power.
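For orientation, the quantity being estimated can be written out as follows. The notation is an assumption, since the abstract defines no symbols: k is a positive-definite kernel, X, X' are independent draws from P, Y, Y' are independent draws from Q, and z_i = (x_i, y_i) pairs the two samples under an assumed equal sample size n. This is the standard textbook form of the squared MMD and its U-statistic estimator, not a formula taken from the paper.

```latex
\mathrm{MMD}^2(P, Q)
  = \mathbb{E}\,k(X, X') + \mathbb{E}\,k(Y, Y') - 2\,\mathbb{E}\,k(X, Y),
\qquad
\widehat{\mathrm{MMD}}^2_u
  = \frac{1}{n(n-1)} \sum_{i \neq j} h(z_i, z_j),

h(z_i, z_j) = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i).
```

When P = Q, the conditional mean of h(z_i, z_j) given z_j is zero, so the estimator is a degenerate U-statistic. That degeneracy is why its null distribution is not Gaussian and is usually calibrated by resampling.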
Problem

Research questions and friction points this paper is trying to address.

Proposes martingale MMD for efficient two-sample testing
Addresses intractable null distribution of standard MMD statistic
Provides computationally faster alternative with Gaussian null distribution
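To make the O(Bn²) cost of the resampling baseline concrete, here is a minimal sketch of a permutation-calibrated MMD test. It is not code from the paper: the Gaussian kernel, the fixed bandwidth, and the function names (gaussian_gram, mmd2_unbiased, permutation_mmd_test) are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_gram(z, bandwidth=1.0):
    """Gaussian-kernel Gram matrix of the rows of z (kernel choice is an assumption)."""
    sq_norms = np.sum(z**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(gram, idx_x, idx_y):
    """Unbiased squared-MMD estimate read off a precomputed pooled Gram matrix."""
    n, m = len(idx_x), len(idx_y)
    kxx = gram[np.ix_(idx_x, idx_x)]
    kyy = gram[np.ix_(idx_y, idx_y)]
    kxy = gram[np.ix_(idx_x, idx_y)]
    # Within-sample terms exclude the diagonal k(x_i, x_i).
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * kxy.mean()


def permutation_mmd_test(x, y, num_permutations=200, bandwidth=1.0, seed=0):
    """Return (observed MMD^2, permutation p-value)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    pooled = np.vstack([x, y])
    total = pooled.shape[0]
    gram = gaussian_gram(pooled, bandwidth)  # computed once, O(n^2)
    observed = mmd2_unbiased(gram, np.arange(n), np.arange(n, total))
    exceed = 0
    for _ in range(num_permutations):  # each pass re-sums O(n^2) entries
        perm = rng.permutation(total)
        if mmd2_unbiased(gram, perm[:n], perm[n:]) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (1 + exceed) / (1 + num_permutations)
```

Even with the Gram matrix cached, every permutation re-sums a quadratic number of kernel entries, so the loop dominates the running time once B is in the hundreds.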
Innovation

Methods, ideas, or system contributions that make the work stand out.

Martingale-based MMD statistic with Gaussian null distribution
Quadratic-time algorithm for computational efficiency
Consistency against any fixed alternative, with only a minor loss in power relative to the standard MMD test
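The abstract does not spell out the mMMD statistic, so the following is only a hedged sketch of one way a martingale reading of the squared-MMD estimate can give a Gaussian null. It assumes equal sample sizes, pairs z_i = (x_i, y_i), a Gaussian kernel with a fixed bandwidth, and a self-normalised sum of increments; the paper's actual construction and normalisation may differ. The function names are hypothetical.

```python
import math

import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a and rows of b (assumed kernel)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def martingale_mmd_sketch(x, y, bandwidth=1.0):
    """Return (statistic, one-sided Gaussian p-value) for paired samples x, y."""
    n = x.shape[0]
    if y.shape[0] != n or n < 3:
        raise ValueError("this sketch assumes equal sample sizes with n >= 3")
    kxy = gaussian_kernel(x, y, bandwidth)
    # h[i, j] = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(y_i, x_j)
    h = (
        gaussian_kernel(x, x, bandwidth)
        + gaussian_kernel(y, y, bandwidth)
        - kxy
        - kxy.T
    )
    # Increment for observation i uses only earlier observations j < i.
    # Under the null, each increment has zero conditional mean given the past,
    # so the running sum of increments is a martingale.
    increments = np.tril(h, k=-1).sum(axis=1)
    # Self-normalise by the root of the summed squared increments.
    stat = increments.sum() / math.sqrt(np.sum(increments**2))
    # Upper-tail probability of a standard normal, no resampling needed.
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))
    return stat, p_value
```

The whole computation is a handful of n-by-n matrix operations, which is the O(n²) cost quoted in the summary. Note that the increments depend on the order of the observations, so a sketch like this is not permutation-invariant in the way the usual U-statistic is.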