🤖 AI Summary
In kernel two-sample testing with the Maximum Mean Discrepancy (MMD), the null distribution of the test statistic has no closed-form expression, so calibration typically requires computationally expensive permutation or bootstrap resampling. To address this, the authors propose the martingale MMD (mMMD), the first method to model the estimated squared MMD as a martingale process. Under the null hypothesis, mMMD is asymptotically standard normal, so p-values can be computed analytically without resampling. The statistic retains quadratic time complexity, cutting the overall cost of the test from O(Bn²) to O(n²), where B is the number of permutations, while its power approaches that of the permutation test for large samples. The test is also consistent against any fixed alternative, combining asymptotic guarantees with scalable inference for real-world applications.
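For context on the O(Bn²) cost that mMMD avoids: below is a minimal sketch of the standard quadratic-time MMD test with permutation calibration, which recomputes the O(n²) statistic on B permuted splits of the pooled sample. The function names, the Gaussian kernel, and the bandwidth choice are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    # Pairwise squared distances, then the RBF kernel.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2 * bandwidth**2))

def mmd2_unbiased(X, Y, bandwidth=1.0):
    # Unbiased estimate of the squared MMD: diagonal terms are
    # excluded from the within-sample averages.
    n, m = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_x + term_y - 2 * Kxy.mean()

def permutation_pvalue(X, Y, B=200, bandwidth=1.0, seed=0):
    # Standard calibration: recompute the O(n^2) statistic on B
    # permuted splits of the pooled sample -- O(B n^2) total.
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n = len(X)
    observed = mmd2_unbiased(X, Y, bandwidth)
    count = 0
    for _ in range(B):
        perm = rng.permutation(len(Z))
        count += mmd2_unbiased(Z[perm[:n]], Z[perm[n:]], bandwidth) >= observed
    return (count + 1) / (B + 1)
```

The loop is exactly where the B factor enters; an analytically calibrated statistic such as mMMD replaces it with a single pass over the data.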
📝 Abstract
The Maximum Mean Discrepancy (MMD) is a widely used multivariate distance metric for two-sample testing. The standard MMD test statistic has an intractable null distribution, typically requiring costly resampling or permutation approaches for calibration. In this work, we leverage a martingale interpretation of the estimated squared MMD to propose the martingale MMD (mMMD), a quadratic-time statistic with a limiting standard Gaussian distribution under the null. Moreover, we show that the test is consistent against any fixed alternative and that, for large sample sizes, mMMD offers substantial computational savings over the standard MMD test, with only a minor loss in power.
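Once a statistic is asymptotically standard normal under the null, calibration reduces to evaluating a Gaussian tail probability rather than resampling. A minimal sketch of that final step, using only the standard library (the construction of the mMMD statistic itself, which is the paper's contribution, is not shown here):

```python
import math

def analytic_pvalue(t):
    # One-sided p-value for a statistic that is asymptotically
    # standard normal under the null: P(Z >= t) for Z ~ N(0, 1),
    # via the complementary error function. No resampling needed.
    return 0.5 * math.erfc(t / math.sqrt(2.0))
```

For example, an observed statistic at the 95% normal quantile (about 1.645) yields a p-value of 0.05, so the test rejects at level 0.05 exactly when the statistic exceeds that threshold.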