🤖 AI Summary
Problem: Existing kernel-based goodness-of-fit (GoF) tests are neither qualitatively nor quantitatively robust, and robustification strategies such as tilted kernels cannot ensure both types of robustness in GoF testing. Method: To address the practical question "Is the model good enough?", we propose the first robust kernel GoF test, built on kernel Stein discrepancy (KSD) balls. The framework formalizes robust GoF testing as asking whether the data were generated from a mild perturbation of the model, and we show theoretically that conventional kernel tests are not robust and that their tilted-kernel variants cannot guarantee both types of robustness. Contribution/Results: The KSD-ball formulation covers well-known contamination models, including Huber contamination and density bands, within a single framework, and the resulting test remains stable under such contamination while retaining statistical power. Empirical evaluation confirms its effectiveness in finite-sample settings, resolving an open problem in the design of robust kernel-based GoF tests.
📝 Abstract
Goodness-of-fit testing is often criticized for its lack of practical relevance: since ``all models are wrong'', the null hypothesis that the data conform to our model is ultimately always rejected as sample size grows. Despite this, probabilistic models are still used extensively, raising the more pertinent question of whether the model is \emph{good enough} for the task at hand. This question can be formalized as a robust goodness-of-fit testing problem by asking whether the data were generated from a distribution that is a mild perturbation of the model. In this paper, we show that existing kernel goodness-of-fit tests are not robust under common notions of robustness including both qualitative and quantitative robustness. We further show that robustification techniques using tilted kernels, while effective in the parameter estimation literature, are not sufficient to ensure both types of robustness in the testing setting. To address this, we propose the first robust kernel goodness-of-fit test, which resolves this open problem by using kernel Stein discrepancy (KSD) balls. This framework encompasses many well-known perturbation models, such as Huber's contamination and density-band models.
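To make the quantities in the abstract concrete, the following is a minimal sketch of a plug-in (V-statistic) estimate of the squared KSD, evaluated on clean and on Huber-contaminated data. It is not the test proposed in the paper. The one-dimensional standard Gaussian model, the inverse multiquadric (IMQ) base kernel, the function names, and the contamination settings are all assumptions chosen for illustration.

```python
import numpy as np


def stein_kernel_gaussian_imq(x, y, c=1.0):
    """Langevin Stein kernel u_p(x, y) for the assumed model P = N(0, 1),
    using the IMQ base kernel k(x, y) = (c^2 + (x - y)^2)^(-1/2)."""
    r = x - y
    b = c**2 + r**2
    k = b**-0.5
    dk_dx = -r * b**-1.5                          # d/dx k(x, y)
    dk_dy = r * b**-1.5                           # d/dy k(x, y)
    d2k_dxdy = b**-1.5 - 3.0 * r**2 * b**-2.5     # d^2/(dx dy) k(x, y)
    sx, sy = -x, -y                               # score of N(0, 1): d/dx log p(x) = -x
    return sx * sy * k + sx * dk_dy + sy * dk_dx + d2k_dxdy


def ksd_squared_vstat(samples, c=1.0):
    """V-statistic estimate of KSD^2: the mean of u_p over all sample pairs."""
    x = np.asarray(samples, dtype=float)
    u = stein_kernel_gaussian_imq(x[:, None], x[None, :], c)
    return float(u.mean())


def huber_contaminate(samples, eps, outlier):
    """Hypothetical helper: overwrite a fraction eps of the sample with a
    fixed outlier, mimicking Huber contamination (1 - eps) P + eps * delta_outlier."""
    x = np.array(samples, dtype=float)
    m = int(round(eps * x.size))
    x[:m] = outlier
    return x


rng = np.random.default_rng(0)
clean = rng.standard_normal(500)
dirty = huber_contaminate(clean, eps=0.1, outlier=10.0)

print("KSD^2 estimate, clean sample:       ", ksd_squared_vstat(clean))
print("KSD^2 estimate, contaminated sample:", ksd_squared_vstat(dirty))
```

In this particular setup the diagonal term of the Stein kernel is u_p(z, z) = z^2 / c + c^(-3), so the estimate grows with the outlier location z even when the contamination fraction is small. This gives some intuition for why a test of the exact null can be sensitive to mild contamination. A robust test in the spirit of the abstract would instead ask whether the data distribution lies within a KSD ball of some radius around the model; how that radius is chosen and how the test is calibrated are not specified here and are left out of the sketch.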