Parametric convergence rate of a non-parametric estimator in multivariate mixtures of power series distributions under conditional independence

📅 2025-09-05
🤖 AI Summary
This paper addresses nonparametric estimation and testing for multivariate mixtures of power series distributions with infinite support (including Poisson, geometric, and negative binomial mixtures) under the conditional independence assumption. The authors show that the nonparametric maximum likelihood estimator (NPMLE) converges in the Hellinger distance at the rate $(\log(nd))^{1+d/2} n^{-1/2}$, that is, the parametric rate up to a logarithmic factor. Building on this result, they construct a new nonparametric estimator, based on the NPMLE, that attains the parametric rate $n^{-1/2}$ in all $\ell_p$-distances for $p \ge 1$. They also introduce a procedure for testing whether the conditional independence assumption holds for a given sample. Simulations support the convergence rates, and the theory is illustrated on the Paris Vélib bike-sharing dataset. Applied to several mixtures with varying levels of dependence, the test distinguishes well between conditionally independent and dependent mixtures; it is also used to examine whether conditional independence holds for the Vélib data.

📝 Abstract
The conditional independence assumption has recently appeared in a growing body of literature on the estimation of multivariate mixtures. We consider here conditionally independent multivariate mixtures of power series distributions with infinite support, a class that includes Poisson, Geometric and Negative Binomial mixtures. We show that for all these mixtures, the non-parametric maximum likelihood estimator converges to the truth at the rate $(\log(nd))^{1+d/2} n^{-1/2}$ in the Hellinger distance, where $n$ denotes the size of the observed sample and $d$ represents the dimension of the mixture. Using this result, we then construct a new non-parametric estimator based on the maximum likelihood estimator that converges with the parametric rate $n^{-1/2}$ in all $\ell_p$-distances, for $p \ge 1$. These convergence rates are supported by simulations, and the theory is illustrated using the well-known Vélib dataset from the bike-sharing system of Paris. We also introduce a testing procedure for whether the conditional independence assumption is satisfied for a given sample. This testing procedure is applied to several multivariate mixtures with varying levels of dependence, and is thereby shown to distinguish well between conditionally independent and dependent mixtures. Finally, we use this testing procedure to investigate whether conditional independence holds for the Vélib dataset.
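To make the model in the abstract concrete, the following is a minimal Python sketch of a conditionally independent Poisson mixture with finitely many components, together with a Hellinger distance between a plug-in estimate and the true probability mass function. Everything here is an illustrative assumption rather than the paper's method: the function names, the finite mixing distribution, the truncation of the infinite support to a grid `{0, ..., K-1}^d`, and the normalization $h^2(p, q) = \tfrac{1}{2} \sum_k (\sqrt{p_k} - \sqrt{q_k})^2$. The estimate used is the plain empirical pmf, not the non-parametric maximum likelihood estimator studied in the paper.

```python
import numpy as np


def sample_ci_poisson_mixture(n, weights, rates, rng):
    """Draw n points in N^d from a finite Poisson mixture whose coordinates
    are independent given the latent component (hypothetical helper).

    weights: (m,) mixing probabilities; rates: (m, d) positive Poisson means.
    """
    weights = np.asarray(weights, dtype=float)
    rates = np.asarray(rates, dtype=float)
    labels = rng.choice(len(weights), size=n, p=weights)  # latent component
    return rng.poisson(rates[labels])  # coordinates independent given label


def mixture_pmf_grid(weights, rates, K):
    """True pmf on the truncated grid {0..K-1}^d, as a d-dimensional array."""
    weights = np.asarray(weights, dtype=float)
    rates = np.asarray(rates, dtype=float)
    k = np.arange(K)
    # log(k!) for k = 0..K-1, built without external dependencies
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, K)))))
    d = rates.shape[1]
    pmf = np.zeros((K,) * d)
    for w, lam in zip(weights, rates):
        marginals = [np.exp(k * np.log(l) - l - log_fact) for l in lam]
        component = marginals[0]
        for marginal in marginals[1:]:
            # product of marginals = conditional independence given the label
            component = np.multiply.outer(component, marginal)
        pmf += w * component
    return pmf


def empirical_pmf_grid(sample, K):
    """Empirical pmf on the same grid; points outside the grid are dropped."""
    d = sample.shape[1]
    pmf = np.zeros((K,) * d)
    inside = sample[(sample < K).all(axis=1)]
    np.add.at(pmf, tuple(inside.T), 1.0)
    return pmf / len(sample)


def hellinger(p, q):
    """Hellinger distance with the 1/2 normalization (an assumed convention)."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


rng = np.random.default_rng(0)
weights = [0.3, 0.7]
rates = [[1.0, 2.0], [4.0, 0.5]]  # two components, dimension d = 2
truth = mixture_pmf_grid(weights, rates, K=40)
for n in (500, 5000):
    sample = sample_ci_poisson_mixture(n, weights, rates, rng)
    print(n, hellinger(empirical_pmf_grid(sample, K=40), truth))
```

The grid size `K = 40` is chosen so that the truncated tail mass is negligible for these rates; larger rates would need a larger grid.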
Problem

Research questions and friction points this paper is trying to address.

Establishing convergence rates of the non-parametric maximum likelihood estimator for multivariate mixtures of power series distributions
Constructing an estimator that attains the parametric rate under the conditional independence assumption
Testing whether the conditional independence assumption holds for a given multivariate sample
Innovation

Methods, ideas, or system contributions that make the work stand out.

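Non-parametric MLE converging at the rate $(\log(nd))^{1+d/2} n^{-1/2}$ in the Hellinger distance
New estimator, built on the MLE, achieving the parametric rate $n^{-1/2}$ in all ℓ_p-distances for p ≥ 1
Testing procedure for checking the conditional independence assumption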
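For readers unfamiliar with the ℓ_p-distances mentioned above, here is a minimal Python sketch of the distance between two probability mass functions stored on a common finite grid. The function name `lp_distance` and the array representation are illustrative assumptions; the paper's estimator itself is not reproduced here.

```python
import numpy as np


def lp_distance(p, q, order=2.0):
    """l_p distance (sum_k |p_k - q_k|^order)^(1/order) for order >= 1;
    order = inf gives the sup-norm. Inputs are pmfs on the same grid."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    if p.shape != q.shape:
        raise ValueError("p and q must be defined on the same grid")
    diff = np.abs(p - q)
    if np.isinf(order):
        return float(diff.max())
    if order < 1:
        raise ValueError("order must be at least 1")
    return float(np.sum(diff ** order) ** (1.0 / order))


p = [0.5, 0.5]
q = [1.0, 0.0]
for order in (1, 2, np.inf):
    print(order, lp_distance(p, q, order))
```

For fixed pmfs the distance is non-increasing in the order, so a rate that holds in ℓ_1 carries over to every ℓ_p with p ≥ 1.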
Fadoua Balabdaoui
ETH Zurich
non-parametric statistics, mixture models, unlinked regression, empirical processes
Harald Besdziek
Department of Mathematics, ETH Zurich, Zurich, Switzerland
Yong Wang
Department of Statistics, University of Auckland, Auckland, New Zealand