On Computing Total Variation Distance Between Mixtures of Product Distributions

📅 2026-05-05

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This work addresses the problem of computing the total variation distance between two mixtures of product distributions over an $n$-dimensional discrete domain $[q]^n$. For the general case, it gives a randomized algorithm that achieves a $(1\pm\varepsilon)$ multiplicative approximation in time $\mathrm{poly}((nq)^{k_1+k_2},1/\varepsilon)$, where $k_1$ and $k_2$ denote the number of components in each mixture; this running time is polynomial when $k_1+k_2$ is constant. For the special case of mixtures of Boolean subcubes over $\{0,1\}^n$, it gives a deterministic algorithm that computes the distance exactly in $\mathrm{poly}(n,2^{O(k_1+k_2)})$ time. It also shows that exact computation for this class is $\#\mathsf{P}$-hard when $k_1+k_2=\Theta(n)$, which marks where the exact problem becomes computationally intractable.
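
For reference, the quantities involved can be written out as follows. This uses the standard textbook definitions; the symbols $w_j$, $v_j$, $P_{j,i}$, and $Q_{j,i}$ are notation assumed here for illustration and do not appear in the source.

$$
\mathbb{P}(x)=\sum_{j=1}^{k_1} w_j \prod_{i=1}^{n} P_{j,i}(x_i),
\qquad
\mathbb{Q}(x)=\sum_{j=1}^{k_2} v_j \prod_{i=1}^{n} Q_{j,i}(x_i),
\qquad x\in[q]^n,
$$

$$
d_{\mathrm{TV}}(\mathbb{P},\mathbb{Q})=\frac{1}{2}\sum_{x\in[q]^n}\left|\mathbb{P}(x)-\mathbb{Q}(x)\right|.
$$

Here the mixing weights $w_j$ and $v_j$ are nonnegative and each sums to 1, and $P_{j,i}$ and $Q_{j,i}$ are the marginal distributions of coordinate $i$ in the $j$-th component. The sum ranges over $q^n$ points, so evaluating it term by term takes time exponential in $n$.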

📝 Abstract

We study the problem of approximating the total variation distance between two mixtures of product distributions over an $n$-dimensional discrete domain. Given two mixtures $\mathbb{P}$ and $\mathbb{Q}$ with $k_1$ and $k_2$ product distributions over $[q]^n$, respectively, we give a randomized algorithm that approximates $d_{\mathrm{TV}}\left({\mathbb{P}},{\mathbb{Q}}\right)$ within a multiplicative error of $(1\pm \varepsilon)$ in time $\mathrm{poly}((nq)^{k_1+k_2},1/\varepsilon)$. We also study the special case of mixtures of Boolean subcubes over $\{0,1\}^n$. For this class, we give a deterministic algorithm that exactly computes the total variation distance in time $\mathrm{poly}(n,2^{O(k_1+k_2)})$, and show that exact computation is $\#\mathsf{P}$-hard when $k_1+k_2=\Theta(n)$.
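
To make the problem statement concrete, here is a minimal brute-force sketch in Python that evaluates the total variation distance by enumerating every point of $[q]^n$. It is not the paper's algorithm: it runs in time exponential in $n$ and only serves as a reference baseline for tiny instances. The function names, the argument layout, and the example mixtures are all assumptions made for illustration. The example also assumes the usual reading of a Boolean subcube as a product distribution in which each coordinate is either fixed to 0, fixed to 1, or uniform.

```python
from itertools import product
from math import prod


def mixture_pmf(weights, components, x):
    """Probability of point x under a mixture of product distributions.

    components[j][i][a] is the probability that coordinate i takes value a
    under the j-th product distribution (hypothetical layout).
    """
    return sum(
        w * prod(marginals[i][a] for i, a in enumerate(x))
        for w, marginals in zip(weights, components)
    )


def tv_distance_bruteforce(p_weights, p_components, q_weights, q_components, q_size):
    """Exact TV distance by summing over all q_size**n points (exponential in n)."""
    n = len(p_components[0])
    total = 0.0
    for x in product(range(q_size), repeat=n):
        px = mixture_pmf(p_weights, p_components, x)
        qx = mixture_pmf(q_weights, q_components, x)
        total += abs(px - qx)
    return 0.5 * total


# Illustrative instance over {0,1}^2 (assumed data, not from the paper).
FREE = [0.5, 0.5]   # coordinate is uniform
ZERO = [1.0, 0.0]   # coordinate is fixed to 0
ONE = [0.0, 1.0]    # coordinate is fixed to 1

# P: a single subcube with both coordinates free (uniform on {0,1}^2).
P_WEIGHTS, P_COMPONENTS = [1.0], [[FREE, FREE]]
# Q: an equal mixture of the subcube {x1 = 0} and the single point (1, 1).
Q_WEIGHTS, Q_COMPONENTS = [0.5, 0.5], [[ZERO, FREE], [ONE, ONE]]

distance = tv_distance_bruteforce(P_WEIGHTS, P_COMPONENTS, Q_WEIGHTS, Q_COMPONENTS, 2)
print(distance)  # 0.25
```

In this instance $\mathbb{P}$ puts mass $1/4$ on every point, while $\mathbb{Q}$ puts mass $1/4$ on $(0,0)$ and $(0,1)$, mass $0$ on $(1,0)$, and mass $1/2$ on $(1,1)$, so the distance is $\tfrac{1}{2}(0+0+\tfrac14+\tfrac14)=\tfrac14$. The enumeration visits $q^n$ points, which is why algorithms whose running time depends on $n$ only polynomially for fixed $k_1+k_2$ are of interest.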

Problem

Research questions and friction points this paper is trying to address.

total variation distance

mixture models

product distributions

computational complexity

Boolean subcubes

Innovation

Methods, ideas, or system contributions that make the work stand out.

total variation distance

mixture of product distributions

Boolean subcubes