🤖 AI Summary
This paper studies distributed expectation estimation in the two-party communication model: Alice and Bob hold distributions $p$ and $q$, respectively, and must estimate $\mathbb{E}_{x\sim p,\,y\sim q}[f(x,y)]$ to additive error $\varepsilon$, where $f$ is a bounded function. To overcome the quadratic dependence $O(R(f)/\varepsilon^2)$ of classical protocols on $1/\varepsilon$, we propose a novel unbiased randomized protocol achieving linear dependence $O(R(f)/\varepsilon)$. We further design optimal protocols for the Equality (EQ) and Greater-Than (GT) functions, and prove that EQ is the simplest communication-wise among all full-rank Boolean functions. Leveraging spectral analysis, randomized sampling, and discrepancy-based techniques, we establish tight upper and lower bounds, attaining theoretical optimality across broad function classes. Our results confirm the asymptotic tightness and universality of the proposed protocol.
📝 Abstract
We study an extension of the standard two-party communication model in which Alice and Bob hold probability distributions $p$ and $q$ over domains $X$ and $Y$, respectively. Their goal is to estimate \[ \mathbb{E}_{x \sim p,\, y \sim q}[f(x, y)] \] to within additive error $\varepsilon$ for a bounded function $f$, known to both parties. We refer to this as the distributed estimation problem. Special cases of this problem arise in a variety of areas including sketching, databases and learning. Our goal is to understand how the required communication scales with the communication complexity of $f$ and the error parameter $\varepsilon$.
The random sampling approach -- estimating the mean by averaging $f$ over $O(1/\varepsilon^2)$ random samples -- requires $O(R(f)/\varepsilon^2)$ total communication, where $R(f)$ is the randomized communication complexity of $f$. We design a new debiasing protocol which improves the dependence on $1/\varepsilon$ from quadratic to linear. Additionally, we show better upper bounds for several special classes of functions, including the Equality and Greater-Than functions. We introduce lower bound techniques based on spectral methods and discrepancy, and show the optimality of many of our protocols: the debiasing protocol is tight for general functions, and our protocols for the Equality and Greater-Than functions are also optimal. Furthermore, we show that among full-rank Boolean functions, Equality is essentially the easiest.
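As a rough illustration of the random sampling baseline only (not the debiasing protocol), the sketch below averages $f$ over independent sample pairs. It rests on several assumptions that are not in the abstract: the domains are finite, $f$ takes values in $[-1, 1]$, the sample count comes from a standard Hoeffding bound with a hypothetical failure probability `delta`, and $f$ is evaluated directly rather than through a communication protocol. The names `exact_expectation` and `sampling_estimate` are illustrative.

```python
import math
import random


def exact_expectation(p, q, f):
    """Exact E_{x~p, y~q}[f(x, y)] for finite distributions given as dicts."""
    return sum(px * qy * f(x, y) for x, px in p.items() for y, qy in q.items())


def sampling_estimate(p, q, f, eps, delta=0.05, rng=None):
    """Baseline estimator: average f over n independent pairs (x_i, y_i).

    In the two-party setting Alice would draw x_i ~ p and Bob y_i ~ q, and
    each f(x_i, y_i) would cost one run of a protocol for f. Here f is
    evaluated directly, so only the sample count is illustrated.
    """
    rng = rng or random.Random(0)
    # Hoeffding bound for values in [-1, 1] (assumed range):
    # n >= 2 * ln(2 / delta) / eps^2 gives error <= eps w.p. >= 1 - delta.
    n = math.ceil(2 * math.log(2 / delta) / eps ** 2)
    xs = rng.choices(list(p), weights=list(p.values()), k=n)
    ys = rng.choices(list(q), weights=list(q.values()), k=n)
    return sum(f(x, y) for x, y in zip(xs, ys)) / n, n


def equality(x, y):
    """Equality as a 0/1-valued function."""
    return 1.0 if x == y else 0.0


if __name__ == "__main__":
    p = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}  # Alice's distribution
    q = {0: 0.5, 1: 0.5}                      # Bob's distribution
    estimate, n = sampling_estimate(p, q, equality, eps=0.05)
    print(exact_expectation(p, q, equality), estimate, n)
```

The number of samples grows as $1/\varepsilon^2$, and since each sample needs one evaluation of $f$ at cost $R(f)$, the total is the $O(R(f)/\varepsilon^2)$ bound quoted above.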