🤖 AI Summary
This paper investigates the distribution of the maximum occupancy in the classical balls-into-bins problem under a “stopping-on-occupancy” rule: balls are sequentially allocated until nearly all bins reach a prescribed capacity. As the number of bins $n \to \infty$, the extremal distribution fails to converge and instead exhibits near-periodic oscillations on a logarithmic scale. To characterize this phenomenon, the authors derive a refined asymptotic approximation—the convolution of two independent Gumbel distributions—and rigorously establish a bound on the approximation error. Methodologically, they introduce a novel lattice point process modeling framework coupled with multiset interpolation, enabling a unified treatment of both Poissonized and deterministic settings by embedding discrete occupancy counts into a continuous point process on $\mathbb{R}$. The analysis further yields precise asymptotics for the moments of the maximum occupancy and the probability of ties, substantially advancing the theoretical understanding of extremal behavior in stopped occupancy processes.
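The approximating law named above is the convolution of two independent Gumbel distributions, i.e. the law of $X + Y$ for independent standard Gumbel variables $X, Y$. A minimal sketch of sampling from it via the inverse-CDF method (the centering and scaling constants from the paper are not reproduced here, and the function name is illustrative):

```python
import math
import random

def sample_gumbel_convolution(rng):
    """Draw one sample from the law of X + Y, where X and Y are
    independent standard Gumbel variables.

    Uses the inverse CDF: if U ~ Uniform(0, 1), then
    -log(-log(U)) is standard Gumbel distributed.
    """
    x = -math.log(-math.log(rng.random()))
    y = -math.log(-math.log(rng.random()))
    return x + y

# Empirical check: the mean of X + Y is twice the
# Euler-Mascheroni constant, about 1.1544.
rng = random.Random(42)
samples = [sample_gumbel_convolution(rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
```

The convolution has no elementary closed-form density, so sampling (or numerical convolution of the two Gumbel densities) is the natural way to inspect it.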
📝 Abstract
We revisit a version of the classic occupancy scheme, where balls are thrown until almost all boxes receive a given number of balls. Special cases are widely known as the coupon-collector and Dixie cup problems. We show that as the number of boxes tends to infinity, the distribution of the maximal occupancy count does not converge, but can be approximated by a convolution of two Gumbel distributions, with the approximating distribution having oscillations close to periodic on a logarithmic scale. We pursue two approaches: one relies on lattice point processes obtained by poissonisation of the number of balls and boxes, and the other employs interpolation of the multiset of occupancy counts to a point process on the reals. In this way we gain considerable insight into known asymptotics obtained previously by mostly analytic tools. Further results concern the moments of maximal occupancy counts and ties for the maximum.
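The stopped scheme described above can be sketched with a short simulation. This is not the paper's construction: as an assumption for illustration, it uses the simpler Dixie cup variant in which *every* box (rather than "almost all") must receive the prescribed number of balls before stopping, and then records the maximal occupancy count:

```python
import random
from collections import Counter

def stopped_occupancy_max(n, capacity, rng):
    """Throw balls uniformly at random into n boxes until every box
    holds at least `capacity` balls; return the maximal occupancy
    count at that stopping time (Dixie cup variant)."""
    counts = [0] * n
    unfilled = n  # boxes still below `capacity`
    while unfilled > 0:
        i = rng.randrange(n)
        counts[i] += 1
        if counts[i] == capacity:
            unfilled -= 1
    return max(counts)

# Empirical distribution of the maximum over repeated runs;
# on a log scale in n this is where the oscillations appear.
rng = random.Random(0)
samples = [stopped_occupancy_max(200, 3, rng) for _ in range(500)]
dist = Counter(samples)
```

Repeating such runs over a geometric range of `n` is one way to see empirically that the distribution of the maximum does not settle down but drifts periodically on a logarithmic scale, as the abstract states.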