🤖 AI Summary
This work studies proper sample compression schemes for hypergraphs induced by balls in graphs. For graphs of bounded treewidth or bounded clique-width, the authors construct proper schemes, whereas the general construction of Moran and Yehudayoff for hypergraphs of bounded VC-dimension is improper. A proper scheme recovers not only the labels of the original sample but also a hyperedge (ball) consistent with those labels. For graphs of treewidth at most $t$, the scheme has size $O(t \log t)$, which is tight up to the logarithmic factor and improves the quadratic (improper) bound that follows from Moran and Yehudayoff. An analogous result holds for graphs of clique-width at most $t$, a class that also contains dense graphs.
📝 Abstract
Sample compression schemes were defined by Littlestone and Warmuth (1986) as an abstraction of the structure underlying many learning algorithms. In a sample compression scheme, we are given a large sample of vertices of a fixed hypergraph with labels indicating the containment in some hyperedge. The task is to compress the sample in such a way that we can retrieve the labels of the original sample. The size of a sample compression scheme is the amount of information that is kept in the compression. Every hypergraph with a sample compression scheme of bounded size must have bounded VC-dimension. Conversely, Moran and Yehudayoff (J. ACM, 2016) showed that every hypergraph of bounded VC-dimension admits a sample compression scheme of bounded size. We study a specific class of hypergraphs emerging from balls in graphs. The schemes that we construct (contrary to the ones constructed by Moran and Yehudayoff) are \textit{proper}, meaning that we retrieve not only the labeling of the original sample but also a hyperedge (ball) consistent with the original labeling. First, we prove that for every graph $G$ of treewidth at most $t$, the hypergraph of balls in $G$ has a proper sample compression scheme of size $\mathcal{O}(t\log t)$; this is tight up to the logarithmic factor and improves the quadratic (improper) bound that follows from the result of Moran and Yehudayoff. Second, we prove an analogous result for graphs of cliquewidth at most $t$.
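To make the notion of a proper scheme concrete, here is a minimal toy sketch for balls on the infinite integer path (a graph of treewidth 1). It is not the construction from the paper. The function names `in_ball`, `compress`, and `reconstruct` are hypothetical, and the sketch assumes the sample is labeled by some ball and contains at least one positive vertex. It keeps at most three labeled vertices and returns a ball (center, radius) consistent with the whole sample.

```python
# Toy proper sample compression scheme for balls on the integer path.
# Assumptions (not from the paper): vertices are integers, a ball is
# B(c, r) = {v : |v - c| <= r}, and the sample has at least one positive.

def in_ball(v, center, radius):
    """Return True if vertex v lies in the ball B(center, radius)."""
    return abs(v - center) <= radius

def compress(sample):
    """Keep the extreme positives, plus one adjacent negative if needed."""
    positives = [v for v, label in sample if label]
    lo, hi = min(positives), max(positives)
    kept = {(lo, True), (hi, True)}
    # An interval with an even number of vertices is not a ball, so the
    # reconstructed ball must grow by one vertex on some side. Any ball
    # containing lo..hi also contains lo - 1 or hi + 1, so at most one of
    # the two can be a negative in the sample; remember it if it exists.
    if (hi - lo) % 2 == 1:
        for u in (lo - 1, hi + 1):
            if (u, False) in sample:
                kept.add((u, False))
                break
    return kept

def reconstruct(kept):
    """Return (center, radius) of a ball consistent with the full sample."""
    positives = [v for v, label in kept if label]
    negatives = {v for v, label in kept if not label}
    lo, hi = min(positives), max(positives)
    if (hi - lo) % 2 == 1:
        # Extend away from the kept negative (or to the left by default).
        if lo - 1 in negatives:
            hi += 1
        else:
            lo -= 1
    return (lo + hi) // 2, (hi - lo) // 2

# Sample labeled by the ball B(4, 2) = {2, ..., 6}.
sample = [(1, False), (3, True), (6, True), (7, False), (9, False)]
print(reconstruct(compress(sample)))  # (4, 2)
```

The scheme is proper because `reconstruct` outputs an actual ball rather than only a labeling. In this toy setting the size is at most 3; the paper's $\mathcal{O}(t\log t)$ bound concerns general graphs of treewidth at most $t$ and requires a different argument.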