Ranked MSO-enumeration over compressed words

📅 2026-06-02

🤖 AI Summary
This work addresses the problem of efficiently enumerating the results of a fixed monadic second-order (MSO) query over a string given in compressed form as a straight-line program (SLP), with the results output in an MSO-definable order. The authors make the factorization tree technique available for grammar-compressed strings, combining SLP representations, MSO logic, and constant-delay enumeration methods. Their approach achieves preprocessing time linear in the size of the SLP and constant delay between consecutive outputs in the required order. As a consequence, for any fixed polyregular function applied to the compressed string, the symbols of the result can be enumerated from left to right with constant delay after linear preprocessing. According to the abstract, this is the first result for ranked query enumeration on compressed data.
📝 Abstract
It is shown that the ranked query enumeration problem for a fixed MSO-query on strings can be solved with linear preprocessing and constant delay in the grammar-compressed setting, where the input string is given by a so-called straight-line program, i.e., a context-free grammar that produces exactly one string. Moreover, 'ranked' means that the output tuples of the MSO-query are printed in a specific order that has to be MSO-definable. This is the first result for ranked query enumeration on compressed data. A corollary of this result is that for a fixed polyregular function $f$ and a word $w$ that is given by a straight-line program of size $n$, one can list after preprocessing time $\mathcal{O}(n)$ the symbols in $f(w)$ from left to right with constant delay, which generalizes a result of Bojańczyk for the case where $w$ is uncompressed. The proofs for these results are based on factorization trees, which are made accessible to the grammar-compressed setting (a contribution of independent interest).
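To make the grammar-compressed setting concrete, here is a minimal Python sketch of a straight-line program and two textbook operations on it. It is not the paper's enumeration algorithm and does not use factorization trees. It only illustrates why work proportional to the grammar size $n$ can suffice even when the derived string is far longer. Everything named here (`fibonacci_slp`, `lengths`, `symbol_at`, `dfa_run`, `expand`, and the rule encoding) is an assumption made for illustration. The DFA check stands in for a Boolean regular property, which by Büchi's theorem is the simplest kind of MSO-definable query on strings.

```python
from typing import Dict, Tuple, Union

# Assumed encoding: a rule is a terminal symbol or a pair of nonterminals.
Rule = Union[str, Tuple[str, str]]


def fibonacci_slp(k: int) -> Dict[str, Rule]:
    """Hypothetical example SLP: F1 -> b, F2 -> a, Fi -> F(i-1) F(i-2)."""
    rules: Dict[str, Rule] = {"F1": "b", "F2": "a"}
    for i in range(3, k + 1):
        rules[f"F{i}"] = (f"F{i - 1}", f"F{i - 2}")
    return rules  # insertion order is bottom-up: operands precede their users


def lengths(rules: Dict[str, Rule]) -> Dict[str, int]:
    """Length of the string derived from each nonterminal, in one pass."""
    out: Dict[str, int] = {}
    for name, rule in rules.items():
        out[name] = 1 if isinstance(rule, str) else out[rule[0]] + out[rule[1]]
    return out


def symbol_at(rules: Dict[str, Rule], lens: Dict[str, int], start: str, i: int) -> str:
    """Symbol at 0-based position i, found by descending the derivation tree."""
    name = start
    while True:
        rule = rules[name]
        if isinstance(rule, str):
            return rule
        left, right = rule
        if i < lens[left]:
            name = left
        else:
            i -= lens[left]
            name = right


def dfa_run(rules, start, delta, states, q0):
    """State a DFA reaches on the derived string, without decompressing.

    For each nonterminal we store the state-to-state map induced by its
    string; a rule A -> B C composes the maps of B and C.
    """
    trans = {}
    for name, rule in rules.items():
        if isinstance(rule, str):
            trans[name] = {q: delta[(q, rule)] for q in states}
        else:
            left, right = rule
            trans[name] = {q: trans[right][trans[left][q]] for q in states}
    return trans[start][q0]


def expand(rules: Dict[str, Rule], name: str) -> str:
    """Full decompression; only sensible for small examples."""
    rule = rules[name]
    if isinstance(rule, str):
        return rule
    return expand(rules, rule[0]) + expand(rules, rule[1])
```

With `fibonacci_slp(40)` the grammar has 40 rules while the derived word has over a hundred million symbols, yet `lengths` and `dfa_run` each touch every rule only once. Ranked enumeration of MSO-query results with constant delay needs considerably more machinery than this sketch shows.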
Problem

Research questions and friction points this paper is trying to address.

ranked enumeration
MSO query
compressed words
straight-line program
grammar-compressed
Innovation

Methods, ideas, or system contributions that make the work stand out.

ranked enumeration
MSO queries
grammar-compressed strings
straight-line programs
factorization trees