On the sensitivity of CDAWG-grammars

📅 2025-03-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates how a single-character edit affects the size of the CDAWG-grammar, a grammar-based compression derived from the Compact Directed Acyclic Word Graph (CDAWG). It proves an additive upper bound: after any single-character insertion, deletion, or substitution, the grammar size increases by at most $4e + 4$, where $e$ is the number of edges in the CDAWG before the edit. The method combines structural analysis of the CDAWG, suffix automaton construction, and localized modeling of edit-induced changes, characterizing how an edit affects states and transitions and how these perturbations propagate through grammar derivations. The bound shows that a single edit increases the grammar size by at most an additive term linear in $e$. The result is relevant to dynamic text compression and to index updates over evolving string collections.

📝 Abstract
The compact directed acyclic word graph (CDAWG) [Blumer et al. 1987] of a string is the minimal compact automaton that recognizes all the suffixes of the string. CDAWGs are known to be useful for various string tasks, including text pattern searching, data compression, and pattern discovery. The CDAWG-grammar [Belazzougui & Cunial 2017] is a grammar-based text compression method built on the CDAWG. In this paper, we prove that the CDAWG-grammar size $g$ can increase by at most an additive term of $4e + 4$ over the original after any single-character edit operation is performed on the input string, where $e$ denotes the number of edges in the corresponding CDAWG before the edit.
Problem

Research questions and friction points this paper is trying to address.

Analyzes sensitivity of CDAWG-grammar size to single-character edits.
Proves that the CDAWG-grammar size increases by at most 4e + 4 after a single-character edit, where e is the number of CDAWG edges before the edit.
Focuses on grammar-based text compression using CDAWG structures.
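
To make the setting concrete, here is a minimal Python sketch of what "any single-character edit" ranges over, together with the arithmetic of the stated bound. The function names `single_char_edits` and `grammar_size_upper_bound` are hypothetical and not taken from the paper, and the values of `g` and `e` in the usage lines are made-up inputs; the sketch does not build a CDAWG or a CDAWG-grammar.

```python
from typing import Iterator


def single_char_edits(text: str, alphabet: str) -> Iterator[str]:
    """Yield every string reachable from `text` by one insertion,
    deletion, or substitution over `alphabet` (duplicates possible)."""
    n = len(text)
    # Insertions: place each character at each of the n + 1 gaps.
    for i in range(n + 1):
        for c in alphabet:
            yield text[:i] + c + text[i:]
    for i in range(n):
        # Deletion of the character at position i.
        yield text[:i] + text[i + 1:]
        # Substitutions: replace position i with a different character.
        for c in alphabet:
            if c != text[i]:
                yield text[:i] + c + text[i + 1:]


def grammar_size_upper_bound(g: int, e: int) -> int:
    """Upper bound on the CDAWG-grammar size after one edit, as stated
    in the abstract: the original size g plus at most 4e + 4."""
    return g + 4 * e + 4


# Hypothetical usage: g and e are illustrative numbers, not measured values.
edited = set(single_char_edits("ab", "ab"))
bound = grammar_size_upper_bound(g=10, e=6)
```

The bound applies to each edited string individually: whichever string in `edited` replaces the input, the new grammar size would be at most `bound`, assuming `g` and `e` were measured on the original string.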
Innovation

Methods, ideas, or system contributions that make the work stand out.

CDAWG-grammar for text compression
Minimal compact automaton for string tasks
Additive factor bound for edit operations