🤖 AI Summary
This work investigates how injective morphisms affect the number $r$ of equal-letter runs in the Burrows–Wheeler Transform (BWT), a key repetitiveness measure in compressed indexing, through the notion of *BWT-run sensitivity*. For binary alphabets, we characterize the morphisms that increase the number of BWT runs by at most a bounded additive amount, and show that this class coincides with the primitivity-preserving morphisms, that is, those mapping primitive words to primitive words. We also prove that membership of a binary morphism in this class can be decided in time polynomial in the total length of the images of the two letters. In addition, we explore new structural and combinatorial properties of synchronizing and recognizable morphisms. Together, these results connect BWT-based compressibility, the theory of codes, and symbolic dynamics.
📝 Abstract
We study how the application of injective morphisms affects the number $r$ of equal-letter runs in the Burrows-Wheeler Transform (BWT). This parameter has emerged as a key repetitiveness measure in compressed indexing. We focus on the notion of BWT-run sensitivity after application of an injective morphism. For binary alphabets, we characterize the class of morphisms that preserve the number of BWT-runs up to a bounded additive increase, by showing that it coincides with the known class of primitivity-preserving morphisms, which are those that map primitive words to primitive words. We further prove that deciding whether a given binary morphism has bounded BWT-run sensitivity is possible in polynomial time with respect to the total length of the images of the two letters. Additionally, we explore new structural and combinatorial properties of synchronizing and recognizable morphisms. These results establish new connections between BWT-based compressibility, code theory, and symbolic dynamics.
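To make the quantities in the abstract concrete, here is a minimal Python sketch that computes the number of BWT runs of a word before and after applying a morphism, and tests primitivity. It is only an illustration under stated assumptions, not the paper's method: the BWT is taken over sorted cyclic rotations without an end-of-string marker, the construction is the naive quadratic one, and all function names (`bwt`, `bwt_runs`, `apply_morphism`, `is_primitive`, `run_increase`) as well as the two example morphisms are hypothetical choices made for this example.

```python
def bwt(word: str) -> str:
    """Naive BWT: sort all cyclic rotations and read off the last column.

    Assumption: no end-of-string marker is appended.
    """
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(text: str) -> int:
    """Number of maximal equal-letter runs in text."""
    return sum(1 for i, c in enumerate(text) if i == 0 or c != text[i - 1])


def bwt_runs(word: str) -> int:
    """The parameter r: number of equal-letter runs in the BWT of word."""
    return count_runs(bwt(word))


def apply_morphism(word: str, images: dict[str, str]) -> str:
    """Apply a morphism given by the images of its letters."""
    return "".join(images[letter] for letter in word)


def is_primitive(word: str) -> bool:
    """A nonempty word is primitive iff it is not a proper power,
    i.e. iff it does not occur strictly inside word + word."""
    return len(word) > 0 and word not in (word + word)[1:-1]


def run_increase(word: str, images: dict[str, str]) -> int:
    """Additive change in r caused by applying the morphism to word."""
    return bwt_runs(apply_morphism(word, images)) - bwt_runs(word)


# Two illustrative injective binary morphisms (hypothetical examples).
thue_morse = {"a": "ab", "b": "ba"}
not_preserving = {"a": "a", "b": "bab"}  # maps the primitive word "ab" to "abab"

if __name__ == "__main__":
    for name, images in [("thue_morse", thue_morse), ("not_preserving", not_preserving)]:
        image = apply_morphism("ab", images)
        print(name, image, is_primitive(image), run_increase("ab", images))
```

The second morphism is injective because its two images do not commute, yet it sends the primitive word `ab` to the square `abab`, so it is not primitivity-preserving. Evaluating `run_increase` on longer and longer primitive words is one way to experiment with whether the increase appears to stay bounded for a given morphism; such experiments are suggestive only and do not replace the characterization stated in the abstract.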