🤖 AI Summary
This work addresses the challenge of passively learning symbolic automata over large or infinite alphabets in the absence of a query oracle. The authors propose SAI, the first algorithm to extend the RPNI framework to symbolic automata, combining state merging with an RTI-inspired state-splitting operation to infer transition predicates of the form $a \leq x < b$ in a top-down manner over monotonic algebras. Theoretical analysis establishes that SAI admits polynomial-size characteristic samples, which underpins efficient identification in the limit. The result extends passive learning to infinite alphabets and offers a new tool for applications such as software verification.
📝 Abstract
Symbolic automata extend classical finite-state automata to handle large or infinite alphabets by labeling transitions with predicates from a Boolean algebra. Many results from automata theory have been lifted to this model, and it has proved useful, for example, in several software verification applications. Here, we tackle the passive learning problem of identification in the limit, i.e., learning a model from a sample without access to an oracle to query. We provide an algorithm, SAI, that efficiently identifies in the limit symbolic automata over any monotonic algebra where the predicates labeling transitions are of the form $a \leq x < b$. The algorithm extends the RPNI framework for passive learning of finite-state automata to symbolic automata thanks to a new splitting operation inspired by RTI, a passive learning algorithm for deterministic real-time automata, a subclass of timed automata. The learning algorithm combines merging and splitting of states, which allows it to infer the predicates on transitions in a top-down fashion. We prove that SAI admits polynomial-size characteristic samples.
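To make the model concrete, here is a minimal Python sketch of a deterministic symbolic automaton whose transitions are guarded by half-open intervals $a \leq x < b$. It is not the SAI algorithm: the names `Interval`, `SymbolicAutomaton`, `split`, and the example automaton `A` are hypothetical and chosen only for illustration, and the alphabet is assumed to be the real numbers with their usual order. The `split` helper only shows how one interval guard can be cut into two; how SAI chooses split points, and which states it merges or splits, is not specified in the abstract.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interval:
    """Predicate lo <= x < hi over an ordered domain (illustrative only)."""
    lo: float
    hi: float

    def holds(self, x: float) -> bool:
        return self.lo <= x < self.hi

    def split(self, t: float) -> tuple[Interval, Interval]:
        """Cut [lo, hi) at t into [lo, t) and [t, hi)."""
        if not (self.lo < t < self.hi):
            raise ValueError("split point must lie strictly inside the interval")
        return Interval(self.lo, t), Interval(t, self.hi)


@dataclass
class SymbolicAutomaton:
    initial: int
    accepting: set[int]
    # transitions[state] is a list of (predicate, target state) pairs;
    # the predicates leaving one state are assumed to be pairwise disjoint.
    transitions: dict[int, list[tuple[Interval, int]]] = field(default_factory=dict)

    def step(self, state: int, x: float) -> int | None:
        """Follow the unique transition whose predicate holds for x, if any."""
        for pred, target in self.transitions.get(state, []):
            if pred.holds(x):
                return target
        return None

    def accepts(self, word: list[float]) -> bool:
        state: int | None = self.initial
        for x in word:
            state = self.step(state, x)
            if state is None:  # no guard matches: reject
                return False
        return state in self.accepting


# Hypothetical two-state automaton over the reals.
A = SymbolicAutomaton(
    initial=0,
    accepting={1},
    transitions={
        0: [(Interval(0, 10), 1), (Interval(10, 20), 0)],
        1: [(Interval(0, 20), 1)],
    },
)
```

In this sketch, `A.accepts([5])` follows the guard $0 \leq x < 10$ into the accepting state, while `A.accepts([25])` rejects because no guard leaving state 0 covers 25. Splitting `Interval(0, 20)` at 10 produces exactly the two guards that leave state 0.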