🤖 AI Summary
This paper investigates the probabilistic membership problem for context-free languages (CFLs): given a probabilistic word, in which each position is drawn independently from a prescribed distribution over letters, compute the probability that the resulting word belongs to a target CFL. This problem generalizes both counting the words of a given length in a CFL and counting the completions of a partial word that belong to it. The authors show that the problem is solvable in polynomial time for unambiguous CFLs (uCFLs) and, more generally, for the class of *poly-slicewise-unambiguous* languages that they introduce, which includes some inherently ambiguous languages. Conversely, the problem can be #P-hard already for unions of two linear uCFLs, and for nondeterministic counter automata, even Parikh automata with a single counter. Methodologically, they use circuit classes from knowledge compilation for tractable counting; extending these circuits with negation yields tractability for the language of primitive words and for concatenations of two palindromes. Finally, the meta-problem of deciding, given a CFG, whether its probabilistic membership problem is tractable or #P-hard is shown to be conditionally undecidable.
📝 Abstract
We study the membership problem for context-free languages (CFLs) L on probabilistic words, which specify for each position a probability distribution over the letters (assuming independence across positions). Our task is to compute, given a probabilistic word, the probability that a word drawn according to the distribution belongs to L. This problem generalizes the problem of counting how many words of length n belong to L, and that of counting how many completions of a partial word belong to L.
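For concreteness, the quantity being computed can be written as follows. The symbols $\Sigma$ (the alphabet), $p_i$ (the distribution at position $i$) and $w_i$ (the $i$-th letter of $w$) are notation assumed here for illustration, not taken from the paper.

```latex
\Pr_{w \sim (p_1, \dots, p_n)}\bigl[w \in L\bigr]
  \;=\; \sum_{w \,\in\, L \cap \Sigma^n} \; \prod_{i=1}^{n} p_i(w_i)
% Counting as a special case: if every p_i is uniform over \Sigma, then
% |L \cap \Sigma^n| \;=\; |\Sigma|^n \cdot \Pr[w \in L].
```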
We show that this problem is in polynomial time for unambiguous context-free languages (uCFLs), but can be #P-hard already for unions of two linear uCFLs. More generally, we show that the problem is in polynomial time for so-called poly-slicewise-unambiguous languages, where given a length n we can tractably compute a uCFL for the words of length n in the language. This class includes some inherently ambiguous languages, and its tractability implies that of bounded CFLs and of languages recognized by unambiguous polynomial-time counter automata; but we show that the problem can be #P-hard for nondeterministic counter automata, even for Parikh automata with a single counter. We then introduce classes of circuits from knowledge compilation that we use for tractable counting, and show that this covers the tractability of poly-slicewise-unambiguous languages and of some CFLs that are not poly-slicewise-unambiguous. Extending these circuits with negation further allows us to show tractability for the language of primitive words, and for the language of concatenations of two palindromes. We finally show the conditional undecidability of the meta-problem that asks, given a CFG, whether the probabilistic membership problem for that CFG is tractable or #P-hard.
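To illustrate why unambiguity helps, here is a minimal Python sketch of one standard way to obtain polynomial time for a uCFL: a CYK-style dynamic program that sums probabilities over derivations. It is not claimed to be the paper's construction. It assumes the grammar is unambiguous, trimmed, and in Chomsky normal form; the function name `membership_probability`, the grammar encoding, and the example grammar for { a^n b^n : n ≥ 1 } are all hypothetical choices made for this example.

```python
from fractions import Fraction

def membership_probability(word, start, unary, binary):
    """Probability that a random word drawn from `word` is derived from `start`.

    word   : list of dicts, word[i][a] = probability of letter a at position i
    unary  : dict X -> set of letters a with a rule X -> a
    binary : dict X -> list of pairs (Y, Z) with a rule X -> Y Z
    Assumes (not checked) that the grammar is unambiguous and trimmed, so the
    events summed below are pairwise disjoint and their probabilities add up.
    """
    n = len(word)
    prob = {}  # prob[(i, j, X)] = Pr[letters i..j-1 form a word derived from X]
    for i in range(n):
        for X, letters in unary.items():
            prob[(i, i + 1, X)] = sum(word[i].get(a, 0) for a in letters)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for X, rules in binary.items():
                total = 0
                for Y, Z in rules:
                    for k in range(i + 1, j):
                        # Disjoint spans are independent, so probabilities multiply.
                        total += prob.get((i, k, Y), 0) * prob.get((k, j, Z), 0)
                prob[(i, j, X)] = total
    return prob.get((0, n, start), 0)

# Hypothetical unambiguous CNF grammar for { a^n b^n : n >= 1 }:
#   S -> A B | A T,   T -> S B,   A -> a,   B -> b
UNARY = {"A": {"a"}, "B": {"b"}}
BINARY = {"S": [("A", "B"), ("A", "T")], "T": [("S", "B")]}

half = Fraction(1, 2)
uniform4 = [{"a": half, "b": half} for _ in range(4)]
p = membership_probability(uniform4, "S", UNARY, BINARY)
count = p * 2 ** 4  # uniform case: number of words of length 4 in the language
```

With the uniform distribution, multiplying the result by |Σ|^n recovers the number of words of length n in the language, which is the counting special case mentioned in the abstract. For an ambiguous grammar the same recurrence would count a word once per derivation, so the output would no longer be a probability.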