Learning CNF formulas from uniform random solutions in the local lemma regime

📅 2025-11-04
🤖 AI Summary
This work studies learning an $n$-variable $k$-CNF formula $\Phi$ from i.i.d. uniform random satisfying assignments, which is equivalent to learning a Boolean Markov random field (MRF) with $k$-wise hard constraints. Revisiting Valiant's algorithm, the authors show that it exactly learns (1) $k$-CNFs with bounded clause intersection size under Lovász local lemma (LLL) type conditions from $O(\log n)$ samples, and (2) random $k$-CNFs near the satisfiability threshold from $\tilde{O}(n^{\exp(-\sqrt{k})})$ samples. Both results improve on the previous $O(n^k)$ sample complexity. The paper also establishes new information-theoretic lower bounds on sample complexity for both exact and approximate learning.

📝 Abstract
We study the problem of learning an $n$-variable $k$-CNF formula $\Phi$ from its i.i.d. uniform random solutions, which is equivalent to learning a Boolean Markov random field (MRF) with $k$-wise hard constraints. Revisiting Valiant's algorithm (Commun. ACM'84), we show that it can exactly learn (1) $k$-CNFs with bounded clause intersection size under Lovász local lemma type conditions, from $O(\log n)$ samples; and (2) random $k$-CNFs near the satisfiability threshold, from $\widetilde{O}(n^{\exp(-\sqrt{k})})$ samples. These results significantly improve the previous $O(n^k)$ sample complexity. We further establish new information-theoretic lower bounds on sample complexity for both exact and approximate learning from i.i.d. uniform random solutions.
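To make the setting concrete, here is a minimal sketch of the elimination idea usually attributed to Valiant's algorithm: start from every possible $k$-clause and keep only those that no observed solution violates. This is an illustration under stated assumptions, not the paper's implementation. The function names, the literal encoding, the toy formula, and the brute-force sampler (a stand-in for the i.i.d. uniform solutions the abstract assumes) are all hypothetical.

```python
from itertools import combinations, product
import random

# Assumed encoding: a literal is (variable_index, sign), where sign=True
# means x_i and sign=False means NOT x_i. A clause is a frozenset of
# k literals over k distinct variables.

def all_k_clauses(n, k):
    """Enumerate every clause on exactly k distinct variables out of n."""
    for variables in combinations(range(n), k):
        for signs in product([True, False], repeat=k):
            yield frozenset(zip(variables, signs))

def violates(assignment, clause):
    """A clause is violated when every one of its literals is false."""
    return all(assignment[v] != sign for v, sign in clause)

def learn_by_elimination(samples, n, k):
    """Keep exactly the k-clauses that no sample violates."""
    return {c for c in all_k_clauses(n, k)
            if not any(violates(s, c) for s in samples)}

def uniform_solutions(formula, n, m, rng):
    """Toy sampler: enumerate all 2^n assignments, then draw m solutions
    independently and uniformly. Only feasible for very small n."""
    solutions = [a for a in product([False, True], repeat=n)
                 if not any(violates(a, c) for c in formula)]
    return [rng.choice(solutions) for _ in range(m)]

# Hypothetical toy instance: (x0 OR x1 OR x2) AND (NOT x3 OR x4 OR NOT x5).
formula = {
    frozenset({(0, True), (1, True), (2, True)}),
    frozenset({(3, False), (4, True), (5, False)}),
}
rng = random.Random(0)
samples = uniform_solutions(formula, n=6, m=400, rng=rng)
learned = learn_by_elimination(samples, n=6, k=3)
```

Because a true clause is never violated by a solution, the output always contains the hidden formula. The sample complexity question is how many samples are needed before every other candidate clause has been violated at least once. The sketch enumerates all $\binom{n}{k} 2^k$ candidate clauses, so it is only meant for toy sizes.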
Problem

Research questions and friction points this paper is trying to address.

Learning k-CNF formulas from uniform random solution samples efficiently
Improving sample complexity from O(n^k) to logarithmic or sublinear bounds
Establishing information-theoretic lower bounds for exact and approximate learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning k-CNFs from uniform random solutions
Applying Valiant's algorithm under local lemma type conditions
Achieving logarithmic sample complexity for exact learning
Weiming Feng
The University of Hong Kong
randomized algorithms
Xiongxin Yang
Department of Computer Science, University of California, Santa Barbara
Yixiao Yu
Nanjing University
cs theory
Yiyao Zhang
State Key Laboratory for Novel Software Technology, New Cornerstone Science Laboratory, Nanjing University