🤖 AI Summary
Random self-reductions (RSRs) let a numerical program check and correct its own outputs, but until now they have had to be derived by hand, which is labor-intensive and does not scale. This work presents Bitween, a method and tool that learns RSRs and program invariants automatically. Bitween combines symbolic analysis with machine learning, and its central finding is that polynomial-time linear regression, a basic optimization method, is sufficient to infer complex RSRs and often outperforms mixed-integer linear programming solvers. The authors also establish a theoretical framework for learning these reductions and introduce RSR-Bench, a benchmark suite of scientific and machine learning functions. On nonlinear invariant benchmarks such as NLA-DigBench, Bitween is reported to surpass state-of-the-art tools in scalability, stability, and sample efficiency. The tool is open-source, distributed as a Python package, and accessible through a web interface that supports C programs.
📝 Abstract
The correctness of computations remains a significant challenge in computer science, with traditional approaches relying on automated testing or formal verification. Self-testing/correcting programs introduce an alternative paradigm, allowing a program to verify and correct its own outputs via randomized reductions, a concept that previously required manual derivation. In this paper, we present Bitween, a method and tool for automated learning of randomized (self)-reductions and program properties in numerical programs. Bitween combines symbolic analysis and machine learning, with a surprising finding: polynomial-time linear regression, a basic optimization method, is not only sufficient but also highly effective for deriving complex randomized self-reductions and program invariants, often outperforming sophisticated mixed-integer linear programming solvers. We establish a theoretical framework for learning these reductions and introduce RSR-Bench, a benchmark suite for evaluating Bitween's capabilities on scientific and machine learning functions. Our empirical results show that Bitween surpasses state-of-the-art tools in scalability, stability, and sample efficiency when evaluated on nonlinear invariant benchmarks like NLA-DigBench. Bitween is open-source as a Python package and accessible via a web interface that supports C language programs.
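To make the idea of learning a randomized self-reduction by linear regression concrete, here is a minimal sketch. It is not Bitween's implementation: the toy program `square`, the choice of query terms, and the helper names (`query_terms`, `least_squares`, `learn_reduction`, `self_correct`) are all assumptions made for illustration. The sketch fits weights so that f(x) is a linear combination of f(x+r), f(x-r), and f(r), then uses the learned identity with random r to recover the right answer from a deliberately faulty program.

```python
import random
from collections import Counter
from fractions import Fraction


def square(x):
    """Toy target program (chosen for illustration): f(x) = x^2."""
    return x * x


def query_terms(program, x, r):
    """Candidate terms: the program queried at points derived from x and r."""
    return [program(x + r), program(x - r), program(r)]


def least_squares(rows, targets):
    """Ordinary least squares: solve (A^T A) w = A^T y exactly with Fractions."""
    n = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    aug = [
        [sum(Fraction(row[i]) * row[j] for row in rows) for j in range(n)]
        + [sum(Fraction(row[i]) * y for row, y in zip(rows, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination (assumes the columns of A are independent).
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[col])]
    return [aug[i][n] for i in range(n)]


def learn_reduction(program, samples):
    """Fit weights w so that program(x) = w . query_terms(program, x, r)."""
    rows = [query_terms(program, x, r) for x, r in samples]
    targets = [program(x) for x, _ in samples]
    return least_squares(rows, targets)


def self_correct(program, x, weights, rng, trials=5):
    """Recompute program(x) through the reduction at random r; majority vote."""
    votes = Counter()
    for _ in range(trials):
        r = rng.randint(1, 10**6)
        terms = query_terms(program, x, r)
        votes[sum(w * t for w, t in zip(weights, terms))] += 1
    return votes.most_common(1)[0][0]


def buggy_square(x):
    """Same program with a deliberately injected fault at a single input."""
    return -1 if x == 7 else x * x


# Learn from a small deterministic grid of (x, r) samples.
samples = [(x, r) for x in range(1, 5) for r in range(1, 5)]
weights = learn_reduction(square, samples)
print(weights)

# The direct call hits the fault; the reduction routes around it.
print(buggy_square(7), self_correct(buggy_square, 7, weights, random.Random(0)))
```

For this toy case the fitted weights correspond to the identity f(x) = ½·f(x+r) + ½·f(x−r) − f(r), which holds for every r. Because r is random, the three queries almost never land on the faulty input, so the majority vote returns the correct value. Exact rational arithmetic is used here only to keep the example dependency-free; a floating-point least-squares solver would be the more typical choice.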