🤖 AI Summary
This work addresses limitations of finite-precision random variate generation, including missing formal correctness guarantees, susceptibility to numerical overflow, suboptimal entropy efficiency, and inconsistent interfaces. We propose a fully automated method that synthesizes exact random variate generators from a cumulative distribution function (CDF) specified as a finite-precision numerical program, building on the Knuth–Yao entropy-optimal sampling framework. The approach supports any binary number format (e.g., floating-point, fixed-point, posits), operates at the same precision used to specify the CDF, requires no arbitrary-precision arithmetic, is guaranteed not to overflow, and consumes the information-theoretically minimal number of random bits per output variate. We implement a C library with a consistent API and evaluate it against the GNU Scientific Library (GSL), showing higher accuracy, entropy efficiency, and automation at competitive runtime. Our key contribution is a space-time optimal implementation of the class of entropy-optimal generators, on which the method and its guarantees rest.
📝 Abstract
This article introduces a new approach to principled and practical random variate generation with formal guarantees. The key idea is to first specify the desired probability distribution in terms of a finite-precision numerical program that defines its cumulative distribution function (CDF), and then generate exact random variates according to this CDF. We present a universal and fully automated method to synthesize exact random variate generators given any numerical CDF implemented in any binary number format, such as floating-point, fixed-point, and posits. The method is guaranteed to operate with the same precision used to specify the CDF, does not overflow, avoids expensive arbitrary-precision arithmetic, and exposes a consistent API. The method rests on a novel space-time optimal implementation for the class of generators that attain the information-theoretically optimal Knuth and Yao entropy rate, consuming the least possible number of input random bits per output variate. We develop a random variate generation library using our method in C and evaluate it on a diverse set of "continuous" and "discrete" distributions, showing competitive runtime with the state-of-the-art GNU Scientific Library while delivering higher accuracy, entropy efficiency, and automation.
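To make the idea of exact sampling from a finite-precision CDF concrete, here is a minimal sketch in C. It is not the paper's algorithm or library API: it assumes the CDF is already tabulated as k-bit fixed-point integers (rather than given as a numerical program), and it uses the textbook column-scanning form of Knuth–Yao sampling instead of the space-time optimal implementation described above. The names `bit_source`, `next_bit`, and `sample_from_cdf` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit source for illustration: replays the bits of one 64-bit
   word, most significant bit first. A real generator would draw fresh bits
   from an entropy source instead. */
typedef struct {
    uint64_t word;  /* bits to replay */
    unsigned used;  /* number of bits consumed so far */
} bit_source;

int next_bit(bit_source *src) {
    assert(src->used < 64);
    int b = (int)((src->word >> (63u - src->used)) & 1u);
    src->used++;
    return b;
}

/* Draws an outcome in [0, n) with probability exactly
   (cdf[i] - cdf[i-1]) / 2^k, taking cdf[-1] as 0.
   Assumed preconditions: cdf is non-decreasing, cdf[n-1] == 2^k,
   1 <= k <= 31. Only integer arithmetic is used. */
size_t sample_from_cdf(const uint32_t *cdf, size_t n, unsigned k,
                       bit_source *src) {
    assert(n > 0 && k >= 1 && k <= 31);
    assert(cdf[n - 1] == (UINT32_C(1) << k));

    int64_t d = 0;  /* index of the current node within its tree level */
    for (unsigned level = 0; level <= k; level++) {
        /* Level 0 is the root; it is a leaf only if some outcome has
           probability 1. Every deeper level costs one random bit. */
        if (level > 0)
            d = 2 * d + (1 - next_bit(src));

        /* Scan the binary digit of weight 2^-level of each probability.
           Each set digit is one leaf at this level of the tree. */
        for (size_t i = n; i-- > 0;) {
            uint32_t p = cdf[i] - (i > 0 ? cdf[i - 1] : 0u);
            d -= (int64_t)((p >> (k - level)) & 1u);
            if (d < 0)
                return i;  /* landed on a leaf labeled i */
        }
    }
    return n - 1;  /* unreachable when the preconditions hold */
}
```

For example, with the table `{3, 8}` and `k = 3` (probabilities 3/8 and 5/8), a first random bit of 1 returns outcome 1 after consuming a single bit, while other bit prefixes consume two or three bits. Because the probabilities sum to exactly 2^k, the loop always returns within k random bits, and no floating-point or arbitrary-precision arithmetic is involved.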