🤖 AI Summary
This work addresses a critical but overlooked security risk in machine learning systems: inconsistent implementations and insufficient statistical validation of pseudorandom number generators (PRNGs) can introduce subtle yet exploitable weaknesses. To mitigate this risk, the paper presents RNGGuard, a lightweight, automated defense that combines static source-code analysis, which identifies unsafe random function calls, with transparent runtime replacement of those calls by cryptographically secure PRNG implementations compliant with established standards. Compatible with mainstream machine learning frameworks, RNGGuard imposes minimal performance overhead while neutralizing attacks that exploit weak randomness generation, hardening ML systems without disrupting existing workflows.
📝 Abstract
Machine learning relies on randomness as a fundamental component of steps such as data sampling, data augmentation, weight initialization, and optimization. Most machine learning frameworks use pseudorandom number generators (PRNGs) as the source of randomness. However, variations in design choices and implementations across frameworks, software dependencies, and hardware backends, along with a lack of statistical validation, can open previously unexplored attack vectors on machine learning systems. Such attacks on randomness sources can be extremely covert and have a history of exploitation in real-world systems. In this work, we examine the role of randomness in the machine learning development pipeline from an adversarial point of view and analyze the implementations of PRNGs in major machine learning frameworks. We present RNGGuard to help machine learning engineers secure their systems with low effort. RNGGuard statically analyzes a target library's source code and identifies instances of random functions and the modules that use them. At runtime, RNGGuard enforces secure execution of random functions by replacing insecure function calls with RNGGuard's own implementations, which meet security specifications. Our evaluations show that RNGGuard is a practical approach to closing existing gaps in securing randomness sources in machine learning systems.