🤖 AI Summary
Symbolic execution frequently terminates prematurely when it encounters unanalyzable external functions, such as native methods or third-party library calls, for which no source-level implementation is available. To address this, the authors propose a genetic programming-based approach, AutoStub, for automatically synthesizing symbolic stubs: it collects input-output samples by executing the external function on randomly generated inputs, then evolves algebraic expressions that approximate the function's behavior, without requiring manual modeling or additional contextual information. By combining symbolic execution with randomized testing, the approach can also infer language-specific semantics and surface boundary conditions important for testing. In the evaluation, 55% of the tested functions are approximated with over 90% accuracy, and the synthesized stubs let the symbolic executor recover execution paths inaccessible to conventional stubbing techniques, improving both path coverage and testing depth.
📝 Abstract
Symbolic execution is a powerful technique for software testing, but it suffers from limitations when encountering external functions, such as native methods or third-party libraries, whose implementations are unavailable for analysis. Existing solutions often require additional context, expensive SMT solvers, or manual intervention to approximate these functions through symbolic stubs. In this work, we propose AutoStub, a novel approach that leverages genetic programming to automatically generate symbolic stubs for external functions during symbolic execution. When the symbolic executor encounters an external function, AutoStub generates training data by executing the function on randomly generated inputs and collecting the outputs. Genetic programming then derives expressions that approximate the behavior of the function and serve as symbolic stubs. These automatically generated stubs allow the symbolic executor to continue the analysis without manual intervention, enabling the exploration of program paths that were previously intractable. We demonstrate that AutoStub can approximate external functions with over 90% accuracy for 55% of the functions evaluated, and that it can infer language-specific behaviors that reveal edge cases crucial for software testing.
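The pipeline described above, random testing to collect input-output samples followed by genetic programming that evolves an approximating expression, can be sketched as a small toy in Python. This is an illustrative simplification, not AutoStub's implementation: the target `external_fn`, the operator set, and the mutation-only evolution loop are all assumptions made for the example.

```python
import operator
import random

# Hypothetical stand-in for an external function (e.g., a native method)
# whose source code is unavailable to the symbolic executor.
def external_fn(x):
    return abs(x) + 1

# Operator set over which expression trees are evolved (an assumption;
# the real system would use a richer, language-aware grammar).
OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "max": max, "min": min}

def random_tree(depth=3):
    """Grow a random expression tree over the variable x and small constants."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(["x", random.randint(-2, 2)])
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    """Interpret an expression tree on a concrete input."""
    if tree == "x":
        return x
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree, samples):
    """Total absolute error on the input-output samples (lower is better)."""
    return sum(abs(evaluate(tree, x) - y) for x, y in samples)

def mutate(tree, p=0.2):
    """Replace random subtrees with freshly grown ones."""
    if random.random() < p:
        return random_tree(2)
    if isinstance(tree, tuple):
        op, left, right = tree
        return (op, mutate(left, p), mutate(right, p))
    return tree

def synthesize_stub(samples, pop_size=60, generations=40):
    """Evolve an expression that approximates the sampled behavior."""
    population = [random_tree() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, samples))
        if fitness(population[0], samples) == 0:
            break  # exact match on all collected samples
        survivors = population[:pop_size // 2]
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in survivors]
    return min(population, key=lambda t: fitness(t, samples))

random.seed(0)
# "Random testing" phase: sample the external function on random inputs.
samples = [(x, external_fn(x)) for x in random.sample(range(-50, 50), 20)]
stub = synthesize_stub(samples)
print("stub expression:", stub, "error:", fitness(stub, samples))
```

The best evolved tree is a closed-form algebraic expression, so a symbolic executor could translate it into solver constraints and continue past the external call instead of terminating the path.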