Universal Scalability in Declarative Program Analysis (with Choice-Based Combination Pruning)

📅 2025-03-07
🤖 AI Summary
To address the scalability bottleneck in Datalog-based program analysis, this paper proposes a general-purpose pruning method built on the *choice* construct. The method limits evaluation adaptively: it prunes evaluation results when a programmer-controlled projection of a relation exceeds a desired cardinality, and it requires only minimal, local changes rather than an understanding of the existing analysis logic. The key contribution is a near-universal construction that turns the *choice* mechanism, supported natively in engines such as Soufflé, into a declarative pruning technique applicable to practically any Datalog analysis architecture. Applied to Doop (for Java bytecode) and the main client analyses of Gigahorse (for Ethereum smart contracts), it yields speedups of over 20× on the hardest inputs, with a near-negligible sacrifice in completeness.

📝 Abstract
In this work, we present a simple, uniform, and elegant solution to the scalability problem of declarative program analysis, with stunning practical effectiveness and application to virtually any Datalog-based analysis. The approach leverages the choice construct, supported natively in modern Datalog engines such as Soufflé. The choice construct allows the definition of functional dependencies in a relation and has been used in the past for expressing worklist algorithms. We show a near-universal construction that allows the choice construct to flexibly limit the evaluation of predicates. The technique is applicable to practically any analysis architecture, since it adaptively prunes evaluation results when a (programmer-controlled) projection of a relation exceeds a desired cardinality. We apply the technique to what are probably the largest pre-existing Datalog analysis frameworks: Doop (for Java bytecode) and the main client analyses from the Gigahorse framework (for Ethereum smart contracts). Without needing to understand the existing analysis logic, and with minimal, local-only changes, the performance of each framework increases dramatically, by over 20x for the hardest inputs, with a near-negligible sacrifice in completeness.
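The functional-dependency behavior of the choice construct can be illustrated with a small Python sketch. This emulates the semantics outside of any Datalog engine and is not the paper's construction: the helper `insert_with_choice` and the sample tuples are hypothetical, and the "first tuple wins" order is an assumption standing in for whatever order an engine happens to derive tuples in.

```python
from typing import Callable, Hashable, Iterable, List, Tuple

Row = Tuple[Hashable, ...]


def insert_with_choice(rows: Iterable[Row],
                       key: Callable[[Row], Hashable]) -> List[Row]:
    """Emulate a functional dependency on `key` (hypothetical helper).

    The first row seen for a key is kept; any later row with the same
    key is dropped, so each key maps to exactly one row.
    """
    chosen = {}
    for row in rows:
        k = key(row)
        if k not in chosen:  # the key has not been "chosen" yet
            chosen[k] = row
    return list(chosen.values())


# Hypothetical (variable, heap object) candidates; key is the variable.
candidates = [("x", "h1"), ("x", "h2"), ("y", "h1"), ("x", "h3")]
kept = insert_with_choice(candidates, key=lambda r: r[0])
print(kept)  # [('x', 'h1'), ('y', 'h1')]
```

Because at most one tuple survives per key, a relation declared this way cannot grow beyond the number of distinct keys, which is the property that makes the construct usable as a size limiter.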
Problem

Research questions and friction points this paper is trying to address.

Datalog-based declarative program analysis scales poorly on the hardest inputs.
Existing analyses need a way to limit predicate evaluation flexibly, without changes to their logic.
Large frameworks such as Doop and Gigahorse should benefit with only minimal, local changes.
Innovation

Methods, ideas, or system contributions that make the work stand out.

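Leverages the choice construct supported natively in Datalog engines such as Soufflé.
Adaptively prunes evaluation results when a programmer-controlled projection of a relation exceeds a desired cardinality.
Speeds up Doop and Gigahorse by over 20x on the hardest inputs, with a near-negligible sacrifice in completeness.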
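The cardinality-based pruning summarized above can be sketched imperatively in Python. This is only an illustration under stated assumptions, not the paper's choice-based encoding: the class `BoundedRelation`, its `group_cols`/`proj_cols`/`limit` parameters, and the reading of "projection exceeds a desired cardinality" as "too many distinct projected values per group" are all hypothetical.

```python
from collections import defaultdict
from typing import Hashable, Sequence, Set, Tuple

Row = Tuple[Hashable, ...]


class BoundedRelation:
    """Relation that prunes rows once a group's projection is full.

    Hypothetical sketch: for each group (values at `group_cols`), at most
    `limit` distinct projected values (values at `proj_cols`) are admitted.
    """

    def __init__(self, group_cols: Sequence[int],
                 proj_cols: Sequence[int], limit: int) -> None:
        self.group_cols = tuple(group_cols)
        self.proj_cols = tuple(proj_cols)
        self.limit = limit
        self.rows: Set[Row] = set()
        self.projections = defaultdict(set)  # group -> projected values
        self.pruned = 0                      # number of rejected inserts

    def insert(self, row: Row) -> bool:
        """Return True if the row is admitted, False if it is pruned."""
        group = tuple(row[i] for i in self.group_cols)
        proj = tuple(row[i] for i in self.proj_cols)
        seen = self.projections[group]
        if proj not in seen and len(seen) >= self.limit:
            self.pruned += 1  # projection already at the cardinality bound
            return False
        seen.add(proj)
        self.rows.add(row)
        return True


# Hypothetical points-to facts: at most 2 heap objects per variable.
rel = BoundedRelation(group_cols=[0], proj_cols=[1], limit=2)
results = [rel.insert(r) for r in
           [("x", "h1"), ("x", "h2"), ("x", "h3"), ("y", "h1"), ("x", "h1")]]
print(results)     # [True, True, False, True, True]
print(rel.pruned)  # 1
```

Pruned rows are simply never derived, so anything computed downstream from them is lost as well; this is the sense in which such a limit trades completeness for speed.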