🤖 AI Summary
Existing type-based taint analyses suffer from high false-positive rates and are hard to apply to existing code, particularly where third-party libraries and generic code must be annotated. This paper proposes a practical, modular, and incremental type-based taint checking method. It introduces a novel taint-type inference technique that covers generic type arguments, together with more practical handling of third-party libraries, enabling high-precision taint modeling and propagation. By integrating an extended type system, context-sensitive inference, and static type checking, the approach automatically derives high-quality taint annotations suitable for insertion into source code. Evaluated on real-world benchmarks, the resulting tool, TaintTyper, achieves higher recall than a state-of-the-art whole-program taint analyzer, with comparable precision and 2.93× to 22.9× faster checking time. Moreover, the automatically inferred annotations are comparable in quality to manually written ones.
📝 Abstract
Many important security properties can be formulated in terms of flows of tainted data, and improved taint analysis tools to prevent such flows are critically needed. Most existing taint analyses use whole-program static analysis, leading to scalability challenges. Type-based checking is a promising alternative, as it enables modular and incremental checking for fast performance. However, type-based approaches have not been widely adopted in practice, due to challenges with false positives and with annotating existing codebases. In this paper, we present a new approach to type-based checking of taint properties that addresses these challenges, based on two key techniques. First, we present a new type-based tainting checker with significantly reduced false positives, via more practical handling of third-party libraries and other language constructs. Second, we present a novel technique to automatically infer tainting type qualifiers for existing code. Our technique supports inference of generic type argument annotations, which is crucial for tainting properties. We implemented our techniques in a tool, TaintTyper, and evaluated it on real-world benchmarks. TaintTyper exceeds the recall of a state-of-the-art whole-program taint analyzer, with comparable precision and 2.93× to 22.9× faster checking time. Further, TaintTyper infers annotations comparable to those written by hand, suitable for insertion into source code. TaintTyper is a promising new approach to efficient and practical taint checking.
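To make the idea of tainting type qualifiers concrete, the following is a minimal Java sketch of what qualifier annotations on ordinary types and on generic type arguments could look like. It is an illustration under stated assumptions, not TaintTyper's actual API: the abstract does not name the target language or the annotation spelling, so `@Tainted`, `@Untainted`, and every method name below are hypothetical and declared locally. Plain `javac` compiles these annotations but does not enforce them; a type-based checker of the kind the abstract describes would be the component that rejects a flow from a tainted value into an untainted parameter.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TaintQualifierSketch {
    // Hypothetical qualifiers, declared locally for illustration only.
    // TYPE_USE lets them appear on any type, including generic type arguments.
    @Target(ElementType.TYPE_USE) @interface Tainted {}
    @Target(ElementType.TYPE_USE) @interface Untainted {}

    // Source: anything derived from user input is treated as tainted.
    static @Tainted String readRequestParam(String raw) {
        return raw;
    }

    // Sanitizer: accepts tainted data and returns data considered safe.
    static @Untainted String escapeQuotes(@Tainted String s) {
        return s.replace("'", "''");
    }

    // Sink: the parameter type states that only untainted data is allowed.
    static String buildQuery(@Untainted String name) {
        return "SELECT * FROM users WHERE name = '" + name + "'";
    }

    // Qualifiers on generic type arguments: the list itself carries no
    // qualifier here, but its elements do.
    static List<@Untainted String> sanitizeAll(List<@Tainted String> inputs) {
        List<@Untainted String> out = new ArrayList<>();
        for (String s : inputs) {
            out.add(escapeQuotes(s));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = readRequestParam("O'Brien");

        // Intended-safe flow: source -> sanitizer -> sink.
        System.out.println(buildQuery(escapeQuotes(input)));

        // A qualifier-aware checker would be expected to reject the call
        // below, because a tainted value reaches an untainted parameter.
        // Plain javac accepts it, so it is left commented out.
        // buildQuery(input);

        System.out.println(sanitizeAll(Arrays.asList(input, "alice")));
    }
}
```

Because each method signature states what it accepts and returns, a checker can verify one method at a time against the signatures of its callees, which is the modularity the abstract contrasts with whole-program analysis. The `List<@Tainted String>` parameter shows why generic type arguments matter: without a qualifier on the element type, the taint status of collection contents would be lost at the method boundary.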