Scalable Thread-Safety Analysis of Java Classes with CodeQL

📅 2025-09-02
🤖 AI Summary
Automatically verifying the thread safety of Java classes remains challenging because of the complexity of concurrent semantics and the lack of a rigorous, checkable definition grounded in the Java Memory Model (JMM). Method: This paper defines thread safety for Java classes in terms of the JMM's data-race-freedom principle and devises a set of class properties that are proven to ensure thread safety. These properties are encoded as queries in the static analysis tool CodeQL. Results: Evaluated on the top 1000 GitHub repositories (3,632,865 Java classes, 1,992 of them annotated as @ThreadSafe), the queries detected thousands of thread-safety errors and ran in under two minutes on repositories of up to 200k lines of code. A selection of the detected errors was submitted as pull requests, to which developers reacted positively. The queries have been submitted to the main CodeQL repository and are in the process of becoming available as part of GitHub Actions.

📝 Abstract
In object-oriented languages, software developers rely on thread-safe classes to implement concurrent applications. However, determining whether a class is thread-safe is a challenging task. This paper presents a highly scalable method to analyze thread-safety in Java classes. We provide a definition of thread-safety for Java classes founded on the correctness principle of the Java memory model, data race freedom. We devise a set of properties for Java classes that are proven to ensure thread-safety. We encode these properties in the static analysis tool CodeQL to automatically analyze Java source code. We perform an evaluation on the top 1000 GitHub repositories. The evaluation comprises 3,632,865 Java classes, of which 1,992 classes from 71 repositories are annotated as @ThreadSafe. These repositories include highly popular software such as Apache Flink (24.6k stars), Facebook Fresco (17.1k stars), PrestoDB (16.2k stars), and gRPC (11.6k stars). Our queries detected thousands of thread-safety errors. The running time of our queries is below 2 minutes for repositories of up to 200k lines of code, 20k methods, 6000 fields, and 1200 classes. We have submitted a selection of the detected concurrency errors as PRs, and developers reacted positively to these PRs. We have submitted our CodeQL queries to the main CodeQL repository, and they are currently in the process of becoming available as part of GitHub Actions. The results demonstrate the applicability and scalability of our method for analyzing thread-safety in real-world code bases.
Problem

Research questions and friction points this paper is trying to address.

Analyzing thread-safety in Java classes automatically
Detecting concurrency errors in large-scale codebases
Ensuring data race freedom under the Java memory model
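
To make the notion of a data race concrete, here is a minimal Java sketch of a class that claims thread-safety but contains a racy field access, next to a corrected version. It is an illustration only and does not reproduce the paper's properties or queries. The class names `RacyCounter`, `SafeCounter`, and `ThreadSafetyDemo` are hypothetical, and the local `@ThreadSafe` annotation is a stand-in declared here so the example compiles on its own.

```java
// Entry point is declared first so that `java ThreadSafetyDemo.java` launches it.
class ThreadSafetyDemo {
    // Starts `threads` workers that each call increment() `perThread` times,
    // joins them all, and returns the count observed afterwards.
    static int runConcurrently(SafeCounter counter, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(new SafeCounter(), 4, 10_000));
    }
}

// Assumption: a local stand-in for a @ThreadSafe marker annotation.
@interface ThreadSafe {}

// Claims thread-safety, but get() reads `count` without holding the lock.
// That read conflicts with the write in increment() and is not ordered by
// happens-before, so it is a data race under the Java memory model.
@ThreadSafe
class RacyCounter {
    private int count; // neither volatile nor consistently guarded

    public synchronized void increment() { count++; }

    public int get() { return count; } // unsynchronized read
}

// Every access to `count` happens while holding the same monitor, so all
// conflicting accesses are ordered and the class is free of data races.
@ThreadSafe
class SafeCounter {
    private int count;

    public synchronized void increment() { count++; }

    public synchronized int get() { return count; }
}
```

Only `SafeCounter` is exercised in `main`, because the outcome of running `RacyCounter` concurrently is not deterministic. The two classes differ by a single `synchronized` keyword, which is the kind of small inconsistency that is easy to miss in review and that motivates automated checking.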
Innovation

Methods, ideas, or system contributions that make the work stand out.

Static analysis with CodeQL for Java
Thread-safety properties based on memory model
Scalable evaluation on large GitHub repositories
Authors

Bjørnar Haugstad Jåtten
IT University of Copenhagen
Simon Boye Jørgensen
IT University of Copenhagen
Rasmus Petersen
CodeQL/GitHub
Raúl Pardo
IT University of Copenhagen
Formal Methods, Privacy, Probabilistic Programming