Inferring Attributed Grammars from Parser Implementations

📅 2025-07-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Systems that process structured input often lack complete and up-to-date specifications of their input syntax and semantics; grammar mining has focused on recovering syntactic structure, while the semantics of input processing remains largely unexplored. Method: The paper proposes an approach that infers attributed grammars from recursive descent parser implementations. Given an input grammar, it instruments the parser and traces its execution to capture runtime behavior, then uses control-flow analysis and grammar-driven semantic mapping to associate parsing operations with productions and extract semantic actions. Contribution/Results: The work extends grammar mining to the semantic level, producing attributed grammars that model the program's input-processing logic. An evaluation on an initial set of programs indicates that the inferred grammars can accurately reproduce the original program behavior, which may support applications in reverse engineering, specification documentation, and security analysis.

📝 Abstract
Software systems that process structured inputs often lack complete and up-to-date specifications of the input syntax and the semantics of input processing. While grammar mining techniques have focused on recovering syntactic structures, the semantics of input processing remains largely unexplored. In this work, we introduce a novel approach for inferring attributed grammars from parser implementations. Given an input grammar, our technique dynamically analyzes the implementation of recursive descent parsers to reconstruct the semantic aspects of input handling, resulting in specifications in the form of attributed grammars. By observing program executions and mapping the program's runtime behavior to the grammar, we systematically extract and embed semantic actions into the grammar rules. This enables comprehensive specification recovery. We demonstrate the feasibility of our approach using an initial set of programs, showing that it can accurately reproduce program behavior through the generated attributed grammars.
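To make the term "attributed grammar" concrete, here is a minimal sketch of the general idea: each production carries a semantic action that computes an attribute of its left-hand side from the attributes of its children. The grammar, the rule names (`num`, `expr_add`, `term_mul`), and the `Node` type are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """Parse-tree node labeled with the (hypothetical) production that built it."""
    rule: str
    children: List[Union["Node", str]]  # sub-nodes or raw token text


# Semantic actions: production name -> function over the children's attribute values.
# These rules are illustrative assumptions, not output of the paper's technique.
ACTIONS = {
    "num": lambda vals: int(vals[0]),            # num  -> DIGITS
    "expr_add": lambda vals: vals[0] + vals[2],  # expr -> term '+' expr
    "term_mul": lambda vals: vals[0] * vals[2],  # term -> num '*' term
}


def evaluate(node):
    """Compute the synthesized attribute of a node bottom-up."""
    if isinstance(node, str):
        return node  # tokens carry their own text as their attribute
    vals = [evaluate(child) for child in node.children]
    return ACTIONS[node.rule](vals)


# Parse tree for "2+3*4" under the toy grammar above.
tree = Node("expr_add", [
    Node("num", ["2"]),
    "+",
    Node("term_mul", [Node("num", ["3"]), "*", Node("num", ["4"])]),
])
result = evaluate(tree)
```

In this sketch the actions are written by hand; the approach described in the abstract aims to recover such actions from an existing parser implementation instead.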
Problem

Research questions and friction points this paper is trying to address.

Recovering semantic aspects of input handling from parsers
Inferring attributed grammars from parser implementations
Mapping runtime behavior to grammar for specification recovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic analysis of recursive descent parsers
Mapping runtime behavior to grammar rules
Embedding semantic actions into grammar
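
The three ideas above can be illustrated with a small sketch of dynamic tracing on a recursive descent parser. It assumes, for illustration only, that each parser method corresponds to one nonterminal of the input grammar; the `traced` decorator, the `Parser` class, and the trace format are hypothetical and are not the paper's instrumentation.

```python
import functools


def traced(fn):
    """Record which nonterminal ran, the input it consumed, and the value it returned."""
    @functools.wraps(fn)
    def wrapper(self):
        start = self.pos
        value = fn(self)
        self.trace.append((fn.__name__, self.text[start:self.pos], value))
        return value
    return wrapper


class Parser:
    """Toy recursive descent parser; method names are assumed to match nonterminals."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.trace = []  # entries are appended when a nonterminal finishes

    @traced
    def expr(self):  # expr -> term ('+' term)*
        value = self.term()
        while self.pos < len(self.text) and self.text[self.pos] == "+":
            self.pos += 1
            value += self.term()
        return value

    @traced
    def term(self):  # term -> DIGIT+
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos].isdigit():
            self.pos += 1
        if start == self.pos:
            raise SyntaxError(f"digit expected at position {start}")
        return int(self.text[start:self.pos])


def parse_with_trace(text):
    """Run the parser and return its result together with the recorded trace."""
    parser = Parser(text)
    value = parser.expr()
    return value, parser.trace


value, trace = parse_with_trace("1+22")
```

Each trace entry pairs a grammar rule with the input fragment it matched and the value the parser computed for it. Observations of this kind are the raw material from which a semantic action such as "the value of `expr` is the sum of its `term` values" could be proposed; how the paper derives actions from runtime behavior is not shown here.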