Lost in Translation? Converting RegExes for Log Parsing into Dynatrace Pattern Language

📅 2025-06-24
🤖 AI Summary
Migrating regular expressions (RegExes) to commercial log analytics platforms such as Dynatrace currently requires manual translation, which is costly and error-prone. This paper introduces Reptile, an approach for automatically converting RegExes into Dynatrace Pattern Language (DPL) patterns. Reptile combines a rule-based converter with a best-effort fallback for RegExes that cannot be fully converted, and it uses GPT-4 to optimize the resulting DPL patterns. On a dataset of 946 RegExes collected from a large company, Reptile safely converted 73.7%. In an evaluation of the pattern optimization on 23 real-world RegExes, it achieved an F1-score and a Matthews Correlation Coefficient (MCC) above 0.91.

📝 Abstract
Log files provide valuable information for detecting and diagnosing problems in enterprise software applications and data centers. Several log analytics tools and platforms were developed to help filter and extract information from logs, typically using regular expressions (RegExes). Recent commercial log analytics platforms provide domain-specific languages specifically designed for log parsing, such as Grok or the Dynatrace Pattern Language (DPL). However, users who want to migrate to these platforms must manually convert their RegExes into the new pattern language, which is costly and error-prone. In this work, we present Reptile, which combines a rule-based approach for converting RegExes into DPL patterns with a best-effort approach for cases where a full conversion is impossible. Furthermore, it integrates GPT-4 to optimize the obtained DPL patterns. The evaluation with 946 RegExes collected from a large company shows that Reptile safely converted 73.7% of them. The evaluation of Reptile's pattern optimization with 23 real-world RegExes showed an F1-score and MCC above 0.91. These results are promising and have ample practical implications for companies that migrate to a modern log analytics platform, such as Dynatrace.

Problem

Research questions and friction points this paper is trying to address.

Convert RegExes to Dynatrace Pattern Language automatically
Reduce manual effort in log analytics platform migration
Optimize DPL patterns using GPT-4 for better accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Rule-based RegEx to DPL conversion
GPT-4 for DPL pattern optimization
Best-effort approach for partial conversions
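
To make the combination of rule-based conversion and a best-effort fallback more concrete, the following is a minimal sketch and not Reptile's actual implementation. The rule table, the `Conversion` class, and the `convert` function are hypothetical names introduced for illustration. The DPL matcher names on the right-hand side of the table (`INT`, `SPACE`, `WORD`, `LD`) are assumed approximations of the corresponding RegEx fragments; their exact semantics should be checked against the Dynatrace documentation. Fragments with no matching rule are recorded rather than translated, and the result is then flagged as not safe.

```python
from dataclasses import dataclass, field

# Hypothetical rule table (illustration only): regex fragment -> DPL matcher.
RULES = {r"\d+": "INT", r"\s+": "SPACE", r"\w+": "WORD", ".*": "LD"}
# Characters this sketch does not translate when they appear unescaped.
UNSUPPORTED = set("()[]{}|^$+*?.\\'")


@dataclass
class Conversion:
    parts: list = field(default_factory=list)        # DPL fragments, in order
    unconverted: list = field(default_factory=list)  # regex tokens with no rule

    @property
    def safe(self):
        # "Safe" here means every token was covered by a rule or was a literal.
        return not self.unconverted

    @property
    def pattern(self):
        return " ".join(self.parts)


def tokenize(regex):
    """Split into escapes or '.', each with an optional quantifier, else single chars."""
    tokens, i = [], 0
    while i < len(regex):
        width = 2 if regex[i] == "\\" and i + 1 < len(regex) else 1
        tok = regex[i:i + width]
        i += width
        if (width == 2 or tok == ".") and i < len(regex) and regex[i] in "+*?":
            tok += regex[i]
            i += 1
        tokens.append(tok)
    return tokens


def literal_char(tok):
    """Return the literal character a token stands for, or None."""
    if len(tok) == 1 and tok not in UNSUPPORTED:
        return tok
    if len(tok) == 2 and tok[0] == "\\" and not tok[1].isalnum() and tok[1] != "'":
        return tok[1]  # escaped punctuation such as \. or \[
    return None


def convert(regex):
    result, literal = Conversion(), []

    def flush():
        # Emit the pending run of literal characters as one quoted string.
        if literal:
            result.parts.append("'" + "".join(literal) + "'")
            literal.clear()

    for tok in tokenize(regex):
        char = literal_char(tok)
        if tok in RULES:
            flush()
            result.parts.append(RULES[tok])
        elif char is not None:
            literal.append(char)
        else:
            flush()
            result.unconverted.append(tok)  # best-effort: record, do not guess
    flush()
    return result
```

In this sketch, `convert(r"user=\w+ id=\d+")` yields the pattern `'user=' WORD ' id=' INT` with `safe` set to `True`, while `convert(r"(a|b)\d+")` reports `(`, `|`, and `)` as unconverted and is flagged as not safe, which is the kind of case where a manual review or a further repair step would be needed.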