🤖 AI Summary
Problem: Legacy systems written in COBOL, PL/I, or Assembler, common in banking and telecommunications, often have outdated documentation and have outlived the involvement of their original developers, which hinders comprehension and modernization.
Method: This paper proposes a multi-language, cross-platform, customizable framework for constructing software knowledge graphs and interactively defining architectural boundaries. It combines analysis of source code and other artifacts (such as data schemas) with a customizable ontology to support expert-guided, incremental analysis. The analyst defines business- and data-driven logical boundaries around dependent entities, and the tool visualizes the dependencies that cross those boundaries.
Contribution/Results: The framework offers a knowledge-graph-driven approach to incremental modernization planning and impact analysis. An initial study on two client applications is presented to validate the tool's efficacy.
📝 Abstract
Industries such as banking, telecom, and airlines often have large software systems that are several decades old. Many of these systems are written in older programming languages such as COBOL, PL/I, and Assembler. In many cases, the documentation has not been kept up to date, and those who designed and developed these systems are no longer around. Understanding these systems, whether for modernization or for regular maintenance, has been a challenge. A large application may have natural boundaries based on its code dependencies and architecture. There are also other logical boundaries in an enterprise setting, driven by business functions, data domains, etc. Because of these complications, system architects generally plan modernization in parts along these logical boundaries, adopting an incremental approach to modernizing the entire system. In this work, we present a software system analysis tool that allows a subject matter expert (SME) or system architect to analyze a large software system incrementally. We analyze the source code and other artifacts (such as data schemas) to create a knowledge graph using a customizable ontology/schema. Entities and relations in our ontology can be defined for any combination of programming languages and platforms. Using this knowledge graph, the analyst can then define logical boundaries around dependent entities (e.g., Programs, Transactions, Database Tables). Our tool then presents different views showing the dependencies between the newly defined boundary and the other logical groups of the system. This exercise is repeated interactively to (1) identify the entities and groupings of interest for a modernization task and (2) understand how a change in one part of the system may affect other parts. To validate the efficacy of our tool, we provide an initial study on two client applications.
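The workflow described in the abstract (build a graph from a customizable ontology, let the analyst draw boundaries, then list the dependencies that cross them) can be illustrated with a minimal Python sketch. This is not the tool's implementation. The ontology contents, the `KnowledgeGraph` class, the `cross_boundary_dependencies` function, and all entity names (`PAY001`, `LEDGER`, and so on) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical ontology: entity types plus the (source, target) types each relation allows.
ONTOLOGY = {
    "entities": {"Program", "Transaction", "Table"},
    "relations": {
        "CALLS": ("Program", "Program"),
        "READS": ("Program", "Table"),
        "WRITES": ("Program", "Table"),
        "INVOKES": ("Transaction", "Program"),
    },
}


class KnowledgeGraph:
    """A tiny in-memory graph whose nodes and edges are validated against an ontology."""

    def __init__(self, ontology):
        self.ontology = ontology
        self.nodes = {}  # entity name -> entity type
        self.edges = []  # (source name, relation, target name)

    def add_entity(self, name, etype):
        if etype not in self.ontology["entities"]:
            raise ValueError(f"unknown entity type: {etype}")
        self.nodes[name] = etype

    def add_relation(self, src, rel, dst):
        if rel not in self.ontology["relations"]:
            raise ValueError(f"unknown relation: {rel}")
        src_type, dst_type = self.ontology["relations"][rel]
        if self.nodes.get(src) != src_type or self.nodes.get(dst) != dst_type:
            raise ValueError(f"{rel} not allowed between {src} and {dst}")
        self.edges.append((src, rel, dst))


def cross_boundary_dependencies(graph, boundaries):
    """Group edges whose endpoints fall in different analyst-defined boundaries.

    `boundaries` maps a boundary name to a set of entity names. Entities that
    belong to no boundary are reported under "UNASSIGNED".
    """
    owner = {}
    for boundary_name, members in boundaries.items():
        for member in members:
            owner[member] = boundary_name

    view = defaultdict(list)
    for src, rel, dst in graph.edges:
        src_boundary = owner.get(src, "UNASSIGNED")
        dst_boundary = owner.get(dst, "UNASSIGNED")
        if src_boundary != dst_boundary:
            view[(src_boundary, dst_boundary)].append((src, rel, dst))
    return dict(view)


def build_example():
    """Build a small, made-up system with one transaction, two programs, two tables."""
    g = KnowledgeGraph(ONTOLOGY)
    for name, etype in [
        ("PAYTXN", "Transaction"),
        ("PAY001", "Program"),
        ("ACC010", "Program"),
        ("ACCOUNTS", "Table"),
        ("LEDGER", "Table"),
    ]:
        g.add_entity(name, etype)
    g.add_relation("PAYTXN", "INVOKES", "PAY001")
    g.add_relation("PAY001", "CALLS", "ACC010")
    g.add_relation("PAY001", "WRITES", "LEDGER")
    g.add_relation("ACC010", "READS", "ACCOUNTS")
    return g


graph = build_example()
boundaries = {
    "Payments": {"PAYTXN", "PAY001", "LEDGER"},
    "Accounts": {"ACC010", "ACCOUNTS"},
}
view = cross_boundary_dependencies(graph, boundaries)
for (src_boundary, dst_boundary), deps in view.items():
    print(src_boundary, "->", dst_boundary, deps)
```

In this sketch the analyst would adjust `boundaries` and re-run the view, mirroring the interactive loop the abstract describes: moving `LEDGER` out of `Payments`, for example, surfaces the write from `PAY001` as a new cross-boundary dependency.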