Capturing Semantic Flow of ML-based Systems

📅 2025-03-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Machine learning systems, particularly those built on deep neural networks (DNNs) and large language models (LLMs), make internal decisions that are largely opaque to conventional dynamic analysis techniques. Method: This paper introduces "semantic flow", an abstraction that extends control flow to the model's semantic level by combining execution states from several sources, such as program traces, layer-wise activations, and step-wise embeddings, into semantic flow graphs. Contribution/Results: The paper proposes semantic flow as a reusable abstraction onto which existing dynamic analysis techniques, such as coverage analysis and mutation testing, can be adapted for ML software. Two examples, one using a DNN and one using an LLM agent, illustrate how semantic flow graphs expose internal decisions that the traditional control flow does not represent. The paper also sketches the properties of the abstraction as groundwork for dynamic analysis in ML software engineering.

📝 Abstract
ML-based systems are software systems that incorporate machine learning components such as Deep Neural Networks (DNNs) or Large Language Models (LLMs). While such systems enable advanced features such as high-performance computer vision, natural language processing, and code generation, their internal behaviour remains largely opaque to traditional dynamic analysis such as testing: existing analyses typically concern only what is observable from the outside, such as input similarity or class label changes. We propose semantic flow, a concept designed to capture the internal behaviour of ML-based systems and to provide a platform onto which traditional dynamic analysis techniques can be adapted. Semantic flow combines the idea of control flow with internal states taken from executions of ML-based systems, such as the activation values of a specific layer in a DNN, or the embeddings of LLM responses at a specific inference step of an LLM agent. The resulting representation, summarised as semantic flow graphs, can capture internal decisions that are not explicitly represented in the traditional control flow of ML-based systems. We introduce two examples of semantic flow, one using a DNN and one using an LLM agent, and then sketch its properties and how it can be used to adapt existing dynamic analysis techniques to ML-based software systems.
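The idea of combining control flow with internal states can be illustrated with a minimal sketch. It is not the authors' construction: the abstract does not say how internal states are turned into graph nodes, so the grid bucketing used here, the `width` parameter, and the names `abstract_state` and `build_semantic_flow_graph` are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical types: a trace is the ordered list of (step label, internal state)
# pairs recorded during one execution, e.g. ("layer1", activation values).
State = Sequence[float]
Trace = List[Tuple[str, State]]
Node = Tuple[str, Tuple[int, ...]]


def abstract_state(step: str, state: State, width: float = 1.0) -> Node:
    """Map a continuous internal state to a discrete node.

    Assumption: states are discretised by bucketing each coordinate into a
    grid cell of size `width`; clustering would be another option.
    """
    return (step, tuple(int(v // width) for v in state))


def build_semantic_flow_graph(traces: List[Trace], width: float = 1.0) -> Counter:
    """Count transitions between consecutive abstract states across traces."""
    edges: Counter = Counter()
    for trace in traces:
        nodes = [abstract_state(step, s, width) for step, s in trace]
        for src, dst in zip(nodes, nodes[1:]):
            edges[(src, dst)] += 1
    return edges


# Three toy executions of a two-layer model (values are made up).
traces = [
    [("layer1", [0.2, 1.7]), ("layer2", [0.9])],
    [("layer1", [0.4, 1.1]), ("layer2", [2.3])],
    [("layer1", [0.3, 1.5]), ("layer2", [0.1])],
]
graph = build_semantic_flow_graph(traces)
for (src, dst), count in sorted(graph.items()):
    print(src, "->", dst, "x", count)
```

All three toy executions follow the same program path (layer1, then layer2), yet the graph contains two distinct edges, because the layer2 states fall into different buckets. That difference is the kind of internal decision that ordinary control flow would not show.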
Problem

Research questions and friction points this paper is trying to address.

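How to capture the internal behaviour of ML-based systems, which is opaque to traditional dynamic analysis
How to adapt existing dynamic analysis techniques to ML components
How to represent internal decisions that the traditional control flow does not show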
Innovation

Methods, ideas, or system contributions that make the work stand out.

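Semantic flow captures the internal behaviour of ML-based systems.
It combines control flow with internal states taken from executions, such as DNN activations or LLM response embeddings.
The resulting semantic flow graphs give traditional dynamic analysis techniques a structure to work on.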
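One way a traditional technique might carry over is edge coverage computed on a semantic flow graph instead of a control flow graph. The sketch below is an assumption-laden illustration, not the paper's method: the node labels stand for hypothetical abstract states of an LLM agent, and `edges_of` and `edge_coverage` are names invented here.

```python
from typing import Hashable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def edges_of(path: List[Hashable]) -> Set[Edge]:
    """Return the set of transitions along one path of abstract states."""
    return set(zip(path, path[1:]))


def edge_coverage(reference_paths: List[List[Hashable]],
                  test_paths: List[List[Hashable]]) -> float:
    """Fraction of reference semantic-flow edges exercised by the test paths."""
    reference: Set[Edge] = set()
    for path in reference_paths:
        reference |= edges_of(path)
    if not reference:
        return 0.0  # nothing to cover
    exercised: Set[Edge] = set()
    for path in test_paths:
        exercised |= edges_of(path)
    return len(reference & exercised) / len(reference)


# Hypothetical abstract states observed for an LLM agent.
reference_paths = [
    ["plan", "search", "answer"],
    ["plan", "code", "test", "answer"],
]
test_paths = [["plan", "search", "answer"]]
print(edge_coverage(reference_paths, test_paths))
```

Here the reference paths contain five distinct edges and the single test path exercises two of them, so the test suite leaves the "code" and "test" behaviours of the agent unexplored.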