Structure- and Event-Driven Frameworks for State Machine Modeling with Large Language Models

📅 2026-03-31
🤖 AI Summary
This study addresses the inefficiency and error-proneness of manual UML state machine design, as well as the inability of existing automated approaches to handle unstructured natural language requirements. The authors propose a large language model (LLM)-based method built on two State Machine Frameworks (SMFs), one structure-driven and one event-driven, together with a Hybrid Approach that refines a single-prompt draft state machine through an SMF. With single-prompt generation, Claude 3.5 Sonnet achieves F1 scores of 0.90 for states and 0.75 for transitions. The Hybrid Approach raises GPT-4o's performance to a level similar to that of Claude 3.5 Sonnet, but it does not further improve Claude 3.5 Sonnet.
📝 Abstract
UML state machine design is a critical process in software engineering. Traditionally, state machines are manually crafted by experienced engineers based on natural language (NL) requirements, a time-consuming and error-prone procedure. Many automated approaches exist, but they require structured NL requirements. In this paper, we investigate the capabilities of current Large Language Models (LLMs) to fully automate UML state machine generation from non-structured NL requirements via specialized State Machine Frameworks (SMFs). We evaluate two types of state-of-the-art LLMs using single-step and multi-step prompting approaches: a non-reasoning LLM (GPT-4o) and a reasoning-focused LLM (Claude 3.5 Sonnet). We also introduce a novel Hybrid Approach that uses the output of a Single-Prompt Baseline as an initial draft state machine, which is then refined through an SMF. In our study, two distinct SMFs are developed based on human approaches: (i) a Structure-Driven SMF, in which state machine components (states, transitions, guards, actions, etc.) are generated in sequential steps, and (ii) an Event-Driven SMF, in which identified events iteratively guide state machine construction. Our experiments indicate that while LLMs demonstrate a promising ability to generate state machine models with the Single-Prompt Baseline (e.g., F1-scores of 0.90 for states and 0.75 for transitions using Claude 3.5 Sonnet), their performance is not yet sufficient for a fully automated solution (e.g., F1-scores of 0.23 for guards and 0.00 for actions for GPT-4o). Our proposed Hybrid Approach improves the performance of the non-reasoning LLM (GPT-4o) to a level similar to that of the reasoning LLM (Claude 3.5 Sonnet) but does not further improve the reasoning LLM. Our evaluation highlights both the potential and the limitations of current LLMs for automated state machine design, providing a baseline for future research in this domain.
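To make the reported F1-scores concrete, the following is a minimal sketch of how a set-based F1 could be computed for the states and transitions of a generated state machine against a reference model. It assumes exact matching on names, which may differ from the matching procedure used in the paper. The `Transition` type, the `f1_score` helper, and the toy door-controller models are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """Hypothetical transition record: source state, triggering event, target state."""
    source: str
    event: str
    target: str


def f1_score(predicted: set, reference: set) -> float:
    """Set-based F1: harmonic mean of precision and recall over exact matches."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        # Covers empty inputs as well as the case of no overlap at all.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only: a toy door controller, not a model from the paper.
reference_states = {"Closed", "Opening", "Open", "Closing"}
generated_states = {"Closed", "Opening", "Open", "Locked"}

reference_transitions = {
    Transition("Closed", "open_pressed", "Opening"),
    Transition("Opening", "fully_open", "Open"),
    Transition("Open", "close_pressed", "Closing"),
    Transition("Closing", "fully_closed", "Closed"),
}
generated_transitions = {
    Transition("Closed", "open_pressed", "Opening"),
    Transition("Opening", "fully_open", "Open"),
    Transition("Open", "lock_pressed", "Locked"),
}

state_f1 = f1_score(generated_states, reference_states)
transition_f1 = f1_score(generated_transitions, reference_transitions)

print(f"states F1 = {state_f1:.2f}")
print(f"transitions F1 = {transition_f1:.2f}")
```

In this toy case one wrong state lowers the state score only slightly, while the same mistake removes or corrupts every transition that touches that state. Under this exact-match assumption, that is one plausible reason why transition scores trail state scores.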
Problem

Research questions and friction points this paper is trying to address.

UML state machine
Large Language Models
natural language requirements
automated modeling
state machine generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

State Machine Frameworks
Large Language Models
Structure-Driven Modeling
Event-Driven Modeling
Hybrid Approach
Samer Abdulkarim
Electrical and Computer Engineering, McGill University, Montreal, Canada
Evan Boyd
Electrical and Computer Engineering, McGill University, Montreal, Canada
Karl Bridi
Electrical and Computer Engineering, McGill University, Montreal, Canada
Alec Tufenkjian
Electrical and Computer Engineering, McGill University, Montreal, Canada
Boqi Chen
PhD Candidate, McGill University
model-based software engineering, large language model, deep learning, abstract interpretation
Gunter Mussbacher
Associate Professor, McGill University
Requirements Eng., Model-Driven Eng., Software Language Eng., Human Values, Sustainability