Extracting Overlapping Microservices from Monolithic Code via Deep Semantic Embeddings and Graph Neural Network-Based Soft Clustering

📅 2025-08-10
🤖 AI Summary
Existing microservice extraction methods predominantly employ hard clustering, which tends to produce high inter-service coupling and low intra-service cohesion. To address this, we propose Mo2oM, a soft-clustering framework that, for the first time, jointly models deep semantic embeddings and structural dependencies from method call graphs. Using a graph neural network, Mo2oM assigns components to microservices probabilistically, so a component can belong to more than one service. This mirrors the expert practice of deliberately duplicating components across services and yields decompositions with functional overlap. Evaluated on four open-source systems against eight state-of-the-art baselines, Mo2oM improves structural modularity by up to 40.97%, inter-service call percentage by up to 58%, interface number by up to 26.16%, and non-extreme distribution (service size balance) by up to 38.96%. These gains support the scalability, maintainability, and deployment flexibility that motivate microservice migration.

📝 Abstract
Modern software systems are increasingly shifting from monolithic architectures to microservices to enhance scalability, maintainability, and deployment flexibility. Existing microservice extraction methods typically rely on hard clustering, assigning each software component to a single microservice. This approach often increases inter-service coupling and reduces intra-service cohesion. We propose Mo2oM (Monolithic to Overlapping Microservices), a framework that formulates microservice extraction as a soft clustering problem, allowing components to belong probabilistically to multiple microservices. This approach is inspired by expert-driven decompositions, where practitioners intentionally replicate certain software components across services to reduce communication overhead. Mo2oM combines deep semantic embeddings with structural dependencies extracted from method call graphs to capture both functional and architectural relationships. A graph neural network-based soft clustering algorithm then generates the final set of microservices. We evaluate Mo2oM on four open-source monolithic benchmarks and compare it against eight state-of-the-art baselines. Our results demonstrate that Mo2oM achieves improvements of up to 40.97% in structural modularity (balancing cohesion and coupling), 58% in inter-service call percentage (communication overhead), 26.16% in interface number (modularity and decoupling), and 38.96% in non-extreme distribution (service size balance) across all benchmarks.
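To make the hard-versus-soft distinction concrete, the following is a minimal sketch, not the Mo2oM implementation. It assumes each method already has a membership probability per service, and the function names, the 0.3 threshold, the toy method names, and the rule that a call is internal when caller and callee share at least one service are all illustrative assumptions rather than details taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def overlapping_services(
    membership: Dict[str, List[float]], threshold: float = 0.3
) -> Dict[str, Set[int]]:
    """Assign each method to every service whose probability meets the threshold.

    The threshold is a hypothetical knob; if no service qualifies, the method
    falls back to its single most likely service (the hard-clustering choice).
    """
    assignment: Dict[str, Set[int]] = {}
    for method, probs in membership.items():
        chosen = {k for k, p in enumerate(probs) if p >= threshold}
        if not chosen:
            chosen = {max(range(len(probs)), key=probs.__getitem__)}
        assignment[method] = chosen
    return assignment


def inter_service_call_pct(
    calls: List[Tuple[str, str]], assignment: Dict[str, Set[int]]
) -> float:
    """Percentage of calls whose caller and callee share no service (assumed definition)."""
    if not calls:
        return 0.0
    crossing = sum(1 for a, b in calls if not (assignment[a] & assignment[b]))
    return 100.0 * crossing / len(calls)


# Toy data (hypothetical): a shared utility called from two services.
membership = {
    "Order.place": [0.9, 0.1],
    "Billing.charge": [0.1, 0.9],
    "Util.formatMoney": [0.5, 0.5],
}
calls = [
    ("Order.place", "Util.formatMoney"),
    ("Billing.charge", "Util.formatMoney"),
    ("Order.place", "Billing.charge"),
]

soft = overlapping_services(membership, threshold=0.3)
# A threshold above 1 can never be met, which forces the argmax fallback.
hard = overlapping_services(membership, threshold=1.01)

soft_pct = inter_service_call_pct(calls, soft)
hard_pct = inter_service_call_pct(calls, hard)
```

In this toy case the utility method is replicated into both services under the soft assignment, so only the direct `Order.place` to `Billing.charge` call crosses a service boundary, whereas the hard assignment leaves two of the three calls crossing.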
Problem

Research questions and friction points this paper is trying to address.

Extracting overlapping microservices from monolithic code effectively
Reducing inter-service coupling and increasing intra-service cohesion
Combining semantic and structural data for optimal microservice decomposition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep semantic embeddings capture functional relationships
Graph neural network enables soft clustering
Probabilistic component assignment reduces communication overhead
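The interplay of the three ideas above can be sketched in a few lines. This is an illustration under strong assumptions, not the architecture from the paper: a single untrained mean-aggregation step stands in for the graph neural network, two-dimensional toy vectors stand in for deep semantic embeddings, and fixed centroids stand in for learned service representations. All names (`propagate`, `soft_memberships`, the method names) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def propagate(
    features: Dict[str, Vector], calls: List[Tuple[str, str]]
) -> Dict[str, Vector]:
    """One mean-aggregation step over the call graph (calls treated as undirected)."""
    # Each node starts with a self-loop so it keeps part of its own signal.
    neighbours = {name: {name} for name in features}
    for caller, callee in calls:
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)
    smoothed: Dict[str, Vector] = {}
    for name, group in neighbours.items():
        dim = len(features[name])
        smoothed[name] = [
            sum(features[n][i] for n in group) / len(group) for i in range(dim)
        ]
    return smoothed


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_memberships(
    vectors: Dict[str, Vector], centroids: List[Vector]
) -> Dict[str, List[float]]:
    """Membership probabilities from dot-product similarity to each service centroid."""
    return {
        name: softmax([sum(a * b for a, b in zip(vec, c)) for c in centroids])
        for name, vec in vectors.items()
    }


# Toy inputs (hypothetical): semantic vectors plus a small method call graph.
features = {
    "Order.place": [1.0, 0.0],
    "Billing.charge": [0.0, 1.0],
    "Util.formatMoney": [0.5, 0.5],
}
calls = [
    ("Order.place", "Util.formatMoney"),
    ("Billing.charge", "Util.formatMoney"),
]
centroids = [[1.0, 0.0], [0.0, 1.0]]

memberships = soft_memberships(propagate(features, calls), centroids)
```

Here the shared utility ends up with equal membership in both services, which is the kind of component a soft clustering could replicate, while the two domain methods lean toward their own service.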
Morteza Ziabakhsh
Bachelor's Graduate, University of Guilan
deep learning, machine learning, software engineering
Kiyan Rezaee
Student at Guilan University
Information retrieval, Natural Language Processing
Sadegh Eskandari
Department of Computer Science, University of Guilan
Seyed Amir Hossein Tabatabaei
Department of Computer Science, University of Guilan
Mohammad M. Ghassemi
Department of Computer Science and Engineering, Michigan State University