AI Summary
Existing microservice extraction methods predominantly employ hard clustering, which assigns each component to exactly one service and tends to produce high inter-service coupling and low intra-service cohesion. To address this, we propose Mo2oM, a soft-clustering framework that jointly models deep semantic embeddings and structural dependencies from method call graphs for the first time. Leveraging graph neural networks, Mo2oM enables probabilistic multi-membership of methods across components, emulating expert-level component duplication practices and supporting microservice decomposition with functional overlap. Evaluated on four open-source systems against eight state-of-the-art baselines, Mo2oM improves structural modularity by up to 40.97%, inter-service call percentage by up to 58%, and interface number by up to 26.16%, and it yields more balanced service size distributions (up to 38.96% improvement in non-extreme distribution). These gains support the scalability, maintainability, and deployment flexibility that motivate the move to microservices.
Abstract
Modern software systems are increasingly shifting from monolithic architectures to microservices to enhance scalability, maintainability, and deployment flexibility. Existing microservice extraction methods typically rely on hard clustering, assigning each software component to a single microservice. This approach often increases inter-service coupling and reduces intra-service cohesion. We propose Mo2oM (Monolithic to Overlapping Microservices), a framework that formulates microservice extraction as a soft clustering problem, allowing components to belong probabilistically to multiple microservices. This approach is inspired by expert-driven decompositions, where practitioners intentionally replicate certain software components across services to reduce communication overhead. Mo2oM combines deep semantic embeddings with structural dependencies extracted from method call graphs to capture both functional and architectural relationships. A graph neural network-based soft clustering algorithm then generates the final set of microservices. We evaluate Mo2oM on four open-source monolithic benchmarks and compare it against eight state-of-the-art baselines. Our results demonstrate that Mo2oM achieves improvements of up to 40.97% in structural modularity (balancing cohesion and coupling), 58% in inter-service call percentage (communication overhead), 26.16% in interface number (modularity and decoupling), and 38.96% in non-extreme distribution (service size balance) across all benchmarks.
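The difference between hard and soft clustering described above can be illustrated with a small sketch. This is not Mo2oM's graph neural network-based algorithm, which the abstract does not specify in detail. It only shows, under assumed inputs, how membership probabilities can place one method in several services. The function names, the affinity scores, the method names, and the 0.3 threshold are all hypothetical.

```python
from math import exp


def softmax(scores):
    """Convert raw affinity scores into membership probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def assign_overlapping(affinities, threshold=0.3):
    """Map each method to every service whose membership probability meets the threshold.

    A hard clustering would keep only the single most likely service;
    here a method may be replicated into several services.
    """
    assignment = {}
    for method, scores in affinities.items():
        probs = softmax(scores)
        chosen = [i for i, p in enumerate(probs) if p >= threshold]
        if not chosen:
            # Fall back to the most likely service so no method is left unassigned.
            chosen = [max(range(len(probs)), key=probs.__getitem__)]
        assignment[method] = chosen
    return assignment


# Hypothetical affinity scores of three methods toward two candidate services.
affinities = {
    "OrderService.create": [4.0, 0.0],
    "PaymentService.charge": [0.0, 4.0],
    "StringUtils.format": [1.0, 1.2],
}
services = assign_overlapping(affinities)
```

In this toy input, the two domain-specific methods each land in a single service, while the shared utility method has comparable probability for both and is therefore placed in both, mirroring the expert practice of replicating components to avoid cross-service calls.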