🤖 AI Summary
This work addresses the connected maximum common subgraph (MCS) problem on multiple labeled graphs, particularly molecular graphs, under vertex and edge label constraints, with the aim of identifying conserved structural motifs in bioinformatics and cheminformatics. We propose an exact enumeration framework: (1) construct a labeled modular product graph that encodes label compatibility; (2) adapt the Bron–Kerbosch algorithm for efficient maximal clique enumeration; and (3) introduce a graph-kernel-based ordering of the input graphs together with label-aware pruning rules to reduce the search space. Experiments on multiple molecular datasets demonstrate that our method achieves 100% precision while outperforming state-of-the-art tools by several-fold in runtime. It scales to tens of moderately sized graphs, enabling, for the first time, efficient and exact computation of connected MCS over large-scale multi-graph collections.
📝 Abstract
We present an exact algorithm for computing the connected Maximum Common Subgraph (MCS) across multiple graphs, where vertices and edges may additionally carry labels, such as the atom types and bond types classically used in molecular graphs. Our approach leverages modular product graphs and a modified Bron–Kerbosch algorithm to enumerate maximal cliques, ensuring that all intermediate solutions are retained. A pruning heuristic reduces the size of the modular product, improving computational feasibility. Additionally, we introduce a graph ordering strategy based on graph-kernel similarity measures to optimize the search process. Our method is particularly relevant for bioinformatics and cheminformatics, where identifying conserved structural motifs in molecular graphs is crucial. Empirical results on molecular datasets demonstrate that our approach is exact, scalable and fast.
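To make the core idea concrete, the following is a minimal sketch for the two-graph case only. It is not the authors' implementation. It assumes induced common subgraphs, uses the textbook Bron–Kerbosch algorithm with pivoting rather than the modified variant described above, and omits the pruning heuristic and the kernel-based graph ordering. All function names and the graph encoding (a dict of vertex labels plus a dict of edge labels keyed by `frozenset`) are hypothetical choices made for illustration. Connectivity is recovered afterwards by keeping the largest connected part of each maximal clique.

```python
from itertools import combinations


def modular_product(g1, g2):
    """Labeled modular product of two graphs (illustrative encoding).

    Each graph is a pair (vertex_labels, edge_labels), where edge_labels
    maps frozenset({u, v}) to a label that is never None.
    """
    vl1, el1 = g1
    vl2, el2 = g2
    # Product vertices: pairs of vertices with identical labels.
    nodes = [(u, v) for u in vl1 for v in vl2 if vl1[u] == vl2[v]]
    adj = {n: set() for n in nodes}
    for (u1, v1), (u2, v2) in combinations(nodes, 2):
        if u1 == u2 or v1 == v2:
            continue  # a vertex may be matched only once
        e1 = el1.get(frozenset((u1, u2)))
        e2 = el2.get(frozenset((v1, v2)))
        # Compatible if both pairs are non-edges (None == None)
        # or both are edges carrying the same label.
        if e1 == e2:
            adj[(u1, v1)].add((u2, v2))
            adj[(u2, v2)].add((u1, v1))
    return adj


def bron_kerbosch(adj, r=frozenset(), p=None, x=None):
    """Yield all maximal cliques (Bron-Kerbosch with pivoting)."""
    if p is None:
        p, x = set(adj), set()
    if not p and not x:
        yield r
        return
    pivot = max(p | x, key=lambda n: len(adj[n] & p))
    for n in list(p - adj[pivot]):
        yield from bron_kerbosch(adj, r | {n}, p & adj[n], x & adj[n])
        p.remove(n)
        x.add(n)


def largest_connected_part(clique, el1):
    """Largest subset of a clique whose first-graph vertices are connected."""
    remaining, best = set(clique), set()
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            u1, _ = stack.pop()
            linked = {n for n in remaining if frozenset((u1, n[0])) in el1}
            remaining -= linked
            comp |= linked
            stack.extend(linked)
        if len(comp) > len(best):
            best = comp
    return best


def connected_mcs(g1, g2):
    """Vertex mapping of one maximum connected common induced subgraph."""
    best = set()
    for clique in bron_kerbosch(modular_product(g1, g2)):
        part = largest_connected_part(clique, g1[1])
        if len(part) > len(best):
            best = part
    return best


# Toy molecules: C-C=O and C=O-N (labels are illustrative, not real chemistry).
g1 = ({1: "C", 2: "C", 3: "O"},
      {frozenset((1, 2)): "single", frozenset((2, 3)): "double"})
g2 = ({"a": "C", "b": "O", "c": "N"},
      {frozenset(("a", "b")): "double", frozenset(("b", "c")): "single"})
print(connected_mcs(g1, g2))
```

In this toy input the only shared connected fragment with more than one atom is the double-bonded C and O pair, so the result maps vertex 2 to `a` and vertex 3 to `b`. Extending the sketch to more than two graphs, for example by folding the result of one pair into the next graph, would require keeping all intermediate solutions as the abstract describes; that part is not shown here.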