🤖 AI Summary
Multi-source spreadsheet integration suffers from labor-intensive manual alignment and lacks formal semantic guarantees. Method: This paper proposes a multi-table integration approach that does not require consensus among the spreadsheets' authors: each sheet's formulas are modeled as an algebraic (equational) theory and its values as a model of that theory, the overlap between sheets is expressed as theory and model morphisms, and a unified theory and model are computed via colimit, lifting, and Kan-extension constructions from category theory. Automated theorem proving is used to verify that the overlap mappings preserve semantics and that the integrated sheet is conservative/consistent. Contribution/Results: Because theory and model morphisms are semantics-preserving, the method removes the need for the modelers to agree with one another. In a case study at a major energy company, it was applied to two casing pressure test (MASP) calculation spreadsheets built by two non-interacting engineers, yielding an integrated theory and model that can be expressed as a single spreadsheet. The paper also discusses how the methodology might scale engineering efforts across the enterprise.
📝 Abstract
We describe a method for merging multiple spreadsheets into one sheet, and/or exchanging data among the sheets, by expressing each sheet's formulae as an algebraic (equational) theory and each sheet's values as a model of its theory, expressing the overlap between the sheets as theory and model morphisms, and then performing colimit, lifting, and Kan-extension constructions from category theory to compute a canonically universal integrated theory and model, which can then be expressed as a spreadsheet. Our motivation is to find methods of merging engineering models that do not require consensus (agreement) among the authors of the models being merged, a condition fulfilled by our method because theory and model morphisms are semantics-preserving. We describe a case study of this methodology on a real-world oil and gas calculation at a major energy company, describing the theories and models that arise when integrating two different casing pressure test (MASP) calculation spreadsheets constructed by two non-interacting engineers. We also describe the automated theorem proving burden associated with verifying both the semantics preservation of the overlap mappings and the conservativity/consistency of the resulting integrated sheet. We conclude with thoughts on how to apply the methodology to scale engineering efforts across the enterprise.
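
To make the colimit step more concrete, here is a minimal sketch of the simplest instance of the idea: a pushout of two finite sets of column names over a declared overlap. It is not the paper's implementation. It covers only the column names (the signature), not the formulas, the values, or the theorem proving, and the function name `merge_columns` and all column names in the usage are hypothetical.

```python
from typing import Dict, List, Tuple

Tagged = Tuple[str, str]  # (sheet tag, column name), e.g. ("A", "depth")


def merge_columns(
    cols_a: List[str],
    cols_b: List[str],
    overlap: List[Tuple[str, str]],
) -> Tuple[List[List[Tagged]], Dict[Tagged, int]]:
    """Pushout of two column sets over an overlap (hypothetical helper).

    `overlap` lists pairs (column in sheet A, column in sheet B) that the
    integrator declares to denote the same quantity. The result is the
    disjoint union of both column sets with each overlapping pair glued
    into a single merged column.
    """
    # Disjoint union: tag every column with the sheet it came from.
    parent: Dict[Tagged, Tagged] = {}
    for c in cols_a:
        parent[("A", c)] = ("A", c)
    for c in cols_b:
        parent[("B", c)] = ("B", c)

    def find(x: Tagged) -> Tagged:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Glue the columns identified by the overlap.
    for a, b in overlap:
        ra, rb = find(("A", a)), find(("B", b))
        if ra != rb:
            parent[rb] = ra

    # Collect equivalence classes; each class is one merged column.
    index: Dict[Tagged, int] = {}
    classes: List[List[Tagged]] = []
    into: Dict[Tagged, int] = {}
    for x in parent:
        root = find(x)
        if root not in index:
            index[root] = len(classes)
            classes.append([])
        classes[index[root]].append(x)
        into[x] = index[root]
    return classes, into


# Hypothetical columns for two independently written sheets.
merged, into = merge_columns(
    ["depth", "mud_weight", "masp"],
    ["tvd", "mw", "test_pressure"],
    [("depth", "tvd"), ("mud_weight", "mw")],
)
```

In this sketch neither author renames anything: each sheet keeps its own column names and the `into` map records where each one lands in the merged sheet, which loosely mirrors the consensus-free motivation stated in the abstract.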