Consensus-Free Spreadsheet Integration

📅 2022-09-28
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Integrating spreadsheets from multiple sources relies on labor-intensive manual alignment and lacks formal semantic guarantees. Method: This paper proposes a consensus-free approach to multi-sheet integration: each sheet's formulae are modeled as an algebraic (equational) theory and its values as a model of that theory, the overlap between sheets is expressed as theory and model morphisms, and a canonically universal integrated theory and model are computed via colimit, lifting, and Kan-extension constructions from category theory. Automated theorem proving is used to verify that the overlap mappings preserve semantics and that the integrated sheet is conservative and consistent. Contribution/Results: Because the morphisms are semantics-preserving, the method does not require the authors of the sheets to reach agreement. In a case study at a major energy company, it is applied to two casing pressure test (MASP) calculation spreadsheets built by two non-interacting engineers, yielding an integrated theory and model that can be expressed as a single spreadsheet.
📝 Abstract
We describe a method for merging multiple spreadsheets into one sheet, and/or exchanging data among the sheets, by expressing each sheet's formulae as an algebraic (equational) theory and each sheet's values as a model of its theory, expressing the overlap between the sheets as theory and model morphisms, and then performing colimit, lifting, and Kan-extension constructions from category theory to compute a canonically universal integrated theory and model, which can then be expressed as a spreadsheet. Our motivation is to find methods of merging engineering models that do not require consensus (agreement) among the authors of the models being merged, a condition fulfilled by our method because theory and model morphisms are semantics-preserving. We describe a case study of this methodology on a real-world oil and gas calculation at a major energy company, describing the theories and models that arise when integrating two different casing pressure test (MASP) calculation spreadsheets constructed by two non-interacting engineers. We also describe the automated theorem proving burden associated with both verifying the semantics preservation of the overlap mappings as well as verifying the conservativity/consistency of the resulting integrated sheet. We conclude with thoughts on how to apply the methodology to scale engineering efforts across the enterprise.
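As a rough illustration of the colimit step described in the abstract, the sketch below glues the column names of two sheets along a declared overlap and renames formula arguments accordingly. It is a minimal, signature-level sketch only and not the authors' implementation: the `glue` function, the tuple encoding of formulae, and every column name are hypothetical, and the lifting and Kan-extension steps that move data are not shown.

```python
from typing import Dict, Optional, Set, Tuple

# A formula is (operator, argument columns); None marks an input column.
# This encoding is an assumption made for the example.
Formula = Optional[Tuple[str, Tuple[str, ...]]]
Sheet = Dict[str, Formula]


def glue(sheet_a: Sheet, sheet_b: Sheet,
         overlap: Dict[str, Tuple[str, str]]) -> Dict[str, Set[tuple]]:
    """Merge two sheets, identifying columns named in `overlap`.

    `overlap` maps a shared name to (name in sheet A, name in sheet B).
    Columns outside the overlap are kept apart with an "A." or "B." prefix.
    Returns each merged column with the set of definitions it received.
    """
    rename = {
        "A": {a: shared for shared, (a, _) in overlap.items()},
        "B": {b: shared for shared, (_, b) in overlap.items()},
    }
    merged: Dict[str, Set[tuple]] = {}
    for tag, sheet in (("A", sheet_a), ("B", sheet_b)):
        def new_name(col: str, tag: str = tag) -> str:
            return rename[tag].get(col, f"{tag}.{col}")

        for col, formula in sheet.items():
            defs = merged.setdefault(new_name(col), set())
            if formula is not None:
                op, args = formula
                defs.add((op, tuple(new_name(x) for x in args)))
    return merged


# Hypothetical sheets written independently, using different column names.
sheet_a: Sheet = {"depth": None, "grad": None, "p": ("mul", ("depth", "grad"))}
sheet_b: Sheet = {
    "tvd": None,
    "gradient": None,
    "pressure": ("mul", ("tvd", "gradient")),
    "limit": None,
    "margin": ("sub", ("limit", "pressure")),
}
overlap = {
    "depth": ("depth", "tvd"),
    "gradient": ("grad", "gradient"),
    "pressure": ("p", "pressure"),
}
merged = glue(sheet_a, sheet_b, overlap)
```

In this toy case the two definitions of `pressure` coincide after renaming, so the merged column carries a single formula. A merged column that ends up with more than one definition would mark an equation between formulae that still has to be proved, which is one way to picture the theorem-proving burden the abstract mentions.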
Problem

Research questions and friction points this paper is trying to address.

Merging spreadsheets without requiring consensus among authors
Using category theory to integrate algebraic theories and models
Automated verification of semantics preservation and consistency
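
To make "values as a model of a theory" concrete, the sketch below checks whether one row of values satisfies every formula of a small theory. This is a much weaker stand-in for the automated theorem proving described in the paper: it tests a single assignment numerically rather than proving an equation for all values. The `is_model` function, the operator table, and the column names are assumptions made for illustration.

```python
import operator
from typing import Dict, Optional, Tuple

# Same hypothetical encoding as a formula: (operator, argument columns).
Formula = Optional[Tuple[str, Tuple[str, ...]]]

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def is_model(theory: Dict[str, Formula], values: Dict[str, float],
             tol: float = 1e-9) -> bool:
    """Return True if every computed column equals its formula's result."""
    for col, formula in theory.items():
        if formula is None:
            continue  # input columns are unconstrained
        op, args = formula
        expected = OPS[op](*(values[a] for a in args))
        if abs(values[col] - expected) > tol:
            return False
    return True


theory = {
    "depth": None,
    "gradient": None,
    "pressure": ("mul", ("depth", "gradient")),
}
good_row = {"depth": 10000.0, "gradient": 0.5, "pressure": 5000.0}
bad_row = {"depth": 10000.0, "gradient": 0.5, "pressure": 4000.0}

good_ok = is_model(theory, good_row)
bad_ok = is_model(theory, bad_row)
```

Here `good_row` satisfies the single equation of the theory and `bad_row` does not. Checking that a mapping between sheets preserves semantics is a stronger requirement, since the translated equations must hold for every model and not just for the rows at hand.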
Innovation

Methods, ideas, or system contributions that make the work stand out.

Algebraic theories express spreadsheet formulae
Category-theoretic constructions (colimits, liftings, Kan extensions) compute a universal merged sheet
Semantics-preserving morphisms enable consensus-free integration
B. Baylor
Chevron
Eric Daimler
Chevron
James Hansen
Chevron
Esteban Montero
Chevron
Ryan Wisnesky
Conexus AI
Computer Science