🤖 AI Summary
Existing approaches to JSON Schema inclusion checking trade efficiency against completeness: rule-based methods are efficient but incomplete, while witness-generation techniques are complete but computationally expensive. This work proposes refutational normalization, a technique that combines the efficiency of rule-based reasoning with the completeness of witness generation. Evaluated on both real-world and synthetic schemas, the method outperforms state-of-the-art tools, remaining complete while substantially reducing checking time. It thereby makes inclusion checking practical for use cases that are too hard for current tools.
📝 Abstract
JSON Schema is the de facto standard for describing the structure of JSON documents. Reasoning about JSON Schema inclusion (whether every instance satisfying a schema S1 also satisfies a schema S2) is a key building block for a variety of tasks, including version and API compatibility checks, schema refactoring tools, and large-scale schema corpus analysis. Existing approaches fall into two families: rule-based algorithms, which are efficient but incomplete, and witness-generation algorithms, which are complete but often extremely slow. This paper introduces a new approach that reconciles the efficiency of rule-based procedures with the completeness of witness generation by enriching the latter with a specialized form of normalization. This refutational normalization paves the way for use cases that are too hard for current tools. Our experiments with real-world and synthetic schemas show that refutational normalization greatly advances the state of the art in JSON Schema inclusion checking.
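To make the inclusion question concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a tiny, hypothetical subset of JSON Schema (`type`, `minimum`, `required`, `properties`) and a hand-written validator `is_valid`; the helper `find_counterexample` and the schemas `S1` and `S2` are likewise illustrative names. A single instance that satisfies S1 but not S2 (a witness) refutes inclusion, whereas finding no witness among finitely many candidates proves nothing. That gap is why a complete procedure needs more than sampling.

```python
from typing import Any, Iterable, Optional


def is_valid(instance: Any, schema: dict) -> bool:
    """Validate against a hypothetical mini-subset of JSON Schema.

    Only "type" (integer/string/object), "minimum", "required" and
    "properties" are handled; every other keyword is ignored.
    """
    t = schema.get("type")
    is_int = isinstance(instance, int) and not isinstance(instance, bool)
    if t == "integer" and not is_int:
        return False
    if t == "string" and not isinstance(instance, str):
        return False
    if t == "object" and not isinstance(instance, dict):
        return False
    # "minimum" constrains numbers only; other values pass it vacuously.
    if "minimum" in schema and (is_int or isinstance(instance, float)):
        if instance < schema["minimum"]:
            return False
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                return False
        for key, sub in schema.get("properties", {}).items():
            if key in instance and not is_valid(instance[key], sub):
                return False
    return True


def find_counterexample(s1: dict, s2: dict, candidates: Iterable[Any]) -> Optional[Any]:
    """Return a candidate valid for s1 but not for s2, or None if none is found.

    A returned witness refutes "s1 is included in s2". None does NOT prove
    inclusion; it only means these particular candidates contain no witness.
    """
    for c in candidates:
        if is_valid(c, s1) and not is_valid(c, s2):
            return c
    return None


# Illustrative schemas: S2 additionally requires a "name" string.
S1 = {"type": "object", "required": ["id"],
      "properties": {"id": {"type": "integer", "minimum": 0}}}
S2 = {"type": "object", "required": ["id", "name"],
      "properties": {"id": {"type": "integer", "minimum": 0},
                     "name": {"type": "string"}}}
CANDIDATES = [{"id": 1}, {"id": 1, "name": "a"}, {"id": -1, "name": "a"}, {}, 5]

print(find_counterexample(S1, S2, CANDIDATES))  # {'id': 1}
print(find_counterexample(S2, S1, CANDIDATES))  # None
```

Here `{"id": 1}` shows that S1 is not included in S2. In the other direction, S2 is in fact included in S1, but the sketch cannot establish that; it only reports that no witness was found among the candidates.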