AI Summary
Chemical experimental data exhibit structural complexity and semantic heterogeneity, hindering AI-driven scientific discovery. To address this, we introduce the first Chemotion Knowledge Graph grounded in the Basic Formal Ontology (BFO), establishing an ontology-aligned semantic representation spanning electronic lab notebooks, molecular structures, and reaction processes. We design a customized semantic pipeline integrating Chemotion API-based metadata extraction in JSON-LD, JSON-LD-to-RDF conversion, and SPARQL CONSTRUCT rule-based mapping to enable automated transformation of raw experimental data into a knowledge graph that follows the FAIR (Findable, Accessible, Interoperable, Reusable) principles. The resulting graph is openly released and hosted by FIZ Karlsruhe, supporting the findability, accessibility, interoperability, and reusability of chemical research data. This work delivers a scalable, ontology-based infrastructure that bridges domain-specific knowledge engineering with AI methodologies, facilitating reproducible, semantics-aware computational chemistry research.
Abstract
Chemistry is a discipline in which technological advances have led to multi-level and often entangled laboratory processes. These complex workflows are combined with information from chemical structures, which are essential to understanding the scientific process. An important tool for many chemists is Chemotion, which consists of an electronic lab notebook and a repository. This paper introduces a semantic pipeline for constructing the BFO-compliant Chemotion Knowledge Graph (Chemotion-KG), providing an integrated, ontology-driven representation of chemical research data. The Chemotion-KG has been developed to adhere to the FAIR (Findable, Accessible, Interoperable, Reusable) principles and to support AI-driven discovery and reasoning in chemistry. Experimental metadata were harvested from the Chemotion API in JSON-LD format, converted into RDF, and subsequently transformed into a Basic Formal Ontology (BFO)-aligned graph through SPARQL CONSTRUCT queries. The source code and datasets are publicly available via GitHub. The Chemotion Knowledge Graph is hosted by FIZ Karlsruhe Information Service Engineering. The outcomes presented in this work were achieved within the Leibniz Science Campus "Digital Transformation of Research" (DiTraRe) and are part of an ongoing interdisciplinary collaboration.
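To make the three pipeline stages (JSON-LD harvest, RDF conversion, CONSTRUCT-style mapping to BFO) more concrete, the following is a minimal, standard-library-only sketch. It is not the authors' implementation: the sample record, the use of schema.org terms, the helper names `expand`, `jsonld_to_triples`, and `construct_material_entities`, and the choice of `BFO_0000040` (material entity) as the target class are all illustrative assumptions. A real pipeline would more plausibly use an RDF library such as rdflib and run actual SPARQL CONSTRUCT queries.

```python
import json

# Hypothetical, heavily simplified JSON-LD record; real Chemotion API output
# is richer and its exact vocabulary is an assumption here.
RECORD = json.loads("""
{
  "@context": {"schema": "https://schema.org/"},
  "@id": "https://example.org/sample/1",
  "@type": "schema:MolecularEntity",
  "schema:name": "benzene",
  "schema:inChIKey": "UHOVQNZJYSORNB-UHFFFAOYSA-N"
}
""")

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# BFO "material entity"; the target class is chosen for illustration only.
BFO_MATERIAL_ENTITY = "http://purl.obolibrary.org/obo/BFO_0000040"


def expand(term, context):
    """Expand a compact IRI such as 'schema:name' using @context prefixes."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


def jsonld_to_triples(doc):
    """Flatten one flat JSON-LD node into (subject, predicate, object) triples."""
    context = doc.get("@context", {})
    subject = doc["@id"]
    triples = set()
    for key, value in doc.items():
        if key in ("@context", "@id"):
            continue
        if key == "@type":
            triples.add((subject, RDF_TYPE, expand(value, context)))
        else:
            triples.add((subject, expand(key, context), value))
    return triples


def construct_material_entities(triples):
    """Pure-Python stand-in for a rule of the form:
    CONSTRUCT { ?s a bfo:MaterialEntity } WHERE { ?s a schema:MolecularEntity }
    """
    source_type = "https://schema.org/MolecularEntity"
    return {
        (s, RDF_TYPE, BFO_MATERIAL_ENTITY)
        for s, p, o in triples
        if p == RDF_TYPE and o == source_type
    }


if __name__ == "__main__":
    source_triples = jsonld_to_triples(RECORD)
    for triple in sorted(construct_material_entities(source_triples)):
        print(triple)
```

The design mirrors the described flow: source triples are kept separate from the constructed, ontology-aligned triples, so each mapping rule can be reviewed and rerun independently of the harvest step.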