🤖 AI Summary
In nanoscale computed tomography (nano-CT) experiments, missing metadata severely impairs data discoverability, interpretability, and reuse, resulting in poor FAIR (Findable, Accessible, Interoperable, Reusable) compliance. To address this, we propose an “ontology-first” paradigm that tightly integrates OWL ontology modeling with the electronic lab notebook Herbie: a domain-consensus metadata schema and semantic ontology are established *prior* to experiment initiation. This enables real-time validation and semantic annotation, so that metadata are FAIR from the moment they are entered during a measurement. The system flexibly captures instrument-specific metadata across different synchrotron-based nano-CT setups and constructs an RDF-based knowledge graph, which can be queried through competency questions that hide the underlying graph structure from the user. Our work delivers a reusable, community-aligned nano-CT ontology and a standardized metadata schema, with the aim of improving data discoverability, interoperability, and reuse.
📝 Abstract
In recent years, the importance of well-documented metadata has been increasingly discussed in many research fields. Making all metadata generated during scientific research available in a findable, accessible, interoperable, and reusable (FAIR) manner remains a significant challenge for researchers across fields. Scientific communities increasingly agree that this can be achieved by making all data available in a semantically annotated knowledge graph built with semantic web technologies. Most current approaches, however, do not gather metadata in a consistent, community-agreed, standardized way, and there are insufficient tools to support the process of turning the metadata into a knowledge graph. We present an example solution in which the creation of a schema and ontology is placed at the beginning of the scientific process. Using the electronic laboratory notebook framework Herbie, the schema and ontology are then turned into a bespoke data collection platform that facilitates validation and semantic annotation of the metadata immediately during an experiment. Using the example of synchrotron radiation-based nano computed tomography measurements, we present a holistic approach that can capture the complex metadata of such research instruments in a flexible and straightforward manner. Different instrument setups of the beamline can be accommodated while keeping the experience user-friendly. We show how Herbie turns all semantic documents into an accessible user interface in which all entered data automatically fulfills the FAIR requirements, and we present how data can be extracted directly via competency questions without requiring familiarity with the fine-grained structure of the knowledge graph.
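To make the schema-first workflow in the abstract more concrete, here is a minimal Python sketch of the general idea: a schema agreed on beforehand validates each metadata entry, the entry is stored as subject-predicate-object triples, and a competency question is wrapped in a function so the caller never touches the graph structure. It is not Herbie's API and does not use the paper's ontology. The namespace `EX`, the `SCHEMA` fields, the setup names, and the functions `annotate`, `demo_graph`, and `scans_with_setup` are all hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: every name below is an assumption, not taken
# from the paper's ontology or from Herbie.
EX = "https://example.org/nanoct#"  # hypothetical namespace

# Hypothetical schema fixed before the experiment:
# form field -> (predicate IRI, expected Python type)
SCHEMA = {
    "setup": (EX + "usesSetup", str),
    "photon_energy_kev": (EX + "photonEnergyKeV", float),
}


def annotate(scan_id, entry):
    """Validate one form entry against SCHEMA and return it as a set of triples."""
    missing = SCHEMA.keys() - entry.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    subject = EX + scan_id
    triples = {(subject, "rdf:type", EX + "NanoCTScan")}
    for field, (predicate, expected) in SCHEMA.items():
        value = entry[field]
        if not isinstance(value, expected):
            raise TypeError(f"{field} must be of type {expected.__name__}")
        triples.add((subject, predicate, value))
    return triples


def scans_with_setup(graph, setup):
    """Competency question: which scans were recorded with the given setup?"""
    predicate = EX + "usesSetup"
    return sorted(s for (s, p, o) in graph if p == predicate and o == setup)


def demo_graph():
    """Build a tiny example graph from two validated entries (made-up values)."""
    graph = set()
    graph |= annotate("scan1", {"setup": "setupA", "photon_energy_kev": 11.0})
    graph |= annotate("scan2", {"setup": "setupB", "photon_energy_kev": 17.4})
    return graph
```

Calling `scans_with_setup(demo_graph(), "setupA")` returns the IRI of the one matching scan. In a real deployment the triples would live in an RDF store and the competency question would typically be a stored SPARQL query. The plain set of tuples used here is a deliberate simplification to keep the sketch free of dependencies.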