🤖 AI Summary
Problem: Fragmented research data, inconsistent standards, and poor interoperability severely hinder AI’s deep integration across multidisciplinary scientific domains.
Method: This project establishes the world’s first AI- and community-driven, multidisciplinary research data digitization platform. It introduces a customizable, discipline-aware standardized data model enabling cross-domain semantic interoperability; integrates large language models (LLMs) and NLP techniques to develop an AI-powered research assistant supporting intelligent question-answering, automated data entry, and analysis; and adopts a cloud-native architecture augmented by domain-expert collaboration mechanisms.
Contribution/Results: The platform has been deployed in laboratories across all four schools of Westlake University, improving data management efficiency and cross-disciplinary collaboration. It provides a framework and an empirically validated implementation for AI-native research infrastructure, bridging disciplinary silos while preserving domain specificity and advancing FAIR (Findable, Accessible, Interoperable, Reusable) data principles.
📝 Abstract
Research data are the foundation of Artificial Intelligence (AI)-driven science, yet current AI applications remain limited to a few fields with readily available, well-structured, digitized datasets. Achieving comprehensive AI empowerment across multiple disciplines is still out of reach. Present-day research data collection is often fragmented, without unified standards, inefficiently managed, and difficult to share. Creating a single platform for standardized data digitization requires balancing universality (supporting the diverse, ever-evolving needs of various disciplines) against standardization (enforcing consistent formats to fully enable AI). No existing platform accommodates both facets. Building a truly multidisciplinary platform requires integrating scientific domain knowledge with sophisticated computing skills. Researchers often lack the computational expertise to design customized and standardized data recording methods, whereas platform developers rarely grasp the intricate needs of multiple scientific domains. These gaps impede research data standardization and hamper AI-driven progress. In this study, we address these challenges by developing Airalogy (https://airalogy.com), the world's first AI- and community-driven platform that balances universality and standardization for digitizing research data across multiple disciplines. Airalogy represents entire research workflows using customizable, standardized data records and offers an advanced AI research copilot for intelligent Q&A, automated data entry, analysis, and research automation. Already deployed in laboratories across all four schools of Westlake University, Airalogy has the potential to accelerate and automate scientific innovation in universities, industry, and the global research community, ultimately benefiting humanity as a whole.
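
The abstract does not show what a "customizable, standardized data record" looks like in practice. The Python sketch below is one way to picture the balance between universality and standardization. It is an illustration only and does not represent Airalogy's actual data model or API; every name in it (`FieldSpec`, `Protocol`, `make_record`, the `pcr-v1` protocol and its fields) is hypothetical. The idea is that each discipline declares its own fields (universality), while every record is validated against that declaration and wrapped in the same envelope (standardization).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FieldSpec:
    """One discipline-specific field declared by a protocol author (hypothetical)."""
    name: str
    type: type
    unit: Optional[str] = None
    required: bool = True


@dataclass(frozen=True)
class Protocol:
    """A customizable schema: each discipline defines its own fields."""
    protocol_id: str
    discipline: str
    fields: Tuple[FieldSpec, ...]

    def validate(self, values: Dict[str, Any]) -> List[str]:
        """Return a list of human-readable problems; an empty list means valid."""
        errors: List[str] = []
        for spec in self.fields:
            if spec.name not in values:
                if spec.required:
                    errors.append(f"missing required field: {spec.name}")
                continue
            value = values[spec.name]
            if not isinstance(value, spec.type):
                errors.append(
                    f"{spec.name}: expected {spec.type.__name__}, "
                    f"got {type(value).__name__}"
                )
        known = {spec.name for spec in self.fields}
        for name in sorted(set(values) - known):
            errors.append(f"unknown field: {name}")
        return errors


def make_record(protocol: Protocol, values: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap validated values in an envelope that is identical for all disciplines."""
    errors = protocol.validate(values)
    if errors:
        raise ValueError("; ".join(errors))
    return {
        "protocol_id": protocol.protocol_id,
        "discipline": protocol.discipline,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "values": dict(values),
    }


# Hypothetical protocol for a molecular biology experiment.
pcr = Protocol(
    protocol_id="pcr-v1",
    discipline="molecular-biology",
    fields=(
        FieldSpec("annealing_temp", float, unit="degC"),
        FieldSpec("cycles", int),
        FieldSpec("notes", str, required=False),
    ),
)

record = make_record(pcr, {"annealing_temp": 58.5, "cycles": 30})
```

Because the envelope keys are the same for every protocol, downstream tools (including an AI assistant) could read records from any discipline without knowing its schema in advance, while the per-protocol `fields` keep domain-specific detail intact.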