Cosmos 1.0: a multidimensional map of the emerging technology frontier

📅 2025-05-15
🤖 AI Summary
Problem: Emerging technology domains lack systematic mapping and classification, which hinders comprehensive technology foresight and policy formulation. Method: This paper introduces Cosmos 1.0, the first multidimensional knowledge graph covering 23,544 technologies, built upon a modeling framework that integrates *technology meta-classification*, *semantic embedding*, and *multidimensional assessment*. It proposes a hierarchical taxonomy (ET3/ET7), generates 100-dimensional semantic embeddings using a BERT variant, fuses heterogeneous data from Wikipedia, OpenAlex, and Google Scholar for cross-platform knowledge linking, and trains a supervised term classifier achieving 96.2% accuracy on the manually curated ET100 benchmark. Contribution/Results: The paper releases the open-source ET23k dataset and four interpretable evaluation indices (Awareness, Generality, Deeptech, and Age) with a 0.89 correlation to expert assessments, enabling quantitative support for technological situational awareness, evolutionary analysis, and evidence-based policymaking.

📝 Abstract
This paper describes a novel methodology to map the universe of emerging technologies, utilising various source data that contain a rich diversity and breadth of contemporary knowledge to create a new dataset and multiple indices that provide new insights into these technologies. The Cosmos 1.0 dataset is a comprehensive collection of 23,544 technologies (ET23k) structured into a hierarchical model. Technologies are categorised into three meta clusters (ET3) and seven theme clusters (ET7) and enhanced by 100-dimensional embedding vectors. Within the cosmos, we manually verify 100 emerging technologies, called the ET100. This dataset is enriched with additional indices specifically developed to assess the landscape of emerging technologies, including the Technology Awareness Index, Generality Index, Deeptech, and Age of Tech Index. The dataset incorporates extensive metadata sourced from Wikipedia and linked data from third-party sources such as Crunchbase, Google Books, OpenAlex and Google Scholar, which are used to validate the relevance and accuracy of the constructed indices. Moreover, we trained a classifier to identify whether entries are developed "technology" or technology-related "terms".
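
To make the structure described in the abstract concrete, here is a minimal sketch of how one entry might be represented and how the embedding vectors could be compared. It assumes each technology carries one ET3 label, one ET7 label, and one 100-dimensional vector. The class `TechnologyRecord`, its field names, the integer encoding of the clusters, and the helpers `toy_embedding`, `cosine_similarity`, and `most_similar` are all hypothetical and are not taken from the released dataset.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List

EMBEDDING_DIM = 100  # embedding size stated in the abstract


@dataclass
class TechnologyRecord:
    """One hypothetical dataset entry; all field names are assumptions."""
    name: str
    et3_cluster: int        # meta cluster, encoded here as 0..2
    et7_cluster: int        # theme cluster, encoded here as 0..6
    embedding: List[float]  # 100-dimensional vector
    in_et100: bool = False  # True if part of the manually verified ET100

    def __post_init__(self) -> None:
        if len(self.embedding) != EMBEDDING_DIM:
            raise ValueError("embedding must have 100 dimensions")
        if not 0 <= self.et3_cluster < 3:
            raise ValueError("et3_cluster must be in 0..2")
        if not 0 <= self.et7_cluster < 7:
            raise ValueError("et7_cluster must be in 0..6")


def toy_embedding(*hot: int) -> List[float]:
    """Build a toy vector with 1.0 at the given positions (illustration only)."""
    vec = [0.0] * EMBEDDING_DIM
    for i in hot:
        vec[i] = 1.0
    return vec


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: TechnologyRecord,
                 records: List[TechnologyRecord],
                 k: int = 3) -> List[TechnologyRecord]:
    """Return the k records closest to `query` by cosine similarity."""
    others = [r for r in records if r is not query]
    others.sort(key=lambda r: cosine_similarity(query.embedding, r.embedding),
                reverse=True)
    return others[:k]
```

Cosine similarity is used here only because it is a common choice for comparing embeddings; the source does not state which similarity measure the authors use.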
Problem

Research questions and friction points this paper is trying to address.

Mapping emerging technologies using diverse data sources
Creating hierarchical indices to assess technology landscapes
Validating technology relevance with third-party metadata
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical model with 23,544 technologies
100-dimensional embedding vectors for categorization
Classifier for technology vs. term identification
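
The classifier idea in the last item can be illustrated with a deliberately simple stand-in. The sketch below is a nearest-centroid classifier over embedding vectors; it is not the authors' model, whose architecture is not described here. The labels "technology" and "term" follow the abstract, while the function names and the toy 3-dimensional vectors are assumptions chosen to keep the example short.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centroids(examples: List[Tuple[Vector, str]]) -> Dict[str, List[float]]:
    """Group labelled vectors by label and compute one centroid per label."""
    by_label: Dict[str, List[Vector]] = {}
    for vec, label in examples:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vs) for label, vs in by_label.items()}


def predict(centroids: Dict[str, List[float]], vec: Vector) -> str:
    """Return the label whose centroid is closest to `vec`."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], vec))


# Toy training data: 3-dimensional vectors stand in for real embeddings.
train = [
    ([1.0, 0.0, 0.0], "technology"),
    ([0.9, 0.1, 0.0], "technology"),
    ([0.0, 1.0, 0.0], "term"),
    ([0.1, 0.9, 0.0], "term"),
]
model = fit_centroids(train)
```

With the toy data above, `predict(model, [0.8, 0.2, 0.0])` falls on the "technology" side. A real pipeline would train on the full embedding vectors and evaluate on held-out labelled entries.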
Authors

Xian Gong
University of Technology Sydney, Faculty of Engineering and Information Technology, Sydney, 2007, Australia
Paul X. McCarthy
University of Technology Sydney, Faculty of Engineering and Information Technology, Sydney, 2007, Australia
University of New South Wales, Sydney, 2052, Australia
Colin Griffith
Data61
Innovation, digital services, open data, smart cities, broadband applications
Claire McFarland
University of Technology Sydney, Faculty of Engineering and Information Technology, Sydney, 2007, Australia
Marian-Andrei Rizoiu
University of Technology Sydney, Faculty of Engineering and Information Technology, Sydney, 2007, Australia