Deep Learning and Natural Language Processing in the Field of Construction

📅 2025-01-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the challenge of automatically identifying domain-specific technical terms and their hypernyms in construction technical specifications, this paper proposes an end-to-end hypernym relation extraction framework. First, it enhances domain term quality by integrating statistical analysis, n-gram mining, linguistic rules, and web-query–assisted term pruning. Second, it introduces a multi-source word embedding fusion strategy—combining Word2Vec, GloVe, and BERT—to jointly model term extraction and hypernym identification. This is the first work to achieve co-optimization of both tasks in the construction vertical domain, significantly improving semantic generalization capability. Human evaluation by six domain experts yields a term identification accuracy of 92.3% and a hypernym recognition F1-score of 86.7%, outperforming state-of-the-art baseline methods.
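As a rough illustration of what a multi-source embedding fusion could look like, the sketch below concatenates a term's vectors from several embedding tables and builds features for a candidate (hyponym, hypernym) pair. Everything here is an assumption made for illustration: the toy vectors stand in for real Word2Vec, GloVe, or BERT embeddings, the names `fuse` and `pair_features` are hypothetical, and concatenation plus a difference vector is only one common way to combine embeddings, not necessarily the one used in the paper.

```python
def fuse(term, embedding_tables):
    """Concatenate a term's vectors from several embedding tables."""
    fused = []
    for table in embedding_tables:
        fused.extend(table[term])
    return fused


def pair_features(hyponym, hypernym, embedding_tables):
    """Build an asymmetric feature vector for a candidate pair.

    The result is [hyponym vector, hypernym vector, hypernym - hyponym],
    so swapping the two terms yields different features.
    """
    x = fuse(hyponym, embedding_tables)
    y = fuse(hypernym, embedding_tables)
    diff = [b - a for a, b in zip(x, y)]
    return x + y + diff


# Toy, made-up vectors standing in for two pretrained embedding models.
table_a = {"beam": [0.9, 0.1], "structural element": [0.7, 0.3]}
table_b = {"beam": [0.2, 0.8, 0.5], "structural element": [0.1, 0.6, 0.4]}

features = pair_features("beam", "structural element", [table_a, table_b])
print(len(features))
```

Feature vectors of this kind would typically be passed to a supervised classifier trained on labeled hypernym and non-hypernym pairs.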

📝 Abstract

This article presents a complete process for extracting hypernym relationships in the field of construction in two main steps: terminology extraction and detection of hypernyms among the extracted terms. We first describe the corpus analysis method used to extract terminology from a collection of technical specifications in the field of construction. Using statistics and word n-gram analysis, we extract the domain's terminology and then prune it with linguistic patterns and internet queries to improve the quality of the final terminology. Second, we present a machine-learning approach based on various word embedding models and their combinations to detect hypernyms within the extracted terminology. The extracted terminology is evaluated manually by six domain experts, and the hypernym identification method is evaluated on different datasets. The overall approach provides relevant and promising results.
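The statistical n-gram step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the stop-word list, the frequency threshold `min_count`, and the function names are all hypothetical, and the rule "a candidate may not start or end with a stop word" is a crude stand-in for the linguistic patterns the abstract mentions. The internet-query pruning step is not shown.

```python
from collections import Counter
import re

# Hypothetical stop-word list, used here as a simple boundary filter.
STOP_WORDS = {
    "the", "a", "an", "of", "in", "on", "and", "or", "to",
    "with", "is", "are", "be", "shall", "for",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (hyphens allowed)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def candidate_terms(documents, max_n=3, min_count=2):
    """Count word n-grams and keep the frequent ones as term candidates."""
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Pruning: discard n-grams that start or end with a stop word.
                if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                    continue
                counts[" ".join(gram)] += 1
    return {term: c for term, c in counts.items() if c >= min_count}


docs = [
    "The concrete slab shall be reinforced.",
    "Pour the concrete slab on the gravel base.",
    "The gravel base is compacted.",
]
terms = candidate_terms(docs)
print(sorted(terms))
```

On this toy corpus, multi-word candidates such as "concrete slab" and "gravel base" survive because they recur across documents, while one-off sequences are dropped by the frequency threshold.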

Problem

Research questions and friction points this paper is trying to address.

Computer Technology

Natural Language Processing

Architectural Vocabulary Recognition

Innovation

Methods, ideas, or system contributions that make the work stand out.

Natural Language Processing

Machine Learning

Vocabulary Classification

🔎 Similar Papers

Utilizing Large Language Models for Information Extraction from Real Estate Transactions