Bridge inspection named entity recognition via BERT and lexicon augmented machine reading comprehension neural model

Bibliographic Details
Title: Bridge inspection named entity recognition via BERT and lexicon augmented machine reading comprehension neural model
Authors: Di Wang, Shixin Jiang, Ren Li, Jianxi Yang, Tianjin Mo, Dong Li
Source: Advanced Engineering Informatics. 50:101416
Publisher Information: Elsevier BV, 2021.
Publication Year: 2021
Subject Terms: Artificial neural network, business.industry, Computer science, Bigram, Context (language use), Building and Construction, computer.software_genre, Bridge (interpersonal), Information extraction, Named-entity recognition, Artificial Intelligence, Feature (machine learning), Question answering, Artificial intelligence, business, computer, Natural language processing, Information Systems
Description: As an important data source in the field of bridge management, bridge inspection reports contain large-scale fine-grained data, including information on bridge members and structural defects. However, due to insufficient research on automatic information extraction in this field, valuable bridge inspection information has not been fully utilized. Particularly, for Chinese bridge inspection entities, which involve domain-specific vocabularies and have obvious nesting characteristics, most of the existing named entity recognition (NER) solutions are not suitable. To address this problem, this paper proposes a novel lexicon augmented machine reading comprehension-based NER neural model for identifying flat and nested entities from Chinese bridge inspection text. The proposed model uses the bridge inspection text and predefined question queries as input to enhance the ability of contextual feature representation and to integrate prior knowledge. Based on the character-level features encoded by the pre-trained BERT model, bigram embeddings and weighted lexicon features are further combined into a context representation. Then, the bidirectional long short-term memory neural network is used to extract sequence features before predicting the spans of named entities. The proposed model is verified by the Chinese bridge inspection named entity corpus. The experimental results show that the proposed model outperforms other mainstream NER models on the bridge inspection corpus. The proposed model not only provides a basis for automatic bridge inspection information extraction but also supports the downstream tasks such as knowledge graph construction and question answering systems.
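Illustrative sketch (not part of the catalog record): the description frames NER as machine reading comprehension, where a predefined query per entity type is paired with the inspection text and entity spans are then predicted. The Python below is a minimal sketch of that framing only, under stated assumptions. The query wording, the entity type names, the toy English sentence, the hand-written start/end flags, and the nearest-end pairing heuristic are all hypothetical and are not taken from the paper; the BERT encoder, bigram and lexicon features, and the BiLSTM are omitted entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical queries, one per entity type (wording is an assumption).
QUERIES: Dict[str, str] = {
    "bridge_member": "Find the bridge members mentioned in the text.",
    "defect": "Find the structural defects mentioned in the text.",
}


def build_mrc_input(query: str, text: str) -> Tuple[List[str], int]:
    """Concatenate query and text in BERT sentence-pair style, character by character.

    Returns the token list and the index at which the text begins.
    """
    tokens = ["[CLS]"] + list(query) + ["[SEP]"] + list(text) + ["[SEP]"]
    text_offset = len(query) + 2  # skip [CLS], the query, and the first [SEP]
    return tokens, text_offset


def decode_spans(start_flags: List[int], end_flags: List[int]) -> List[Tuple[int, int]]:
    """Pair each predicted start with the nearest predicted end at or after it.

    This pairing rule is a simple assumed heuristic, not the paper's decoder.
    """
    spans = []
    for i, is_start in enumerate(start_flags):
        if not is_start:
            continue
        for j in range(i, len(end_flags)):
            if end_flags[j]:
                spans.append((i, j))
                break
    return spans


# Toy example: flags are written by hand to stand in for model predictions.
text = "main girder crack"
tokens, text_offset = build_mrc_input(QUERIES["defect"], text)

member_start = [1] + [0] * 16            # start at index 0
member_end = [0] * 10 + [1] + [0] * 6    # end at index 10
defect_start = [1] + [0] * 16            # start at index 0
defect_end = [0] * 16 + [1]              # end at index 16

member_spans = decode_spans(member_start, member_end)
defect_spans = decode_spans(defect_start, defect_end)
member_texts = [text[s:e + 1] for s, e in member_spans]
defect_texts = [text[s:e + 1] for s, e in defect_spans]
```

Because each entity type is queried separately, the two extracted spans may overlap, which is one way an MRC formulation can accommodate nested entities such as those the description mentions.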
ISSN: 1474-0346
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::1f0ae12637a896f65bffeeb7ed728371
https://doi.org/10.1016/j.aei.2021.101416
Rights: CLOSED
Accession Number: edsair.doi...........1f0ae12637a896f65bffeeb7ed728371
Database: OpenAIRE