Complementary and Integrative Health Lexicon (CIHLex) and Entity Recognition in the Literature

التفاصيل البيبلوغرافية
العنوان: Complementary and Integrative Health Lexicon (CIHLex) and Entity Recognition in the Literature
المؤلفون: Zhou, Huixue, Austin, Robin, Lu, Sheng-Chieh, Silverman, Greg, Zhou, Yuqi, Kilicoglu, Halil, Xu, Hua, Zhang, Rui
سنة النشر: 2023
المجموعة: Computer Science
مصطلحات موضوعية: Computer Science - Computation and Language
الوصف: Objective: Our study aimed to construct an exhaustive Complementary and Integrative Health (CIH) Lexicon (CIHLex) to better represent the often underrepresented physical and psychological CIH approaches in standard terminologies. We also intended to apply advanced Natural Language Processing (NLP) models such as Bidirectional Encoder Representations from Transformers (BERT) and GPT-3.5 Turbo for CIH named entity recognition, evaluating their performance against established models like MetaMap and CLAMP. Materials and Methods: We constructed the CIHLex by integrating various resources, compiling and integrating data from biomedical literature and relevant knowledge bases. The Lexicon encompasses 198 unique concepts with 1090 corresponding unique terms. We matched these concepts to the Unified Medical Language System (UMLS). Additionally, we developed and utilized BERT models and compared their efficiency in CIH named entity recognition to that of other models such as MetaMap, CLAMP, and GPT3.5-turbo. Results: From the 198 unique concepts in CIHLex, 62.1% could be matched to at least one term in the UMLS. Moreover, 75.7% of the mapped UMLS Concept Unique Identifiers (CUIs) were categorized as "Therapeutic or Preventive Procedure." Among the models applied to CIH named entity recognition, BLUEBERT delivered the highest macro average F1-score of 0.90, surpassing other models. Conclusion: Our CIHLex significantly augments representation of CIH approaches in biomedical literature. Demonstrating the utility of advanced NLP models, BERT notably excelled in CIH entity recognition. These results highlight promising strategies for enhancing standardization and recognition of CIH terminology in biomedical contexts.
Comment: need to update the data
نوع الوثيقة: Working Paper
URL الوصول: http://arxiv.org/abs/2305.17353
رقم الأكسشن: edsarx.2305.17353
قاعدة البيانات: arXiv