Academic Journal

Text mining of CHO bioprocess bibliome: Topic modeling and document classification.

Bibliographic Details
Title: Text mining of CHO bioprocess bibliome: Topic modeling and document classification.
Authors: Qinghua Wang, Jonathan Olshin, K Vijay-Shanker, Cathy H Wu
Source: PLoS ONE, Vol 18, Iss 4, p e0274042 (2023)
Publication Information: Public Library of Science (PLoS), 2023.
Publication Year: 2023
Collection: LCC:Medicine
LCC:Science
Subject Terms: Medicine, Science
Description: Chinese hamster ovary (CHO) cells are widely used for mass production of therapeutic proteins in the pharmaceutical industry. With the growing need to optimize the performance of producer CHO cell lines, research on CHO cell line development and bioprocessing has continued to increase in recent decades. Bibliographic mapping and classification of relevant research studies will be essential for identifying research gaps and trends in the literature. To qualitatively and quantitatively understand the CHO literature, we have conducted topic modeling using a CHO bioprocess bibliome manually compiled in 2016, and compared the topics uncovered by the Latent Dirichlet Allocation (LDA) models with the human labels of the CHO bibliome. The results show a significant overlap between the manually selected categories and the computationally generated topics, and reveal the machine-generated topic-specific characteristics. To identify relevant CHO bioprocessing papers from new scientific literature, we have developed supervised models using Logistic Regression to identify specific article topics and evaluated the results using three CHO bibliome datasets: the Bioprocessing set, the Glycosylation set, and the Phenotype set. The use of top terms as features supports the explainability of document classification results and yields insights into new CHO bioprocessing papers.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1932-6203
Relation: https://doaj.org/toc/1932-6203
DOI: 10.1371/journal.pone.0274042
Access URL: https://doaj.org/article/170943bff7464f3d9ed014cb9ef61b47
Accession Number: edsdoj.170943bff7464f3d9ed014cb9ef61b47
Database: Directory of Open Access Journals
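
Illustrative note: the description above mentions supervised Logistic Regression models that use top terms as features to keep classification results explainable. The following is a minimal sketch of that general idea only, not the authors' implementation. The toy snippets, the stopword list, the helper names (`tokenize`, `top_terms`, `featurize`, `train`, `predict_proba`), the document-frequency ranking used to pick top terms, and all hyperparameters are assumptions made up for this example.

```python
import math
from collections import Counter

# Hypothetical stopword list and toy snippets (NOT taken from the CHO bibliome).
# Label 1 = glycosylation-related, label 0 = other bioprocessing.
STOPWORDS = {"a", "of", "in", "and", "for"}
DOCS = [
    ("glycosylation of antibody in cho cells", 1),
    ("sialylation and glycan profile of recombinant protein", 1),
    ("glycan analysis of cho antibody glycosylation", 1),
    ("fed batch culture improves cell growth", 0),
    ("bioreactor feed strategy and cell viability", 0),
    ("culture medium optimization for cell growth", 0),
]

def tokenize(text):
    """Lowercase, split on whitespace, drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]

def top_terms(docs, k):
    """Pick the k terms with the highest document frequency (assumed ranking)."""
    df = Counter()
    for text, _ in docs:
        df.update(set(tokenize(text)))
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    return sorted(df, key=lambda t: (-df[t], t))[:k]

def featurize(text, vocab):
    """Binary presence/absence vector over the chosen top terms."""
    words = set(tokenize(text))
    return [1.0 if term in words else 0.0 for term in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.5, epochs=200):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(text, vocab, w, b):
    x = featurize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

vocab = top_terms(DOCS, k=7)
X = [featurize(text, vocab) for text, _ in DOCS]
y = [label for _, label in DOCS]
w, b = train(X, y)

# One weight per top term: the sign and size show which terms drive a prediction.
for term, weight in sorted(zip(vocab, w), key=lambda tw: tw[1], reverse=True):
    print(f"{term:15s} {weight:+.2f}")
```

Because each feature is a single term, the learned weights can be read directly: positive weights pull a document toward the target topic and negative weights push it away. Calling `predict_proba("glycosylation of a cho antibody", vocab, w, b)` scores a new snippet against the same term list.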