Academic Journal

Predicting the pathogenicity of bacterial genomes using widely spread protein families

Bibliographic Details
Title: Predicting the pathogenicity of bacterial genomes using widely spread protein families
Authors: Shaked Naor-Hoffmann, Dina Svetlitsky, Neta Sal-Man, Yaron Orenstein, Michal Ziv-Ukelson
Source: BMC Bioinformatics, Vol 23, Iss 1, Pp 1-18 (2022)
Publisher Information: BMC, 2022.
Publication Year: 2022
Collection: LCC:Computer applications to medicine. Medical informatics
LCC:Biology (General)
Subject Terms: Comparative genomics, Pathogenic bacteria, Commensal bacteria, Opportunistic bacteria, Random forest, Protein families, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Description: Abstract. Background: The human body is inhabited by a diverse community of commensal non-pathogenic bacteria, many of which are essential for our health. By contrast, pathogenic bacteria are able to invade their hosts and cause disease. Characterizing the differences between pathogenic and commensal non-pathogenic bacteria is important for the detection of emerging pathogens and for the development of new treatments. Previous methods for classifying bacteria as pathogenic or non-pathogenic used either raw genomic reads or protein families as features. Using protein families instead of reads made the resulting model more interpretable. However, the accuracy of protein-family-based classifiers can still be improved. Results: We developed a wide scope pathogenicity classifier (WSPC), a new protein-content-based machine-learning classification model. We trained WSPC on a newly curated dataset of 641 bacterial genomes, where each genome belongs to a different species. A comparative analysis we conducted shows that WSPC outperforms existing models on two benchmark test sets. We observed that the most discriminative protein-family features in WSPC are widely spread among bacterial species. These features correspond to proteins that are involved in the ability of bacteria to survive and replicate during an infection, rather than proteins that are directly involved in damaging or invading the host.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1471-2105
Relation: https://doaj.org/toc/1471-2105
DOI: 10.1186/s12859-022-04777-w
Access URL: https://doaj.org/article/0cee292138a84f53abcb7b73616a3c85
Accession Number: edsdoj.0cee292138a84f53abcb7b73616a3c85
Database: Directory of Open Access Journals
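
The abstract in this record describes a classifier whose features are protein families, and notes that the most discriminative families are widely spread across species. As a rough illustration only, the sketch below shows one way such features could be represented: each genome as a presence/absence vector over protein families, with a per-family prevalence (the fraction of genomes that contain the family). The presence/absence encoding, the function names, and the toy genome and family identifiers are all assumptions made for this example; the record does not state how WSPC encodes its features, and no classifier is trained here.

```python
from typing import Dict, List, Set, Tuple


def encode_presence_absence(
    genomes: Dict[str, Set[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Turn {genome: set of protein families} into a 0/1 matrix.

    Rows follow the sorted genome names, columns the sorted family names.
    (Hypothetical helper; the encoding is an assumption, not WSPC's.)
    """
    names = sorted(genomes)
    families = sorted(set().union(*genomes.values()))
    matrix = [
        [1 if family in genomes[name] else 0 for family in families]
        for name in names
    ]
    return names, families, matrix


def family_prevalence(
    families: List[str], matrix: List[List[int]]
) -> Dict[str, float]:
    """Fraction of genomes in which each protein family is present."""
    n_genomes = len(matrix)
    if n_genomes == 0:
        return {family: 0.0 for family in families}
    return {
        family: sum(row[col] for row in matrix) / n_genomes
        for col, family in enumerate(families)
    }


# Toy input: invented genome and protein-family identifiers.
toy_genomes = {
    "genome_A": {"PF_1", "PF_2"},
    "genome_B": {"PF_1", "PF_3"},
    "genome_C": {"PF_1"},
    "genome_D": {"PF_1", "PF_2"},
}

names, families, matrix = encode_presence_absence(toy_genomes)
prevalence = family_prevalence(families, matrix)

for family in families:
    print(family, prevalence[family])
```

In this toy data, a family present in every genome (prevalence 1.0) would count as widely spread. A matrix of this shape, paired with pathogenic/non-pathogenic labels, is the kind of input a random forest (listed among the subject terms) could be trained on.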