ProAll-D: protein allergen detection using long short term memory - a deep learning approach

Bibliographic Details
Title: ProAll-D: protein allergen detection using long short term memory - a deep learning approach
Authors: Shanthappa, Pallavi M, Kumar, Rakshitha
Source: ADMET and DMPK
Volume 10
Issue 3
Publication Info: International Association of Physical Chemists, 2022.
Publication Year: 2022
Subject Terms: Allergen prediction, ACC transformation, LSTM model, Gaussian naive bayes, Classifier, Extra tree classifier, Bagging classifier, ADA boost, Linear discriminant analysis, Quadratic discriminant analysis
Description: Background: An allergic reaction is an overreaction of the immune system to a previously encountered, typically benign molecule, frequently a protein. Allergic reactions can result in rashes, itching, mucous membrane swelling, asthma, coughing, and other unusual symptoms. A wide range of principles and methods have been applied in bioinformatics to anticipate allergies. Sequence-similarity methods based on FAO/WHO criteria have a very low positive predictive value, which makes it difficult to predict possible allergens. Method: This work proposes a deep learning model, LSTM (Long Short-Term Memory), to overcome the limitations of traditional approaches and of lower-performing machine learning models in predicting the allergenicity of dietary proteins. A total of 2,427 allergens and 2,427 non-allergens were collected from a variety of sources, including the Central Science Laboratory and the NCBI. The data were split 80:20 for training and testing. All techniques were implemented in Python. Five E-descriptors were used to describe the protein sequences of allergens and non-allergens: E1 (hydrophilic character of peptides), E2 (length), E3 (propensity to form helices), E4 (abundance and dispersion), and E5 (propensity of beta strands). ACC transformation of these descriptors converts the variable-length protein sequences into uniform-length vectors. Eight machine learning techniques were compared. Results: Accuracies were 64.14% for Gaussian Naive Bayes, 49.2% for the Radius Neighbors Classifier, 85.8% for the Bagging Classifier, 76.9% for AdaBoost, 76.13% for Linear Discriminant Analysis, 84.2% for Quadratic Discriminant Analysis, 90% for the Extra Tree Classifier, and 91.5% for the LSTM. Conclusion: With an AUC value of 91.5%, the LSTM is regarded as the best model for predicting allergens. A web server called ProAll-D has been created that successfully identifies novel allergens using the LSTM approach.
Users can use the link https://doi.org/10.17632/tjmt97xpjf.1 to access the ProAll-D server and data.
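Illustrative note: the description says ACC transformation turns variable-length protein sequences into uniform-length vectors. The Python sketch below shows one common formulation of auto- and cross-covariance and is not taken from the ProAll-D code. The descriptor table `TOY_DESCRIPTORS` holds placeholder numbers rather than the published E-descriptor values, and the function name `acc_transform` and the default `max_lag=5` are assumptions made for illustration.

```python
from typing import Dict, List

# Placeholder table: these are NOT the published E-descriptor values.
# Each residue maps to five numbers standing in for E1..E5.
TOY_DESCRIPTORS: Dict[str, List[float]] = {
    "A": [0.1, -0.2, 0.3, 0.0, 0.5],
    "C": [-0.4, 0.2, 0.1, 0.3, -0.1],
    "D": [0.3, 0.1, -0.2, 0.4, 0.0],
    "E": [0.2, -0.3, 0.4, -0.1, 0.2],
    "G": [-0.1, 0.5, -0.3, 0.2, 0.1],
}


def acc_transform(
    seq: str, table: Dict[str, List[float]], max_lag: int = 5
) -> List[float]:
    """Return auto- and cross-covariance features for one sequence.

    For every lag in 1..max_lag and every descriptor pair (j, k), average
    the product of descriptor j at position i and descriptor k at
    position i + lag. Pairs with j == k are auto-covariance terms and
    pairs with j != k are cross-covariance terms. The output length is
    max_lag * d * d, where d is the number of descriptors, so it does
    not depend on the sequence length.
    """
    rows = [table[aa] for aa in seq]  # one descriptor vector per residue
    n = len(rows)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    d = len(rows[0])
    features: List[float] = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                total = sum(
                    rows[i][j] * rows[i + lag][k] for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features


if __name__ == "__main__":
    # Two sequences of different lengths yield vectors of equal length.
    short_vec = acc_transform("ACDEGACDEG", TOY_DESCRIPTORS)
    long_vec = acc_transform("GGAACCDDEEAG", TOY_DESCRIPTORS)
    print(len(short_vec), len(long_vec))
```

With five descriptors, each sequence maps to `max_lag * 25` values regardless of its length, which is what lets fixed-input classifiers operate on proteins of different sizes. The lag actually used by the authors is not given in this record.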
File Description: application/pdf
Language: English
ISSN: 1848-7718
Access URL: https://explore.openaire.eu/search/publication?articleId=od_______951::b29fbf093b8175ad889c17917229cb14
https://hrcak.srce.hr/file/408710
Rights: OPEN
Accession Number: edsair.od.......951..b29fbf093b8175ad889c17917229cb14
Database: OpenAIRE