Academic Journal

Benchmarking machine learning and parametric methods for genomic prediction of feed efficiency-related traits in Nellore cattle.

Bibliographic Details
Title: Benchmarking machine learning and parametric methods for genomic prediction of feed efficiency-related traits in Nellore cattle.
Authors: Mota LFM; School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil. flaviommota.zoo@gmail.com., Arikawa LM; School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil., Santos SWB; School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil., Fernandes Júnior GA; School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil., Alves AAC; School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil., Rosa GJM; Department of Animal and Dairy Sciences, University of Wisconsin, Madison, WI, 53706, USA., Mercadante MEZ; Institute of Animal Science, Beef Cattle Research Center, Sertãozinho, SP, 14174-000, Brazil.; National Council for Science and Technological Development, Brasilia, DF, 71605-001, Brazil., Cyrillo JNSG; Institute of Animal Science, Beef Cattle Research Center, Sertãozinho, SP, 14174-000, Brazil., Carvalheiro R; School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil.; National Council for Science and Technological Development, Brasilia, DF, 71605-001, Brazil., Albuquerque LG; School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil. galvao.albuquerque@unesp.br.; National Council for Science and Technological Development, Brasilia, DF, 71605-001, Brazil. galvao.albuquerque@unesp.br.
Source: Scientific reports [Sci Rep] 2024 Mar 17; Vol. 14 (1), pp. 6404. Date of Electronic Publication: 2024 Mar 17.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Nature Publishing Group Country of Publication: England NLM ID: 101563288 Publication Model: Electronic Cited Medium: Internet ISSN: 2045-2322 (Electronic) Linking ISSN: 2045-2322 NLM ISO Abbreviation: Sci Rep Subsets: MEDLINE
Imprint Name(s): Original Publication: London : Nature Publishing Group, copyright 2011-
MeSH Terms: Benchmarking*; Polymorphism, Single Nucleotide*; Cattle/genetics; Animals; Bayes Theorem; Models, Genetic; Phenotype; Genomics/methods; Genotype
Abstract: Genomic selection (GS) offers a promising opportunity to select animals that use consumed energy more efficiently for maintenance and growth, with benefits for profitability and environmental sustainability. Here, we compared the prediction accuracy of multi-layer neural network (MLNN) and support vector regression (SVR) against single-trait (STGBLUP) and multi-trait (MTGBLUP) genomic best linear unbiased prediction and Bayesian regression (BayesA, BayesB, BayesC, BRR, and BLasso) for feed efficiency (FE) traits. FE-related traits were measured in 1156 Nellore cattle from an experimental breeding program, genotyped for ~300K markers after quality control. Prediction accuracy (Acc) was evaluated using forward validation, splitting the dataset by birth year and using phenotypes adjusted for fixed effects and covariates as pseudo-phenotypes. The MLNN and SVR approaches were trained by randomly splitting the training population into five folds to select the best hyperparameters. The results show that the machine learning methods (MLNN and SVR) and MTGBLUP outperformed STGBLUP and the Bayesian regression approaches, increasing Acc by approximately 8.9%, 14.6%, and 13.7% using MLNN, SVR, and MTGBLUP, respectively. Acc for SVR and MTGBLUP differed only slightly, ranging from 0.62 to 0.69 and 0.62 to 0.68, respectively, and both models were empirically unbiased (0.97 and 1.09). Our results indicate that the SVR and MTGBLUP approaches were more accurate in predicting FE-related traits than Bayesian regression and STGBLUP and appear competitive for GS of complex phenotypes with various degrees of inheritance.
(© 2024. The Author(s).)
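Illustrative note: the forward-validation scheme described in the abstract can be sketched roughly as follows. This is a minimal sketch and not the authors' code. It assumes that Acc is the Pearson correlation between predicted values and pseudo-phenotypes in the validation animals, and that the reported values 0.97 and 1.09 are slopes from regressing pseudo-phenotypes on predictions; the abstract states neither definition explicitly. The field names (`birth_year`, `pseudo_pheno`, `gebv`), the cutoff year, and all numbers in the toy data are hypothetical.

```python
from statistics import mean

def forward_split(records, cutoff_year):
    """Older animals form the training set; animals born after the cutoff validate."""
    train = [r for r in records if r["birth_year"] <= cutoff_year]
    valid = [r for r in records if r["birth_year"] > cutoff_year]
    return train, valid

def pearson(x, y):
    """Pearson correlation, used here as an assumed definition of accuracy (Acc)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def regression_slope(observed, predicted):
    """Slope of observed on predicted; values near 1 suggest little inflation or deflation."""
    mp, mo = mean(predicted), mean(observed)
    sxy = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    sxx = sum((p - mp) ** 2 for p in predicted)
    return sxy / sxx

# Hypothetical toy records: adjusted phenotypes and predictions from some fitted model.
records = [
    {"birth_year": 2014, "pseudo_pheno": 0.8, "gebv": 0.5},
    {"birth_year": 2015, "pseudo_pheno": -0.3, "gebv": -0.1},
    {"birth_year": 2016, "pseudo_pheno": 0.1, "gebv": 0.2},
    {"birth_year": 2017, "pseudo_pheno": 0.6, "gebv": 0.4},
    {"birth_year": 2018, "pseudo_pheno": -0.5, "gebv": -0.2},
    {"birth_year": 2019, "pseudo_pheno": 0.2, "gebv": 0.3},
]

train, valid = forward_split(records, cutoff_year=2016)
observed = [r["pseudo_pheno"] for r in valid]
predicted = [r["gebv"] for r in valid]

acc = pearson(predicted, observed)
bias = regression_slope(observed, predicted)
print(f"validation animals: {len(valid)}, Acc: {acc:.2f}, slope: {bias:.2f}")
```

In a real analysis the predictions would come from a model fitted on the training animals only, with any hyperparameter search (for example the fivefold split mentioned in the abstract) confined to that training set so that the validation animals remain unseen.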
Entry Date(s): Date Created: 20240317 Date Completed: 20240318 Latest Revision: 20240321
Update Code: 20240321
PubMed Central ID: PMC10944497
DOI: 10.1038/s41598-024-57234-4
PMID: 38493207
Database: MEDLINE
Description
ISSN: 2045-2322
DOI: 10.1038/s41598-024-57234-4