Academic Journal

A reproducible ensemble machine learning approach to forecast dengue outbreaks.

Bibliographic Details
Title: A reproducible ensemble machine learning approach to forecast dengue outbreaks.
Authors: Sebastianelli A; Engineering Department, University of Sannio, Benevento, Italy. alessandro.sebastianelli@esa.int.; European Space Agency, Φ-lab, Frascati, Italy. alessandro.sebastianelli@esa.int., Spiller D; School of Aerospace Engineering, Sapienza University of Rome, Rome, Italy., Carmo R; European Space Agency, Φ-lab, Frascati, Italy., Wheeler J; European Space Agency, Φ-lab, Frascati, Italy., Nowakowski A; Faculty of Geodesy and Cartography, Warsaw University of Technology, Warsaw, Poland., Jacobson LV; Statistics Department, Fluminense Federal University, Niterói, Brazil., Kim D; UNICEF, New York, NY, USA., Barlevi H; UNICEF, New York, NY, USA., Cordero ZER; UNICEF, New York, NY, USA., Colón-González FJ; Wellcome Trust, Data for Science and Health, London, UK.; Centre on Climate Change and Planetary Health and Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, UK.; Tyndall Centre for Climate Change Research, School of Environmental Sciences, University of East Anglia, Norwich, UK., Lowe R; Centre on Climate Change and Planetary Health and Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, UK.; Barcelona Supercomputing Center (BSC), Barcelona, Spain.; Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain., Ullo SL; Engineering Department, University of Sannio, Benevento, Italy., Schneider R; European Space Agency, Φ-lab, Frascati, Italy. rochelle.schneider@esa.int.
Source: Scientific Reports [Sci Rep] 2024 Feb 15; Vol. 14 (1), pp. 3807. Date of Electronic Publication: 2024 Feb 15.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Nature Publishing Group Country of Publication: England NLM ID: 101563288 Publication Model: Electronic Cited Medium: Internet ISSN: 2045-2322 (Electronic) Linking ISSN: 20452322 NLM ISO Abbreviation: Sci Rep Subsets: MEDLINE
Imprint Name(s): Original Publication: London : Nature Publishing Group, copyright 2011-
MeSH Terms: Dengue*/epidemiology ; Humans ; Young Adult ; Adult ; Disease Outbreaks/prevention & control ; Public Health/methods ; Climate ; Machine Learning
Abstract: Dengue fever, a prevalent and rapidly spreading arboviral disease, poses substantial public health and economic challenges in tropical and sub-tropical regions worldwide. Predicting infectious disease outbreaks on a countrywide scale is complex due to spatiotemporal variations in dengue incidence across administrative areas. To address this, we propose a machine learning ensemble model for forecasting the dengue incidence rate (DIR) in Brazil, with a focus on the population under 19 years old. The model integrates spatial and temporal information, providing one-month-ahead DIR estimates at the state level. Comparative analyses with a dummy model and ablation studies demonstrate the ensemble model's qualitative and quantitative efficacy across the 27 Brazilian Federal Units. Furthermore, we showcase the transferability of this approach to Peru, another Latin American country with differing epidemiological characteristics. This timely forecast system can aid local governments in implementing targeted control measures. The study advances climate services for health by identifying factors triggering dengue outbreaks in Brazil and Peru, emphasizing collaborative efforts with intergovernmental organizations and public health institutions. The innovation lies not only in the algorithms themselves but in their application to a domain marked by data scarcity and operational scalability challenges. We bridge the gap by integrating well-curated ground data with advanced analytical methods, addressing a significant deficiency in current practices. The successful transfer of the model to Peru and its consistent performance during the 2019 outbreak in Brazil showcase its scalability and practical application. While acknowledging limitations in handling extreme values, especially in regions with low DIR, our approach excels where accurate predictions are critical.
The study not only contributes to advancing DIR forecasting but also represents a paradigm shift in integrating advanced analytics into public health operational frameworks. This work, driven by a collaborative spirit involving intergovernmental organizations and public health institutions, sets a precedent for interdisciplinary collaboration in addressing global health challenges. It not only enhances our understanding of factors triggering dengue outbreaks but also serves as a template for the effective implementation of advanced analytical methods in public health.
(© 2024. The Author(s).)
GitHub repository for “A reproducible ensemble machine learning approach to forecast dengue outbreaks”. https://github.com/ESA-PhiLab/ESA-UNICEF_DengueForecastProject. Accessed on 9 June 2022.
Grant Information: United Kingdom WT_ Wellcome Trust
Entry Date(s): Date Created: 20240215 Date Completed: 20240219 Latest Revision: 20240220
Update Code: 20240220
PubMed Central ID: PMC10869339
DOI: 10.1038/s41598-024-52796-9
PMID: 38360915
Database: MEDLINE