Editorial & Opinion

Perspectives on tracking data reuse across biodata resources.

Bibliographic Details
Title: Perspectives on tracking data reuse across biodata resources.
Authors: Ross KE; Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007, United States., Bastian FB; Evolutionary Bioinformatics Group, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.; Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland., Buys M; DataCite, 30167 Hannover, Germany., Cook CE; Global Biodata Coalition, Strasbourg 67080, France., D'Eustachio P; Department of Biochemistry & Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10012, United States., Harrison M; Literature Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom., Hermjakob H; Molecular Systems, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom., Li D; Chan Zuckerberg Initiative, Redwood City, CA 94063, United States., Lord P; School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, United Kingdom., Natale DA; Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007, United States., Peters B; Center for Vaccine Innovation, La Jolla Institute of Immunology, La Jolla, CA 92037, United States., Sternberg PW; Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States., Su AI; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States., Thakur M; Data Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom., Thomas PD; Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90089, United States., Bateman A; MSCB, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom.
Corporate Authors: and the UniProt Consortium
Source: Bioinformatics advances [Bioinform Adv] 2024 Apr 25; Vol. 4 (1), pp. vbae057. Date of Electronic Publication: 2024 Apr 25 (Print Publication: 2024).
Publication Type: Editorial
Language: English
Journal Info: Publisher: Oxford University Press; Country of Publication: England; NLM ID: 9918282081306676; Publication Model: eCollection; Cited Medium: Internet; ISSN: 2635-0041 (Electronic); Linking ISSN: 26350041; NLM ISO Abbreviation: Bioinform Adv; Subsets: PubMed not MEDLINE
Imprint Name(s): Original Publication: [Oxford] : Oxford University Press : International Society for Computational Biology, [2021]-
Abstract: Motivation: Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge.
Results: The article reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources.
Availability and Implementation: Summaries of survey results are available at: https://docs.google.com/forms/d/1j-VU2ifEKb9C-sW6l3ATB79dgHdRk5v_lESv2hawnso/viewanalytics (survey of data providers) and https://docs.google.com/forms/d/18WbJFutUd7qiZoEzbOytFYXSfWFT61hVce0vjvIwIjk/viewanalytics (survey of users).
Competing Interests: A.B. is Editor-in-Chief of Bioinformatics Advances, but was not involved in the editorial process of this manuscript.
(© The Author(s) 2024. Published by Oxford University Press.)
References: Sci Data. 2018 Nov 20;5:180259. (PMID: 30457573)
Nucleic Acids Res. 2022 Jan 7;50(D1):D1515-D1521. (PMID: 34986598)
Nucleic Acids Res. 2023 Jan 6;51(D1):D523-D531. (PMID: 36408920)
Nucleic Acids Res. 2016 Jan 4;44(D1):D20-6. (PMID: 26673705)
PLoS One. 2023 Nov 28;18(11):e0294812. (PMID: 38015968)
Sci Data. 2016 Mar 15;3:160018. (PMID: 26978244)
Database (Oxford). 2022 May 25;2022:. (PMID: 35616100)
Nat Methods. 2016 Aug 30;13(9):705-6. (PMID: 27575621)
Bioinformatics. 2017 Sep 01;33(17):2731-2736. (PMID: 28525546)
Nucleic Acids Res. 2024 Jan 5;52(D1):D672-D678. (PMID: 37941124)
Nucleic Acids Res. 2023 Jan 6;51(D1):D1-D8. (PMID: 36624667)
Nucleic Acids Res. 2017 Jan 4;45(D1):D339-D346. (PMID: 27899649)
Nucleic Acids Res. 2020 Jan 8;48(D1):D17-D23. (PMID: 31701143)
Database (Oxford). 2015 May 09;2015:bav043. (PMID: 25957950)
Nucleic Acids Res. 2019 Jan 8;47(D1):D330-D338. (PMID: 30395331)
PLoS One. 2016 Apr 29;11(4):e0154556. (PMID: 27128319)
Mamm Genome. 2023 Dec;34(4):531-544. (PMID: 37666946)
Bioinformatics. 2020 Apr 15;36(8):2636-2642. (PMID: 31950984)
Grant Information: R35 GM141873 United States GM NIGMS NIH HHS
Contributed Indexing: Investigator: A Bateman; MJ Martin; S Orchard; M Magrane; S Ahmad; EH Bowler-Barnett; H Bye-A-Jee; P Denny; T Dogan; T Ebenezer; J Fan; LJ da Costa Gonzales; A Hussein; A Ignatchenko; G Insana; R Ishtiaq; V Joshi; D Jyothi; S Kandasaamy; A Lock; A Luciani; J Luo; Y Lussi; P Raposo; DL Rice; R Saidi; R Santos; E Speretta; J Stephenson; P Totoo; N Tyagi; P Vasudev; K Warner; R Zaru; S Wijerathne; KT Ibrahim; M Kim; J Marin; AJ Bridge; L Aimo; G Argoud-Puy; AH Auchincloss; KB Axelsen; P Bansal; D Baratin; TM Batista Neto; JT Bolleman; E Boutet; L Breuza; BC Gil; C Casals-Casas; E Coudert; B Cuche; E de Castro; A Estreicher; ML Famiglietti; M Feuermann; E Gasteiger; S Gehant; A Gos; N Gruaz; C Hulo; N Hyka-Nouspikel; F Jungo; A Kerhornou; P Le Mercier; D Lieberherr; P Masson; A Morgat; I Pedruzzi; S Pilbout; L Pourcel; S Poux; M Pozzato; M Pruess; N Redaschi; C Rivoire; CJA Sigrist; S Sundaram; A Sveshnikova; CH Wu; CN Arighi; C Chen; Y Chen; H Huang; K Laiho; M Lehvaslaiho; P McGarvey; DA Natale; K Ross; CR Vinayaka; Y Wang; J Zhang
Entry Date(s): Date Created: 20240509; Latest Revision: 20240613
Update Code: 20240613
PubMed Central ID: PMC11076920
DOI: 10.1093/bioadv/vbae057
PMID: 38721398
Database: MEDLINE